errors with cuda ops installation #2

dvschultz · 2021-02-01T17:07:22Z

tested on a fresh Colab V100 and P100

dvschultz · 2021-02-01T17:19:46Z

fixed by installing ninja. Might recommend adding that to the readme as a requirement

woctezuma · 2021-02-01T17:30:31Z

I have encountered the same issue on Colab, and your fix works!

%pip install ninja
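For anyone who wants to confirm the fix took effect before re-running the scripts, here is a minimal sketch that checks whether a `ninja` executable is visible on the PATH. The helper name `find_ninja` and its `search_path` parameter are made up for illustration; only the standard-library `shutil.which` call is real.

```python
import shutil


def find_ninja(search_path=None):
    """Return the full path of the ninja executable, or None if it is not found.

    search_path is a hypothetical convenience parameter: an os.pathsep-separated
    string of directories to search. None means "use the current PATH".
    """
    return shutil.which("ninja", path=search_path)


ninja_path = find_ninja()
if ninja_path is None:
    print("ninja not found on PATH; try: pip install ninja")
else:
    print(f"ninja found at {ninja_path}")
```

If this reports that ninja is missing right after installing it, the install may have gone into a different Python environment than the one running the scripts.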

nurpax · 2021-02-02T10:16:39Z

@dvschultz Thanks for the report! README.md will be updated.

futscdav · 2021-02-04T12:39:44Z

Also note that nvcc doesn't work with newer gcc versions. If your system default gcc is newer than 8, pytorch will honor the CC env variable, so run
export CC=g++-8
before you run any scripts that would build the cuda kernels.
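A rough way to check whether this applies to a given machine is to read the major version out of `gcc --version`. The sketch below assumes the common output shape where the version number follows the parenthesised vendor string on the first line (as in `gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0`). The function names are hypothetical, and the threshold of 8 is simply the number quoted in the comment above, not something verified against every CUDA release.

```python
import re
import shutil
import subprocess


def gcc_major_version(version_output):
    """Extract the major version from the first line of `gcc --version` output.

    Returns None when the first line does not look like '<name> (<vendor>) X.Y...'.
    """
    lines = version_output.strip().splitlines()
    if not lines:
        return None
    match = re.search(r"\)\s+(\d+)\.\d+", lines[0])
    return int(match.group(1)) if match else None


def needs_older_compiler(version_output, max_major=8):
    """True if the reported gcc is newer than max_major (8, per the comment above)."""
    major = gcc_major_version(version_output)
    return major is not None and major > max_major


def system_gcc_version_output():
    """Run `gcc --version` if gcc is on the PATH; return '' otherwise."""
    gcc = shutil.which("gcc")
    if gcc is None:
        return ""
    result = subprocess.run([gcc, "--version"], capture_output=True, text=True)
    return result.stdout


if needs_older_compiler(system_gcc_version_output()):
    print("Default gcc is newer than 8; consider setting CC as described above.")
```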

tasinislam21 · 2021-02-04T21:30:41Z

I did install ninja, but then I got: OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized. This happened while evaluating the metrics.

TheodoreGalanos · 2021-02-07T14:35:04Z

Still getting this error on a vast.ai VM with pytorch 1.7.1 and cuda 11 installed. I installed all required packages but whatever I try I keep getting the errors when trying to compile the custom cuda ops.

Is there perhaps a guide for Linux and ada-pytorch? It seems that it should work out of the box, but unfortunately it does not.

p.s. I have already made it work in windows by installing vc2019 and cuda 11. Would love to make it run on the VM so that I can train larger models.

wuyuyu1024 · 2021-02-09T14:45:23Z

I'm having the same error. I'm on win10 with an RTX-3070 GPU and torch 1.7.1+cu110. I also installed the required packages and deleted torch_extensions.
Here is my log:
log.txt
Could anyone help me? 🙏

gokhanbaydar · 2021-02-14T20:29:31Z

If still not working, try installing Windows 10 SDK. I had the same problem, installed Windows 10 SDK, and now it's working fine.

Dhruva-Storz · 2021-02-15T20:49:44Z

I still have this issue on Linux with CUDA 11.0 after installing ninja. Is there a specific version of ninja we need to install? I have the same error on both conda and pip (after pasting the line in the README).

nurpax · 2021-02-15T21:33:06Z

The original poster in this bug filed this for Colab. Not sure what @Dhruva-Storz is running on.

Try changing this line to get more details about what could be going wrong:

verbosity = 'brief' # Verbosity level: 'none', 'brief', 'full'

to

verbosity = 'full'

and check if you get anything relevant in the log.

Remember to completely remove your torch extensions dir (search for TORCH_EXTENSIONS_DIR on https://pytorch.org/docs/stable/cpp_extension.html for details) when re-running the code.

Usually this is a matter of CUDA SDK (the one you have to install yourself, not the pytorch bundled cuda toolkit) not being installed properly, or there being multiple versions of it and some old or otherwise incompatible version gets used when building our custom extensions.

Dhruva-Storz · 2021-02-15T22:25:25Z

My apologies for not giving enough info.

I'm running on:

ubuntu 20.04.1,
CUDA 11.1,
RTX 3090

My pytorch installation is 1.7.1 with cuda toolkit 11.0
All packages installed on python virtual environment, same errors when using conda virtual environment

To reproduce error

python generate.py --outdir=out --trunc=1 --seeds=85,265,297,849 --network=models/network-snapshot-000144.pkl

Important bits of error after deleting pytorch extensions dir, setting verbosity to full

FAILED: bias_act.cuda.o 
...
nvcc fatal   : Unsupported gpu architecture 'compute_86'
...
FAILED: upfirdn2d.cuda.o 
...
nvcc fatal   : Unsupported gpu architecture 'compute_86'
...
Error building extension 'upfirdn2d_plugin'
  warnings.warn('Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:\n\n' + str(sys.exc_info()[1]))
Setting up PyTorch plugin "upfirdn2d_plugin"...
Using /media/SharedUsers/DhruvG/home/.cache/torch_extensions as PyTorch extensions root...
No modifications detected for re-loaded extension module upfirdn2d_plugin, skipping build step...
Loading extension module upfirdn2d_plugin...
/media/SharedUsers/DhruvG/home/Documents/stylegan2-ada-pytorch/torch_utils/ops/upfirdn2d.py:34: UserWarning: Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:

You mentioned in the README that we need to use cuda-toolkit 11.1, but the website has no installation instructions for 11.1. This may be the source of the problem. Do you have any suggestions on what might be causing this?

nurpax · 2021-02-15T22:32:57Z

I now see that the README is quite confusing about this.

In order to run on RTX 3090, you need to install:

pytorch 1.7 built for cuda 11.0 (or later, but at the time of writing cuda 11.0 build is the latest).
CUDA 11.1 toolkit (from NVIDIA's website). If you can't find CUDA 11.1, CUDA 11.2 probably works too.

The latter is required to build our custom pytorch extensions. Nvcc from CUDA 11.0 will fail with the error you saw above if you're running it on RTX 3090. Nvcc from CUDA 11.1 should work.
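To see which side of that line a given toolkit falls on, one option is to parse the release number out of `nvcc --version`. This is only a sketch: it assumes the output contains a line of the form `Cuda compilation tools, release 11.1, V11.1.105`, the function names are hypothetical, and the 11.1 minimum for `compute_86` is taken from the comment above.

```python
import re

# Minimum CUDA toolkit release whose nvcc accepts 'compute_86' (RTX 30xx),
# according to the comment above.
MIN_RELEASE_FOR_COMPUTE_86 = (11, 1)


def nvcc_release(version_output):
    """Return (major, minor) parsed from `nvcc --version` text, or None."""
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def can_target_compute_86(version_output):
    """True if the reported nvcc release is at least 11.1."""
    release = nvcc_release(version_output)
    return release is not None and release >= MIN_RELEASE_FOR_COMPUTE_86
```

Feeding it the text from `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout` shows whether the nvcc that is first on the PATH is the one you expect.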

ghost · 2021-02-17T19:40:56Z

No solution yet?

ghost · 2021-02-17T19:43:43Z

TORCH_EXTENSIONS_DIR

this solution does not work either.

ghost · 2021-02-17T19:54:30Z

If still not working, try installing Windows 10 SDK. I had the same problem, installed Windows 10 SDK, and now it's working fine.

your solution did not fix my problem either.

Dhruva-Storz · 2021-02-17T19:55:24Z

TORCH_EXTENSIONS_DIR

Can you please elaborate on "Remember to completely remove your torch extensions dir (search for TORCH_EXTENSIONS_DIR on https://pytorch.org/docs/stable/cpp_extension.html for details) when re-running the code"? I am not sure I understand what you want us to do.

I haven't found a way to safely install cuda 11.1 on my work computer because it might interfere with the work of others, so I haven't been able to test nurpax's solution. However, it seems like this should fix the problem, as the build errors seem to be related to nvcc. If not, the code still runs; you just have to disable warnings with

python -W ignore foo.py
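A narrower alternative to `-W ignore`, which hides every warning, is to filter only the build-failure message from inside Python. This sketch assumes the warning text starts with "Failed to build CUDA kernels", as in the log earlier in this thread; the function name is hypothetical.

```python
import warnings


def silence_cuda_build_warnings():
    """Ignore only UserWarnings whose message starts with the build-failure text.

    The message argument is a regular expression matched against the start of
    the warning text, so other warnings are left alone.
    """
    warnings.filterwarnings(
        "ignore",
        message=r"Failed to build CUDA kernels",
        category=UserWarning,
    )
```

Call it once near the top of the script, before the modules that build the custom ops are imported. The slow reference implementation is still what runs, so this hides the symptom rather than fixing the build.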

When they say remove torch_extensions_dir, I believe they mean that you delete the folder where the custom torch extensions were installed. Mine was in ~/.cache/torch_extensions
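In case it helps, the deletion step can be sketched in Python. It assumes the build cache lives either where the `TORCH_EXTENSIONS_DIR` environment variable points or, failing that, in `~/.cache/torch_extensions` (the location reported above; other platforms or pytorch versions may use a different default). Both function names and their `env`/`home` parameters are hypothetical.

```python
import os
import shutil
from pathlib import Path


def torch_extensions_dir(env=None, home=None):
    """Guess the extensions cache directory.

    Uses TORCH_EXTENSIONS_DIR when set; otherwise assumes ~/.cache/torch_extensions.
    """
    env = os.environ if env is None else env
    override = env.get("TORCH_EXTENSIONS_DIR")
    if override:
        return Path(override)
    home = Path.home() if home is None else Path(home)
    return home / ".cache" / "torch_extensions"


def remove_torch_extensions(env=None, home=None):
    """Delete the cache directory if it exists; return True if something was removed."""
    target = torch_extensions_dir(env, home)
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False
```

Calling `remove_torch_extensions()` with no arguments acts on the current user's environment, so print `torch_extensions_dir()` first to confirm the path before deleting anything.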

I'm probably going to wait for official cuda 11.1 support from pytorch so I can safely install it in an environment. However, if anyone has solutions on how to install two different cuda toolkits safely, do let me know.

nurpax · 2021-02-17T20:17:43Z

I’m not sure if installing CUDA toolkit from Conda is enough (ie. as part of pytorch installation). I think you really do need a separate full CUDA installation with nvcc, headers, the whole nine yards. Not from Conda or Pip but using NVIDIA’s packages/installers. I recall trying without it, using just what’s bundled with pytorch installation and I don’t think it contained everything that’s required to build our CUDA kernels.

I’d be happy to be shown wrong on this as it’d simplify the installation instructions.

At least on Windows, you can have multiple CUDA versions installed simultaneously. It is of course safer to match the CUDA version you have in the PATH with the one your pytorch was built with.

If you do end up installing different CUDA SDKs, don’t let the installers touch your GPU drivers. Those are best kept at your most recent version.
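With several toolkits installed side by side, one way to pick a specific one for a single run, without changing anything system-wide, is to launch the script with a modified environment. The sketch below is an assumption-laden illustration: the helper name is made up, the toolkit path in the usage note is only an example of where an installer might put it, and it relies on the extension builder honoring `CUDA_HOME` and finding `nvcc` through the PATH.

```python
import os


def env_for_cuda(cuda_home, base_env=None):
    """Return a copy of the environment that points at one CUDA toolkit.

    Sets CUDA_HOME and puts <cuda_home>/bin first on the PATH so that this
    toolkit's nvcc is found before any other. The input mapping is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_HOME"] = cuda_home
    bin_dir = os.path.join(cuda_home, "bin")
    old_path = env.get("PATH", "")
    env["PATH"] = bin_dir + os.pathsep + old_path if old_path else bin_dir
    return env
```

A run could then look like `subprocess.run([sys.executable, "generate.py", ...], env=env_for_cuda("/usr/local/cuda-11.1"))`, with the path adjusted to wherever the toolkit actually lives. Remove the torch extensions dir first so that nothing built with the other toolkit is reused.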

ghost · 2021-02-17T20:38:43Z

@nurpax I am using a separate full CUDA installation with nvcc, cuDNN, etc.

Print full traceback when custom extension build fails. Also allow pytorch 1.9 so that this runs against pytorch upstream devel builds. issues #2, #28, #35, #37, #39

dokluch · 2021-03-18T08:01:09Z

Stuck here big time with ImportError: No module named 'upfirdn2d_plugin'

I am using a vast.ai instance nvidia/cuda:11.2.1-cudnn8-runtime-ubuntu18.04

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:00:07.0 Off |                    0 |
| N/A   30C    P0    35W / 250W |      0MiB / 16160MiB |      0%      Default |

Conda environment is set with
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch --yes
(doesn't matter if I try a newer one)

What I've tried

First I made sure my VM has CUDA 11.2 installed. Then I installed a newer torch with CUDA 11.1.1, which did not help, so I rolled back.

Removed torch_extensions
Just as described here:
#11

Didn't help

gcc
I found this thread and
#35

And tried installing gcc7
conda install -c conda-forge/label/gcc7 gcc_linux-64 (didn't help)

and even gcc5
conda install -c psi4 gcc-5
The latter sent me in a weird loop and I've abandoned this path.

This does not help either
#2 (comment)

Google Colab works fine and has ubuntu 18.04 with gcc 7.5.0 installed which I am trying to mimic. Hope that is the correct logic.

UPD:
Another instance with gcc 7.5.0 throws the same error as well

gcc --version
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.

UPD2
Installing gcc 5 as described here: https://askubuntu.com/questions/1087150/install-gcc-5-on-ubuntu-18-04
Did not help either

Please advise on any possible next steps. I have no idea where to go from here.

zjgt · 2021-06-01T13:08:50Z

Thanks for all the discussions above; they have been very helpful. I probably had all of the above problems. Installing Visual Studio definitely took care of the C++ compilation issues, and installing the whole cuda 11.3 package (2.7G) from the NVIDIA website took care of the upfirdn2d bug. Now my program is running in PyCharm with pytorch 1.7.1, cuda 11.3, python 3.7.

halfjoe · 2021-07-06T07:03:32Z

Thanks for all the discussions above.

I have successfully set up the environment with 3090, and would like to share my settings.
Ubuntu 18.04.4, gcc 7.5.0, CUDA 11.1, CUDNN 8.0.5, python 3.7, pytorch 1.7.1

Here CUDA and CUDNN are installed manually, and pytorch is built from source (https://github.com/pytorch/pytorch/tree/v1.7.1). After installing pytorch, print(torch.__version__) returns 1.7.0a0+57bffc3, which is OK.

tasinislam21 · 2022-04-01T09:17:53Z

I have encountered the same issue on Colab, and your fix works!
%pip install ninja

works on colab and windows but not on ubuntu 20.04

nurpax pushed a commit that referenced this issue Feb 2, 2021

Update README.md python requirements (#2)

bb409d8

nurpax closed this as completed Feb 2, 2021

nurpax mentioned this issue Feb 3, 2021

RuntimeError: CUDA error: no kernel image is available for execution on the device #6

Closed

woctezuma mentioned this issue Feb 3, 2021

stucks on Setting up PyTorch plugin "upfirdn2d_plugin"... #11

Closed

hadaev8 mentioned this issue Feb 12, 2021

No module named 'upfirdn2d_plugin' #34

Closed

nurpax mentioned this issue Feb 15, 2021

No module named 'upfirdn2d_plugin' #35

Closed

nurpax mentioned this issue Feb 16, 2021

No module named 'upfirdn2d_plugin' #37

Closed

nurpax pushed a commit that referenced this issue Feb 17, 2021

CUDA requirements update in README.md (issues #2, #28, #35, #37)

386669a

nurpax mentioned this issue Feb 17, 2021

some errors with windows10 with RTX-3070 #28

Closed

ghost mentioned this issue Feb 17, 2021

upfirdn2d_plugin Problem #39

Closed

dokluch mentioned this issue Mar 18, 2021

Vast.ai instance - **No module named 'upfirdn2d_plugin'** #72

Closed

woctezuma mentioned this issue Apr 26, 2021

ImportError: No module named 'upfirdn2d_plugin' #97

Open

snakch pushed a commit to snakch/stylegan2-ada-pytorch that referenced this issue Jun 20, 2021

Merge pull request NVlabs#2 from pbizimis/main

c9deee2

Add index and seed feature to image and video generation

woctezuma mentioned this issue Feb 11, 2022

Problem running Stylegan2 in collar autonomousvision/projected-gan#54

Closed

SushkoVadim mentioned this issue May 30, 2022

Additional setup of DA boschresearch/one-shot-synthesis#1

Closed

dwkim78 mentioned this issue Jun 13, 2024

ImportError: No module name upfirdn2d_plugin " fenglinglwb/MAT#104

Open
