Failed CUDA.jl initialization breaks Flux? #1952
Comments
That's just an informational message. If you have the NVIDIA driver installed, CUDA.jl is going to assume that you want the package to work, so it'll let you know if it isn't functional. If you aren't interested in GPU functionality, you should try to avoid loading CUDA.jl (i.e. using Revise, package extensions, preferences, etc). |
I am actually using Flux, which calls CUDA by default. Thus, not using CUDA is not an option. The application is a small ML model designed to run on multi-core only. |
Extensive details discussed at Julia Discourse: https://discourse.julialang.org/t/bring-julia-code-to-embedded-hardware-arm/19979/79?u=cirobr |
AFAIU Flux will be moving to package extensions at some point in the future. For now, you can safely ignore the message, or uninstall the NVIDIA driver to get rid of it. |
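For readers who want to confirm that the message is harmless on a CPU-only machine, a minimal sketch along these lines may help. It assumes CUDA.jl is installed in the active environment; `to_device` is a hypothetical helper name, not part of CUDA.jl or Flux. `CUDA.functional()` reports whether a usable GPU stack was found, returning a `Bool` rather than throwing.

```julia
using CUDA

# CUDA.functional() returns false (instead of throwing) when initialization failed,
# e.g. on a machine without a GPU.
const USE_GPU = CUDA.functional()

# Hypothetical helper: move an array to the GPU only when one is usable,
# otherwise leave it on the CPU unchanged.
to_device(x::AbstractArray) = USE_GPU ? cu(x) : x

x = to_device(rand(Float32, 4, 4))
println("GPU functional: ", USE_GPU, ", array type: ", typeof(x))
```

On a system without a GPU this should print `GPU functional: false` and leave `x` as an ordinary `Matrix{Float32}`, so the remaining code keeps running on the CPU.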
The "Error: Failed to initialize CUDA" message persists in all three scenarios tested: no driver installed, nvidia-530-driver installed, and nvidia-530-driver-open installed. In a few attempts, code execution broke at "using Flux". When execution was resumed from the next line of code, the rest of the code and all Flux calls ran. |
That is impossible. The
The output you quoted above comes from Lines 127 to 128 in a719eb3
using Flux (i.e., it does not 'break execution').
|
The evidence shows otherwise. Using Flux has become impossible without a GPU. Thanks anyway. |
Can you help me understand then? I linked to the source code that generates the output you presented, and nothing there should break precompilation as the messages are purely informational. And the NO_DEVICE error really is generated by |
And to add some more to this point, as we've recently had another user run into that: the NVIDIA driver is often split into multiple parts, so it's possible you removed what provides
I agree that it's confusing that CUDA.jl generates an error message only when you have the NVIDIA driver installed, but that's just the heuristic we've currently come up with, as people are likely to want a functional CUDA stack when they have the NVIDIA driver installed. In the future, I plan to make this message an error, "forcing" downstream users to install CUDA.jl only when they want GPU support. That will make the situation much clearer; however, it requires that downstream packages like Flux use package extensions. That's not the case yet, so we remain in the situation where CUDA.jl needs to be importable on systems without a GPU, because Flux.jl depends on it unconditionally. |
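As a rough illustration of what "using package extensions" means for a downstream package, the `Project.toml` fragment below declares CUDA.jl as a weak dependency, so it is loaded only when the user installs it explicitly. This is a sketch under stated assumptions: `MyPackageCUDAExt` is a hypothetical extension name, and this is not how Flux is actually configured. Package extensions require Julia 1.9 or later.

```toml
# Project.toml of a hypothetical downstream package

[weakdeps]
# CUDA.jl is no longer a hard dependency; it is only a candidate trigger.
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"

[extensions]
# Hypothetical module in ext/MyPackageCUDAExt.jl, loaded only when
# both the package and CUDA.jl are present in the session.
MyPackageCUDAExt = "CUDA"
```

With a layout like this, users on CPU-only machines would never load CUDA.jl, so the initialization message would not appear.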
Let's re-open this until we figure out your problem. |
I have already volunteered several days on this topic to document the issue as well as possible, so that an expert can reproduce it. As the claim was dismissed a couple of seconds after being opened, and you are certain that "this is impossible", let's wait for comments from others. Sorry for the inconvenience, and thanks for your time. |
I did not (intend to) dismiss your problem, only certain elements of your report, like the fact that you mention the exact problem re-occurs after removing the NVIDIA driver, which is in fact impossible. Stating that is a matter of debugging the issue, hoping that it would help you to e.g. fully remove the driver, or otherwise resolve the problem. In any case, I re-opened this issue in order to help you. Without additional information from your end, it won't be possible to resolve this, so I'll close this again. Note that this is just to prevent unresolved issues from lingering on, feel free to post more information when you want to resolve this issue again. |
Cheers,
Using the latest Julia 1.9.1 on an AArch64 instance with no GPU and Ubuntu Server 22.04, with the latest CUDA package.
Pkg.add("CUDA") gives the following warning:
1 dependency had warnings during precompilation:
┌ Random123 [74087812-796a-5b5d-8853-05524746bad3]
│  ┌ Warning: AES-NI is not enabled, so AESNI and ARS are not available.
│  └ @ Random123 ~/.julia/packages/Random123/u5oEp/src/Random123.jl:55
Despite that, the CUDA package precompiles.
using CUDA, however, is not possible:
julia> using CUDA
┌ Error: Failed to initialize CUDA
│ exception =
│ CUDA error (code 100, CUDA_ERROR_NO_DEVICE)
Regards,