
Examples don't run with CUDA12 #599

Open
EtienneT opened this issue Mar 13, 2024 · 17 comments
Labels
bug Something isn't working

Comments

@EtienneT

I have CUDA 12 installed; I know it works because I have run custom PyTorch models in other projects.

Running LLama.Examples with LLamaSharp.Backend.Cpu works fine, but when I try to use it with LLamaSharp.Backend.Cuda12, it crashes right away with the following error:

System.TypeInitializationException
  HResult=0x80131534
  Message=The type initializer for 'LLama.Native.NativeApi' threw an exception.
  Source=LLamaSharp
  StackTrace:
   at LLama.Native.NativeApi.llama_empty_call() in C:\work\Projects\LLamaSharp\LLama\Native\NativeApi.cs:line 27
   at Program.<<Main>$>d__0.MoveNext() in C:\work\Projects\LLamaSharp\LLama.Examples\Program.cs:line 24

  This exception was originally thrown at this call stack:
    LLama.Native.NativeApi.NativeApi() in NativeApi.Load.cs

Inner Exception 1:
RuntimeError: The native library cannot be correctly loaded. It could be one of the following reasons: 
1. No LLamaSharp backend was installed. Please search LLamaSharp.Backend and install one of them. 
2. You are using a device with only CPU but installed cuda backend. Please install cpu backend instead. 
3. One of the dependency of the native library is missed. Please use `ldd` on linux, `dumpbin` on windows and `otool`to check if all the dependency of the native library is satisfied. Generally you could find the libraries under your output folder.
4. Try to compile llama.cpp yourself to generate a libllama library, then use `LLama.Native.NativeLibraryConfig.WithLibrary` to specify it at the very beginning of your code. For more informations about compilation, please refer to LLamaSharp repo on github.

I tried running the project in Debug, with GPU selected in the Configuration Manager, under both .NET 8 and .NET 6, and every combination of these, but I always get the same error. I am running the latest version of the NuGet packages, 0.10.0.
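
For reference, remedy 4 in the error message above points at LLamaSharp's `LLama.Native.NativeLibraryConfig`. A minimal sketch of that workaround, assuming a self-built llama.dll (the path is illustrative and the exact `WithLibrary` overload may differ between LLamaSharp versions):

```csharp
using LLama.Native;

// Remedy 4 from the error above: point LLamaSharp at a self-compiled libllama.
// This must run before any other LLamaSharp call touches the native API.
// The path is illustrative; the exact WithLibrary overload may differ by version.
NativeLibraryConfig.Instance.WithLibrary(@"C:\work\llama.cpp\build\bin\llama.dll");
```

Any later call into the native API (for example, loading a model) will then use the library specified above instead of the bundled backend binaries.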

@zsogitbe
Contributor

You have probably added both the CPU and CUDA backend packages; that is the reason for the crash. Remove one so that only a single backend remains.

@EtienneT
Author

Pretty sure I was testing one backend at a time. Just to confirm, I tested it again this morning, making sure the CUDA backend was the only one installed:
[screenshot of the installed NuGet packages]

To be sure, I also deleted the bin and obj folders of the example project before testing again.

Same problem unfortunately.

@zsogitbe
Contributor

llava_shared.dll is missing from the distribution for CUDA v12. Try downloading it from llama.cpp and placing it manually into the right runtime folder.

@EtienneT
Author

I took llama-b2418-bin-win-cublas-cu12.2.0-x64.zip from the llama.cpp releases, extracted llava_shared.dll, and put it in LLama.Examples\bin\Debug\net8.0\runtimes\win-x64\native\cuda12.

Same problem.

@zsogitbe
Contributor

Maybe try the matching version: https://github.com/ggerganov/llama.cpp/tree/d71ac90985854b0905e1abba778e407e17f9f887
The C++ DLLs need to be compatible with each other.

@SignalRT
Collaborator

I will include the libraries in the updated binary artifacts ASAP.

@KieranFoot

Even after today's update, this issue persists.

AsakusaRinne added the bug label on Apr 1, 2024
@AsakusaRinne
Collaborator

Hi, this can be confirmed as a bug, since it persists in v0.11.1. Could you please provide some information to help us find the problem? @KieranFoot @EtienneT

  1. What is your full CUDA version?
  2. What CPU and GPU do you have? (It would be best if you follow this guide to print the CPU information.)
  3. Are you using x86 or x64? (A quick check is sketched below.)
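
For question 3, a plain .NET check (not the guide referenced above, just a minimal sketch) that reports the process and OS architecture:

```csharp
using System;
using System.Runtime.InteropServices;

// Report whether the process runs as x86 or x64, plus OS details.
Console.WriteLine($"Process architecture: {RuntimeInformation.ProcessArchitecture}");
Console.WriteLine($"OS architecture:      {RuntimeInformation.OSArchitecture}");
Console.WriteLine($"OS description:       {RuntimeInformation.OSDescription}");
```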

@EtienneT
Author

EtienneT commented Apr 1, 2024

This seems to be fixed for me now in the latest version.

Thanks,

@KieranFoot

@AsakusaRinne Apologies, it isn't made clear in the repo's docs that additional files are needed to use CUDA12. I assumed it would work out of the box the way CUDA11 does.

Perhaps the documentation could be improved to reflect this.

@AsakusaRinne
Collaborator

@KieranFoot Is it because you installed CUDA12 instead of CUDA11?

@KieranFoot

@AsakusaRinne I never installed CUDA11 manually; it just worked. So when I switched the code to use CUDA12, I wrongly assumed it would also work out of the box.

@AsakusaRinne
Collaborator

> I never installed CUDA11 manually; it just worked. So when I switched the code to use CUDA12, I wrongly assumed it would also work out of the box.

It's strange that the CUDA11 backend works without CUDA installed. Have you ever installed cuBLAS?

@zsogitbe
Contributor

You need to update your display driver. Here is a reference: https://tech.amikelive.com/node-930/cuda-compatibility-of-nvidia-display-gpu-drivers/comment-page-1/

@AsakusaRinne
Collaborator

@martindevans If I'm not misunderstanding, we could ship some cuBLAS files in the same folder as llama.dll to make it possible to run the CUDA backend without having CUDA installed. As shown in the llama.cpp releases, there is an archive named cudart-llama-bin-win-cu11.7.1-x64.zip, which contains cublas64_11.dll, cublasLt64_11.dll and cudart64_110.dll.
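
As a quick way to check whether those runtime DLLs can be resolved from an application's folder, a small diagnostic sketch (not part of LLamaSharp; the file names are the ones listed above):

```csharp
using System;
using System.Runtime.InteropServices;

// Diagnostic sketch: check whether the CUDA runtime DLLs from
// cudart-llama-bin-win-cu11.7.1-x64.zip can be loaded from the
// application directory (or anywhere on the DLL search path).
string[] cudaDlls = { "cublas64_11.dll", "cublasLt64_11.dll", "cudart64_110.dll" };

foreach (var dll in cudaDlls)
{
    bool ok = NativeLibrary.TryLoad(dll, out IntPtr handle);
    Console.WriteLine($"{dll}: {(ok ? "loaded" : "NOT found")}");
    if (ok)
        NativeLibrary.Free(handle);
}
```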

@martindevans
Member

martindevans commented Apr 12, 2024

I don't know much about CUDA, but yes, I think that would fix it (Onkitova tested it out in #371).

Last time we discussed this (ref), I think we decided the files were too big to include in the main CUDA packages, but we could instead create another package which the CUDA packages depend on.

@AsakusaRinne
Copy link
Collaborator

> I don't know much about CUDA, but yes, I think that would fix it (Onkitova tested it out in #371).
>
> Last time we discussed this (ref), I think we decided the files were too big to include in the main CUDA packages, but we could instead create another package which the CUDA packages depend on.

Yes, thank you for the clarification. I'll look into this issue. :)
