
Examples don't run with CUDA12 #599

Open
EtienneT opened this issue Mar 13, 2024 · 17 comments
Labels
bug Something isn't working

Comments

@EtienneT

I have CUDA 12 installed; I know it works because I have run custom PyTorch models in other projects.

Running LLama.Examples with LLamaSharp.Backend.Cpu works fine, but when I try to use it with LLamaSharp.Backend.Cuda12, it crashes right away with the following error:

System.TypeInitializationException
  HResult=0x80131534
  Message=The type initializer for 'LLama.Native.NativeApi' threw an exception.
  Source=LLamaSharp
  StackTrace:
   at LLama.Native.NativeApi.llama_empty_call() in C:\work\Projects\LLamaSharp\LLama\Native\NativeApi.cs:line 27
   at Program.<<Main>$>d__0.MoveNext() in C:\work\Projects\LLamaSharp\LLama.Examples\Program.cs:line 24

  This exception was originally thrown at this call stack:
    LLama.Native.NativeApi.NativeApi() in NativeApi.Load.cs

Inner Exception 1:
RuntimeError: The native library cannot be correctly loaded. It could be one of the following reasons: 
1. No LLamaSharp backend was installed. Please search LLamaSharp.Backend and install one of them. 
2. You are using a device with only CPU but installed cuda backend. Please install cpu backend instead. 
3. One of the dependency of the native library is missed. Please use `ldd` on linux, `dumpbin` on windows and `otool`to check if all the dependency of the native library is satisfied. Generally you could find the libraries under your output folder.
4. Try to compile llama.cpp yourself to generate a libllama library, then use `LLama.Native.NativeLibraryConfig.WithLibrary` to specify it at the very beginning of your code. For more informations about compilation, please refer to LLamaSharp repo on github.

I tried running the project in Debug, with GPU selected in the Configuration Manager, under both .NET 8 and .NET 6, and every combination of these, but I always get the same error. I am running the latest version of the NuGet packages, 0.10.0.
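
For reference, remedy 4 in the error message above points at LLamaSharp's `LLama.Native.NativeLibraryConfig`. A minimal sketch of that workaround, assuming a self-built llama.dll (the path is illustrative and the exact `WithLibrary` overload may differ between LLamaSharp versions):

```csharp
using LLama.Native;

// Remedy 4 from the error above: point LLamaSharp at a self-compiled libllama.
// This must run before any other LLamaSharp call touches the native API.
// The path is illustrative; the exact WithLibrary overload may differ by version.
NativeLibraryConfig.Instance.WithLibrary(@"C:\work\llama.cpp\build\bin\llama.dll");
```

Any later call into the native API (for example, loading a model) will then use the library specified above instead of the bundled backend binaries.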

@zsogitbe
Contributor

You have probably added both the CPU and CUDA backend packages; that is the reason for the crash. Remove one so that only a single backend remains.

@EtienneT
Author

Pretty sure I was testing one backend at a time. Just to confirm, I tested it again this morning, making sure the CUDA backend was the only one installed:
[screenshot of the installed NuGet packages]

To be sure, I also deleted the bin and obj folders of the example project before testing again.

Same problem unfortunately.

@zsogitbe
Contributor

llava_shared.dll is missing from the distribution for CUDA v12. Try downloading it from llama.cpp and placing it manually into the right runtime folder.

@EtienneT
Author

I took llama-b2418-bin-win-cublas-cu12.2.0-x64.zip from the llama.cpp releases, extracted llava_shared.dll, and put it in LLama.Examples\bin\Debug\net8.0\runtimes\win-x64\native\cuda12.

Same problem.

@zsogitbe
Contributor

Maybe try the matching version: https://github.com/ggerganov/llama.cpp/tree/d71ac90985854b0905e1abba778e407e17f9f887
The C++ DLLs need to be compatible with each other.

@SignalRT
Collaborator

I will include the libraries in the updated binary artifacts ASAP.

@KieranFoot

Even after today's update, this issue persists.

AsakusaRinne added the bug label on Apr 1, 2024
@AsakusaRinne
Collaborator

Hi, this can be confirmed as a bug, since it persists in v0.11.1. Could you please provide some information to help us find the problem? @KieranFoot @EtienneT

  1. What is your full CUDA version?
  2. What CPU and GPU do you have? (It would be best if you follow this guide to print the CPU information.)
  3. Are you using x86 or x64? (A quick check is sketched below.)
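
For question 3, a plain .NET check (not the guide referenced above, just a minimal sketch) that reports the process and OS architecture:

```csharp
using System;
using System.Runtime.InteropServices;

// Report whether the process runs as x86 or x64, plus OS details.
Console.WriteLine($"Process architecture: {RuntimeInformation.ProcessArchitecture}");
Console.WriteLine($"OS architecture:      {RuntimeInformation.OSArchitecture}");
Console.WriteLine($"OS description:       {RuntimeInformation.OSDescription}");
```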

@EtienneT
Author

EtienneT commented Apr 1, 2024

This seems to be fixed for me now in the latest version.

Thanks,

@KieranFoot

@AsakusaRinne Apologies, it isn't made clear in the repo's docs that additional files are needed to use CUDA12. I assumed it would work out of the box the way CUDA11 does.

Perhaps the documentation could be improved to reflect this.

@AsakusaRinne
Collaborator

@KieranFoot Is it because you installed CUDA12 instead of CUDA11?

@KieranFoot

@AsakusaRinne I never installed CUDA11 manually; it just worked. So when I switched the code to use CUDA12, I wrongly assumed it would also work out of the box.

@AsakusaRinne
Collaborator

> I never installed CUDA11 manually; it just worked. So when I switched the code to use CUDA12, I wrongly assumed it would also work out of the box.

It's strange that the CUDA11 backend works without CUDA installed. Have you ever installed cuBLAS?

@zsogitbe
Contributor

You need to update your display driver. Here is a reference: https://tech.amikelive.com/node-930/cuda-compatibility-of-nvidia-display-gpu-drivers/comment-page-1/

@AsakusaRinne
Collaborator

@martindevans If I'm not misunderstanding, we could ship some cuBLAS files in the same folder as llama.dll to make it possible to run the CUDA backend without having CUDA installed. As shown in the llama.cpp releases, there is an archive named cudart-llama-bin-win-cu11.7.1-x64.zip, which contains cublas64_11.dll, cublasLt64_11.dll and cudart64_110.dll.
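
As a quick way to check whether those runtime DLLs can be resolved from an application's folder, a small diagnostic sketch (not part of LLamaSharp; the file names are the ones listed above):

```csharp
using System;
using System.Runtime.InteropServices;

// Diagnostic sketch: check whether the CUDA runtime DLLs from
// cudart-llama-bin-win-cu11.7.1-x64.zip can be loaded from the
// application directory (or anywhere on the DLL search path).
string[] cudaDlls = { "cublas64_11.dll", "cublasLt64_11.dll", "cudart64_110.dll" };

foreach (var dll in cudaDlls)
{
    bool ok = NativeLibrary.TryLoad(dll, out IntPtr handle);
    Console.WriteLine($"{dll}: {(ok ? "loaded" : "NOT found")}");
    if (ok)
        NativeLibrary.Free(handle);
}
```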

@martindevans
Member

martindevans commented Apr 12, 2024

I don't know much about CUDA, but yes, I think that would fix it (Onkitova tested it out in #371).

Last time we discussed this (ref), I think we decided the files were too big to include in the main CUDA packages, but we could instead create another package which the CUDA packages depend on.

@AsakusaRinne
Copy link
Collaborator

> I don't know much about CUDA, but yes, I think that would fix it (Onkitova tested it out in #371).
>
> Last time we discussed this (ref), I think we decided the files were too big to include in the main CUDA packages, but we could instead create another package which the CUDA packages depend on.

Yes, thank you for the clarification. I'll look into this issue. :)
