CUDA Error : CUDA driver version is insufficient for CUDA runtime version #1425
Unanswered: VijayAsokkumar asked this question in Q&A
Hi All,
I am using llama-cpp-python in my app, installed in a conda environment. I have built a chat application with Python Flask using the LLaMA 2 7B model. It works on my laptop with an M1 chip. However, when I deploy the app on an AWS g4dn.xlarge instance with a Tesla T4 GPU, I get the following error whenever the app calls into llama-cpp-python:
CUDA error 35 at /home/conda/feedstock_root/build_artifacts/llama.cpp_1703017359354/work/ggml-cuda.cu:493: CUDA driver version is insufficient for CUDA runtime version
GGML_ASSERT: /home/conda/feedstock_root/build_artifacts/llama.cpp_1703017359354/work/ggml-cuda.cu:493: !"CUDA error"
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Wed_Nov_22_10:17:15_PST_2023
Cuda compilation tools, release 12.3, V12.3.107
Build cuda_12.3.r12.3/compiler.33567101_0
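Note that nvcc reports the CUDA *toolkit* version, not the *driver* version; CUDA error 35 (cudaErrorInsufficientDriver) means the kernel-mode driver reported by nvidia-smi is older than the runtime requires. Per NVIDIA's release notes, the CUDA 12.x runtime on Linux needs driver 525.60.13 or newer. A small sketch of that check (the driver strings here are just example values):

```python
# Sketch: compare an installed NVIDIA driver version against the minimum
# that the CUDA 12.x runtime needs on Linux (525.60.13 per NVIDIA's
# release notes). The driver string would normally come from `nvidia-smi`.

def parse_version(v):
    """Turn a dotted version string like '525.60.13' into a tuple of ints."""
    return tuple(int(part) for part in v.split("."))

def driver_supports_cuda12(driver_version):
    """True if the driver is new enough for the CUDA 12.x runtime."""
    return parse_version(driver_version) >= parse_version("525.60.13")

# A CUDA 9-era driver (e.g. the 390.x series) is far too old:
print(driver_supports_cuda12("390.157"))     # False
print(driver_supports_cuda12("535.104.05"))  # True
```

So the toolkit being 12.3 is not enough; the driver shown by nvidia-smi on the instance also has to be at least the 525 series.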
Essentially, I need suggestions on the following areas:
1. Which NVIDIA driver version supports the Tesla T4 with the CUDA 12.3 runtime, given that the AWS instance runs Ubuntu 18.04.
2. I noticed that the instance's default CUDA version is 9, and I have installed 12.3. I need guidance on how to get the conda environment to use the newer CUDA driver.
3. Instructions on how to enable the app to utilize the GPU.
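For anyone who hits the same thing, here is the kind of fix I am planning to try (the PPA and driver package name are assumptions for Ubuntu 18.04, and LLAMA_CUBLAS is the CUDA build switch llama-cpp-python documented at the time):

```shell
# Sketch, assuming Ubuntu 18.04 with the graphics-drivers PPA available.
# Install a driver new enough for the CUDA 12.x runtime (>= 525 series):
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install -y nvidia-driver-535
sudo reboot

# After reboot, verify the driver the kernel actually loaded:
nvidia-smi

# Rebuild llama-cpp-python with CUDA (cuBLAS) support inside the env,
# so the wheel is compiled against the working toolkit:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
```

If the rebuild succeeds, layers can then be offloaded to the T4 by passing `n_gpu_layers` when constructing the `Llama` object in the Flask app.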
Thanks,
Vijay Asokkumar