ERROR: Could not build wheels for llama-cpp-python #1617
Comments
Same issue here...
Same issue here too.
Same here.
Same here as well.
Same here too.
I solved this problem. You need to double-check your cuda-toolkit version; in my case two versions, 12.2 and 11.8, were installed.
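A quick way to check for that kind of version mismatch is sketched below (assuming a standard Linux CUDA install; paths may differ on your machine):
nvcc --version          # toolkit version the compiler will use
nvidia-smi              # driver version and the highest CUDA version it supports
ls -d /usr/local/cuda*  # every toolkit install present on the machine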
Same here.
Same issue here too, in WSL2.
Same issue here too, WSL2 on Windows 10.
Same issue here.
I found a workaround to fix this issue:
# [build-system]
# requires = ["scikit-build-core[pyproject]>=0.9.2"]
# build-backend = "scikit_build_core.build"
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
[project]
name = "llama_cpp_python"
dynamic = ["version"]
description = "Python bindings for the llama.cpp library"
readme = "README.md"
license = { text = "MIT" }
authors = [
{ name = "Andrei Betlen", email = "abetlen@gmail.com" },
]
dependencies = [
"typing-extensions>=4.5.0",
"numpy>=1.20.0",
"diskcache>=5.6.1",
"jinja2>=2.11.3",
]
requires-python = ">=3.8"
classifiers = [
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
]
[project.optional-dependencies]
server = [
"uvicorn>=0.22.0",
"fastapi>=0.100.0",
"pydantic-settings>=2.0.1",
"sse-starlette>=1.6.1",
"starlette-context>=0.3.6,<0.4",
"PyYAML>=5.1",
]
test = [
"pytest>=7.4.0",
"httpx>=0.24.1",
"scipy>=1.10",
]
dev = [
"black>=23.3.0",
"twine>=4.0.2",
"mkdocs>=1.4.3",
"mkdocstrings[python]>=0.22.0",
"mkdocs-material>=9.1.18",
"pytest>=7.4.0",
"httpx>=0.24.1",
]
all = [
"llama_cpp_python[server,test,dev]",
]
# [tool.scikit-build]
# wheel.packages = ["llama_cpp"]
# cmake.verbose = true
# cmake.minimum-version = "3.21"
# minimum-version = "0.5.1"
# sdist.include = [".git", "vendor/llama.cpp/*"]
[tool.setuptools.packages.find]
include = ["llama_cpp"]
[tool.setuptools.package-data]
"llama_cpp" = ["lib/*"]
# [tool.scikit-build.metadata.version]
# provider = "scikit_build_core.metadata.regex"
# input = "llama_cpp/__init__.py"
# setuptools needs its own dynamic-version source instead:
[tool.setuptools.dynamic]
version = {attr = "llama_cpp.__version__"}
[project.urls]
Homepage = "https://github.com/abetlen/llama-cpp-python"
Issues = "https://github.com/abetlen/llama-cpp-python/issues"
Documentation = "https://llama-cpp-python.readthedocs.io/en/latest/"
Changelog = "https://llama-cpp-python.readthedocs.io/en/latest/changelog/"
[tool.pytest.ini_options]
testpaths = "tests"
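Note that this replacement pyproject.toml swaps the scikit-build-core backend for plain setuptools, so as I read it the CMake step is skipped entirely and setuptools only packages whatever shared libraries already sit in llama_cpp/lib/ (per the package-data entry above). A hedged sketch of applying it, building llama.cpp by hand first (library paths taken from the build log below; everything else is an assumption to adapt):
git clone --recursive https://github.com/abetlen/llama-cpp-python
cd llama-cpp-python
# build the vendored llama.cpp with CUDA enabled
cmake -B build vendor/llama.cpp -DGGML_CUDA=on
cmake --build build --config Release
# put the shared libraries where the package-data glob expects them
mkdir -p llama_cpp/lib
cp build/src/libllama.so build/ggml/src/libggml.so llama_cpp/lib/
# replace pyproject.toml with the setuptools-based version above, then:
pip install .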
Adding the path to …
Hello @blkqi, how did it work for you? Can you please share which env or path settings to try?
Thank you @blkqi. Your advice really helped me. In my case, I used a Dockerfile like this (note that LD_LIBRARY_PATH should point at the directory containing libcuda.so, not at the file itself):
ENV LD_LIBRARY_PATH=/usr/local/cuda-12.4/compat
RUN CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python==0.2.90
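Outside Docker, the same idea can be sketched as follows (assuming CUDA 12.4 under /usr/local/cuda-12.4; the compat directory only exists where the driver compat package is installed, otherwise use whichever directory actually contains libcuda.so):
export LD_LIBRARY_PATH=/usr/local/cuda-12.4/compat:$LD_LIBRARY_PATH
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python==0.2.90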
Current Behavior
I run the following:
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --verbose
an error occurred:
ERROR: Failed building wheel for llama-cpp-python
Environment and Context
Physical hardware:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: GenuineIntel
Model name: Intel Xeon Processor (Skylake, IBRS)
CPU family: 6
Model: 85
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 16
Stepping: 4
BogoMIPS: 4389.68
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 arat pku ospke avx512_vnni md_clear
Virtualization features:
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 512 KiB (16 instances)
L1i: 512 KiB (16 instances)
L2: 64 MiB (16 instances)
L3: 256 MiB (16 instances)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-15
Vulnerabilities:
Itlb multihit: KVM: Mitigation: VMX unsupported
L1tf: Mitigation; PTE Inversion
Mds: Mitigation; Clear CPU buffers; SMT Host state unknown
Meltdown: Mitigation; PTI
Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Retbleed: Mitigation; IBRS
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS Not affected
Srbds: Not affected
Tsx async abort: Mitigation; Clear CPU buffers; SMT Host state unknown
Operating System:
Ubuntu 22.04
SDK version:
Failure Information (for bugs)
...
FAILED: vendor/llama.cpp/examples/llava/llama-llava-cli
: && /usr/bin/g++ -pthread -B /mnt/x_env/compiler_compat -O3 -DNDEBUG vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llama-llava-cli.dir/llava-cli.cpp.o -o vendor/llama.cpp/examples/llava/llama-llava-cli -Wl,-rpath,/tmp/tmp6bws6ysg/build/vendor/llama.cpp/src:/tmp/tmp6bws6ysg/build/vendor/llama.cpp/ggml/src: vendor/llama.cpp/common/libcommon.a vendor/llama.cpp/src/libllama.so vendor/llama.cpp/ggml/src/libggml.so && :
/mnt/x_env/compiler_compat/ld: warning: libcuda.so.1, needed by vendor/llama.cpp/ggml/src/libggml.so, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: warning: libgomp.so.1, needed by vendor/llama.cpp/ggml/src/libggml.so, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: warning: libdl.so.2, needed by /usr/local/cuda-12.4/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: warning: libpthread.so.0, needed by /usr/local/cuda-12.4/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: warning: librt.so.1, needed by /usr/local/cuda-12.4/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemCreate'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `GOMP_barrier@GOMP_1.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemAddressReserve'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemUnmap'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `GOMP_parallel@GOMP_4.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemSetAccess'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuDeviceGet'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `omp_get_thread_num@OMP_1.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemAddressFree'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuGetErrorString'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `GOMP_single_start@GOMP_1.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuDeviceGetAttribute'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemMap'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemRelease'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `omp_get_num_threads@OMP_1.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemGetAllocationGranularity'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
*** CMake build failed
error: subprocess-exited-with-error
× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.
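The warnings in the log point at the likely cause: the environment's bundled linker at /mnt/x_env/compiler_compat/ld does not search the system default library paths, so it finds neither libcuda.so.1 (where the undefined cuMem*/cuDevice* symbols live) nor libgomp.so.1 (the GOMP_*/omp_* symbols). A hedged sketch of one workaround is to hand those paths to the link step explicitly; the stubs and system library paths below are the usual locations but are assumptions to verify on your machine:
export CUDA_HOME=/usr/local/cuda-12.4
CMAKE_ARGS="-DGGML_CUDA=on" \
LDFLAGS="-L${CUDA_HOME}/lib64/stubs -L/usr/lib/x86_64-linux-gnu" \
pip install llama-cpp-python --verbose
Another commonly reported fix for this class of failure is to rename the environment's compiler_compat/ld so that the system /usr/bin/ld is used instead.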
Steps to Reproduce
NVIDIA Driver Version: 550.54.14
CUDA Toolkit Version: V12.4.99