
ERROR: Could not build wheels for llama-cpp-python #1617

Open · inst32i opened this issue Jul 23, 2024 · 16 comments

Labels: bug (Something isn't working)

Comments

@inst32i commented Jul 23, 2024

Current Behavior

I ran the following:
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --verbose

An error occurred:
ERROR: Failed building wheel for llama-cpp-python

Environment and Context

  • Physical hardware:
    Architecture: x86_64
    CPU op-mode(s): 32-bit, 64-bit
    Address sizes: 46 bits physical, 48 bits virtual
    Byte Order: Little Endian
    CPU(s): 16
    On-line CPU(s) list: 0-15
    Vendor ID: GenuineIntel
    Model name: Intel Xeon Processor (Skylake, IBRS)
    CPU family: 6
    Model: 85
    Thread(s) per core: 1
    Core(s) per socket: 1
    Socket(s): 16
    Stepping: 4
    BogoMIPS: 4389.68
    Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 arat pku ospke avx512_vnni md_clear
    Virtualization features:
    Hypervisor vendor: KVM
    Virtualization type: full
    Caches (sum of all):
    L1d: 512 KiB (16 instances)
    L1i: 512 KiB (16 instances)
    L2: 64 MiB (16 instances)
    L3: 256 MiB (16 instances)
    NUMA:
    NUMA node(s): 1
    NUMA node0 CPU(s): 0-15
    Vulnerabilities:
    Itlb multihit: KVM: Mitigation: VMX unsupported
    L1tf: Mitigation; PTE Inversion
    Mds: Mitigation; Clear CPU buffers; SMT Host state unknown
    Meltdown: Mitigation; PTI
    Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
    Retbleed: Mitigation; IBRS
    Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
    Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
    Spectre v2: Mitigation; IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS Not affected
    Srbds: Not affected
    Tsx async abort: Mitigation; Clear CPU buffers; SMT Host state unknown

  • Operating System:
    Ubuntu 22.04

  • SDK version:

$ python3 --version  # 3.11
$ make --version     # 4.3
$ g++ --version      # 11.4.0

Failure Information (for bugs)

...
FAILED: vendor/llama.cpp/examples/llava/llama-llava-cli
: && /usr/bin/g++ -pthread -B /mnt/x_env/compiler_compat -O3 -DNDEBUG vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llama-llava-cli.dir/llava-cli.cpp.o -o vendor/llama.cpp/examples/llava/llama-llava-cli -Wl,-rpath,/tmp/tmp6bws6ysg/build/vendor/llama.cpp/src:/tmp/tmp6bws6ysg/build/vendor/llama.cpp/ggml/src: vendor/llama.cpp/common/libcommon.a vendor/llama.cpp/src/libllama.so vendor/llama.cpp/ggml/src/libggml.so && :
/mnt/x_env/compiler_compat/ld: warning: libcuda.so.1, needed by vendor/llama.cpp/ggml/src/libggml.so, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: warning: libgomp.so.1, needed by vendor/llama.cpp/ggml/src/libggml.so, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: warning: libdl.so.2, needed by /usr/local/cuda-12.4/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: warning: libpthread.so.0, needed by /usr/local/cuda-12.4/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: warning: librt.so.1, needed by /usr/local/cuda-12.4/lib64/libcudart.so.12, not found (try using -rpath or -rpath-link)
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemCreate'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `GOMP_barrier@GOMP_1.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemAddressReserve'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemUnmap'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `GOMP_parallel@GOMP_4.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemSetAccess'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuDeviceGet'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `omp_get_thread_num@OMP_1.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemAddressFree'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuGetErrorString'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `GOMP_single_start@GOMP_1.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuDeviceGetAttribute'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemMap'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemRelease'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `omp_get_num_threads@OMP_1.0'
/mnt/x_env/compiler_compat/ld: vendor/llama.cpp/ggml/src/libggml.so: undefined reference to `cuMemGetAllocationGranularity'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.

*** CMake build failed
error: subprocess-exited-with-error

× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.

Steps to Reproduce

  1. conda activate <my_env>
  2. CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --verbose

Nvidia Driver Version: 550.54.14
CUDA Toolkit Version: V12.4.99

@gillbates

same issue here ...

@bteinstein

same issue here too

@XingchenMengxiang

same issue here too

@TobiasKlapper

Same here

@SweetestRug

Same here as well.

@bodybreaker

Same here too

@bodybreaker

(quoting the original report above)

I solved this problem.
This happens when the CUDA driver version differs from the CUDA toolkit version.

Check the CUDA version with nvidia-smi,

and check the CUDA toolkit version with conda list | grep cuda-toolkit.

My versions were 12.2 and 11.8.
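
A minimal check sketch, assuming a conda-managed toolkit (the package may be named cuda-toolkit or cudatoolkit depending on how it was installed):

# driver-side CUDA version (shown in the nvidia-smi header)
nvidia-smi

# toolkit version in the active conda environment
conda list | grep -i cuda

# toolkit version of the nvcc the build will use
nvcc --version

If the two disagree, align them (upgrade the driver or install a matching toolkit) before rebuilding the wheel.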

@Viagounet commented Jul 31, 2024

Same here.
Installation worked fine with CMAKE_ARGS="-DLLAMA_CUBLAS=on" for llama-cpp-python <= 2.79.0.
I now get the same error as OP for llama-cpp-python >= 2.80.0, whether I use CMAKE_ARGS="-DLLAMA_CUBLAS=on" or CMAKE_ARGS="-DGGML_CUDA=on".

@hhhhpaaa

same issue here too in WSL2

@gilbertc

same issue here too, WSL2 on Windows 10.

@tigert1998

same issue here

@tigert1998

I found a workaround to fix this issue (a command-line sketch follows after the steps):

  1. clone this project and check out the version you would like to install
  2. build this project with CMake
  3. then here comes the key part: overwrite pyproject.toml with the following content:
# [build-system]
# requires = ["scikit-build-core[pyproject]>=0.9.2"]
# build-backend = "scikit_build_core.build"

[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "llama_cpp_python"
dynamic = ["version"]
description = "Python bindings for the llama.cpp library"
readme = "README.md"
license = { text = "MIT" }
authors = [
    { name = "Andrei Betlen", email = "abetlen@gmail.com" },
]
dependencies = [
    "typing-extensions>=4.5.0",
    "numpy>=1.20.0",
    "diskcache>=5.6.1",
    "jinja2>=2.11.3",
]
requires-python = ">=3.8"
classifiers = [
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.8",
    "Programming Language :: Python :: 3.9",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
]


[project.optional-dependencies]
server = [
    "uvicorn>=0.22.0",
    "fastapi>=0.100.0",
    "pydantic-settings>=2.0.1",
    "sse-starlette>=1.6.1",
    "starlette-context>=0.3.6,<0.4",
    "PyYAML>=5.1",
]
test = [
    "pytest>=7.4.0",
    "httpx>=0.24.1",
    "scipy>=1.10",
]
dev = [
    "black>=23.3.0",
    "twine>=4.0.2",
    "mkdocs>=1.4.3",
    "mkdocstrings[python]>=0.22.0",
    "mkdocs-material>=9.1.18",
    "pytest>=7.4.0",
    "httpx>=0.24.1",
]
all = [
    "llama_cpp_python[server,test,dev]",
]

# [tool.scikit-build]
# wheel.packages = ["llama_cpp"]
# cmake.verbose = true
# cmake.minimum-version = "3.21"
# minimum-version = "0.5.1"
# sdist.include = [".git", "vendor/llama.cpp/*"]

[tool.setuptools.packages.find]
include = ["llama_cpp"]

[tool.setuptools.package-data]
"llama_cpp" = ["lib/*"]

[tool.scikit-build.metadata.version]
provider = "scikit_build_core.metadata.regex"
input = "llama_cpp/__init__.py"

[project.urls]
Homepage = "https://github.com/abetlen/llama-cpp-python"
Issues = "https://github.com/abetlen/llama-cpp-python/issues"
Documentation = "https://llama-cpp-python.readthedocs.io/en/latest/"
Changelog = "https://llama-cpp-python.readthedocs.io/en/latest/changelog/"

[tool.pytest.ini_options]
testpaths = "tests"
  4. run pip install . --verbose
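
For completeness, a minimal command-line sketch of the steps above. The repository URL, the tag name, and the build output paths are assumptions and may need adjusting for your version:

# 1. clone and check out the version you want (tag name is an example)
git clone --recursive https://github.com/abetlen/llama-cpp-python.git
cd llama-cpp-python
git checkout v0.2.90
git submodule update --init --recursive

# 2. build with CMake, CUDA enabled
cmake -B build -DGGML_CUDA=on
cmake --build build --config Release -j

# 3. overwrite pyproject.toml with the setuptools-based content above, then
#    copy the built shared libraries to where [tool.setuptools.package-data]
#    expects them (adjust to wherever your build places libllama/libggml)
mkdir -p llama_cpp/lib
cp build/vendor/llama.cpp/src/libllama.so llama_cpp/lib/
cp build/vendor/llama.cpp/ggml/src/libggml*.so llama_cpp/lib/

# 4. install
pip install . --verbose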

@blkqi commented Aug 29, 2024

Adding the path to libcuda.so to the LD_LIBRARY_PATH environment variable allows the examples to link so that the build can succeed.
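
A minimal sketch of that approach, assuming libcuda.so lives in one of the usual driver locations (the exact directory varies by distro and driver install):

# find where the driver's libcuda.so is installed
ldconfig -p | grep libcuda

# add that directory to the runtime library search path, then rebuild
export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --verbose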

@PurnaChandraPanda

Hello @blkqi,

How did it work for you? Could you share which environment or path settings you used?

@JHH11 commented Sep 16, 2024

Thank you @blkqi. Your advice really helped me. In my case, I used a Dockerfile like this:

ENV LD_LIBRARY_PATH=/usr/local/cuda-12.4/compat/libcuda.so
RUN CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python==0.2.90
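
As an aside (not part of the original comment): LD_LIBRARY_PATH is a colon-separated list of directories, so pointing it at the compat directory rather than at the .so file itself is the more conventional form, e.g.:

ENV LD_LIBRARY_PATH=/usr/local/cuda-12.4/compat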

@levn commented Oct 6, 2024

sudo apt install libcuda-12.4-1
which installs
/usr/lib/x86_64-linux-gnu/libcuda.so
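
To confirm the library is now on the default linker search path (the package name above may vary by driver and repository setup):

ldconfig -p | grep libcuda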
