Running locally will not use GPU #269
Comments
Same problem here. GPU dedicated memory usage is only 12% of 12288 MB. Maybe a missing parameter to the llama-cpp API, like "--gpu-layers"?
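For what it's worth, llama-cpp-python does expose a layer-offload parameter (n_gpu_layers). A quick standalone sanity check of GPU offload, independent of Open Interpreter, could look like the sketch below; the model path and the value 35 are placeholders, not anything from this thread.

```bash
# Standalone test of GPU offload in llama-cpp-python (model path is a placeholder).
# With a cuBLAS build, the verbose load log should report layers offloaded to CUDA.
python -c "from llama_cpp import Llama; llm = Llama(model_path='/path/to/model.gguf', n_gpu_layers=35, verbose=True); print(llm('print(1+1)', max_tokens=8))"
```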
Same problem. CUDA drivers 12.0, works with oobabooga, but can't run CodeLlama 34B on open-interpreter; I can see the model is loaded in RAM even if GPU is selected. Nvidia A6000, Ubuntu 22.04.
Same here @guillaumenaud, works fine on oobabooga with several different models.
Had the same issue. You need to ensure this command comes back with 'True':
python -c "from llama_cpp import GGML_USE_CUBLAS; print(GGML_USE_CUBLAS)"
Here's what I had to do:
pip uninstall llama-cpp-python
...
The output of the last command should be: ...
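The reinstall command and its expected output were cut off above. At the time, the commonly documented way to rebuild llama-cpp-python against cuBLAS looked roughly like this sketch; it is not necessarily the exact command from this comment, and it requires the CUDA toolkit (nvcc) to be installed first.

```bash
pip uninstall -y llama-cpp-python
# Build llama-cpp-python from source with the cuBLAS backend enabled.
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir
# Should now print True:
python -c "from llama_cpp import GGML_USE_CUBLAS; print(GGML_USE_CUBLAS)"
```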
Thanks @vrijsinghani, I tried your code, but it came back with False. No mention of ... Thanks anyway for the advice.
Thanks for the reply. Just reinstalled a fresh Nvidia driver (535) and CUDA 12.2, nvcc now shows the correct bin, and I can see my card in nvidia-smi too. I followed your instructions (inside the activated venv); it was indeed showing False, and still is afterwards (still not working).
For Linux, please run the following command, which will create ...
Here it is, Jordan (took ages to create).
Can you please run `which python`?
(base) bob@BobLinuxMint:~$ which python
Is this what you mean?
Yes, thanks! And just for good measure, can you also do `which interpreter`?
(base) bob@BobLinuxMint:~$ which interpreter
This is a tough one. Your logs indicate that GPU support was compiled for ... Your paths to ... Just to make sure, can you please create a conda env to test this all in?
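The exact commands did not survive here. Judging from the follow-up further down (the env name oitest and Python 3.11 are taken from that later comment), the test environment was set up roughly like this sketch:

```bash
conda create -n oitest python=3.11
conda activate oitest
pip install open-interpreter
# Verify which llama-cpp-python build the env actually picks up:
python -c "from llama_cpp import GGML_USE_CUBLAS; print(GGML_USE_CUBLAS)"
interpreter --local
```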
Thanks for all your time and effort Jordan. Here are the final few lines...
Downloading typing_extensions-4.7.1-py3-none-any.whl (33 kB)
Well, I don't like to do this, but if you can't get it to work in a virtual environment like conda, then I'm out of ideas. You might be able to get support from the https://github.com/abetlen/llama-cpp-python or https://github.com/ggerganov/llama.cpp repos.
Thanks Jordan
Lol, I forgot to install the cuda-toolkit X-)
sudo apt install nvidia-cuda-toolkit nvidia-cuda-toolkit-gcc
That works for me :-)
Thanks @ahoepf, nearly worked! Still no joy, but getting closer...
Make a check with: ...
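The actual check did not survive extraction; given how the rest of the thread verifies GPU support, it was most likely the same cuBLAS flag check used earlier (an assumption):

```bash
# Should print True if llama-cpp-python was built against cuBLAS.
python -c "from llama_cpp import GGML_USE_CUBLAS; print(GGML_USE_CUBLAS)"
```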
Hrm, that was interesting. It failed when I was in my env, but I tried it outside of it and it succeeded, so I ran interpreter and it used my GPU >__<
@Videoteq So, I've got a PR that should fix this, but there's also a way to fix this with conda. In a conda env, run:
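The command itself was cut off here. Based on the AVX-only variant given later in the thread, it was presumably the prebuilt cuBLAS wheel index; the AVX2/cu122 part below is an assumption, matched to CPU and CUDA version.

```bash
# Assumed reconstruction: prebuilt cuBLAS wheels (AVX2 build, CUDA 12.2).
pip install --force-reinstall llama-cpp-python --prefer-binary \
  --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu122
```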
That should make GPU work. My PR will eliminate the need for conda, and if you're interested I'd appreciate it if you're able to test the steps in the PR and give some feedback.
@jordanbtucker Following previous attempts I switched to Windows (I have 2 SSDs in my PC, one Linux, the other Windows). When I rebooted into Linux only one of my two screens was working, so I had to reinstall the Nvidia drivers and generally faff around with the settings for a while. Anyway, once back to two screens I tried running --local and this time it all worked and installed llama-cpp. But then the terminal quit, so I tried again; all went well, but again the terminal quit at the last step. Tried your code above, but this time it failed with the error shown below.
Open Interpreter will use Code Llama for local execution. Use your arrow keys to ...
[?] Parameter count (smaller is faster, larger is more capable): 7B
[?] Quality (smaller is faster, larger is more capable): Small | Size: 2.6 GB, Estimated RAM usage: 5.1 GB
[?] Use GPU? (Large models might crash on GPU, but will run more quickly) (...: Y
Model found at /home/bob/.local/share/Open
Anyhow, thanks for all your help Jordan.
@jordanbtucker The plot thickens... Ran your code, but in a new env:
conda create -n oitest python=3.11
python -c "from llama_cpp import GGML_USE_CUBLAS; print(GGML_USE_CUBLAS)"
Result:
Successfully built llama-cpp-python
Good so far...
(oitest) bob@BobLinuxMint:~$ pip install open-interpreter
Then:
(oitest) bob@BobLinuxMint:~$ interpreter --local
Open Interpreter will use Code Llama for local execution. Use your arrow keys to ...
[?] Parameter count (smaller is faster, larger is more capable): 7B
[?] Quality (smaller is faster, larger is more capable): Small | Size: 2.6 GB, Estimated RAM usage: 5.1 GB
[?] Use GPU? (Large models might crash on GPU, but will run more quickly) (...: Y
Model found at /home/bob/.local/share/Open
Ran your code in the oitest env. Result:
Successfully installed diskcache-5.6.3 llama-cpp-python-0.1.85+cu122 numpy-1.25.2 typing-extensions-4.7.1
(oitest) bob@BobLinuxMint:~$ interpreter --local
Open Interpreter will use Code Llama for local execution. Use your arrow keys to ...
[?] Parameter count (smaller is faster, larger is more capable): 7B
[?] Quality (smaller is faster, larger is more capable): Small | Size: 2.6 GB, Estimated RAM usage: 5.1 GB
[?] Use GPU? (Large models might crash on GPU, but will run more quickly) (...: Y
Model found at /home/bob/.local/share/Open
Requirement already satisfied: llama-cpp-python in ./anaconda3/envs/oitest/lib/python3.11/site-packages (0.1.85+cu122)
During handling of the above exception, another exception occurred:
Traceback (most recent call last): ...
During handling of the above exception, another exception occurred:
Traceback (most recent call last): ...
During handling of the above exception, another exception occurred:
Traceback (most recent call last): ...
▌ Failed to install TheBloke/CodeLlama-7B-Instruct-GGUF.
Common Fixes: You can follow our simple setup docs at the link below to resolve ...
https://github.com/KillianLucas/open-interpreter/tree/main/docs
If you've tried that and you're still getting an error, we have likely not built (...
Running language models locally is a difficult task! If you have insight into ...
Press enter to switch to GPT-4 (recommended).
● Welcome to Open Interpreter.
────────────────────────────────────────
▌ OpenAI API key not found
To use GPT-4 (recommended) please provide an OpenAI API key.
To use Code-Llama (free but less capable) press enter.
Thanks for testing. It could be an issue with AVX2 support. Can you please run the following command and post its output:
grep flags /proc/cpuinfo
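A narrower check than reading through the full flags dump (assuming a Linux system) would be to grep for the avx2 flag directly; it prints nothing if the CPU lacks AVX2.

```bash
# Prints "avx2" once if the CPU advertises AVX2, otherwise prints nothing.
grep -o 'avx2' /proc/cpuinfo | sort -u
```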
Here it is, Jordan. A very long output!
(base) bob@BobLinuxMint:~$ grep flags /proc/cpuinfo
...
@jordanbtucker I have an ancient (but updated) PC which might cause snags. Here's the spec:
System: ...
@Videoteq Yep, your CPU doesn't support AVX2. Run the following command in your conda env:
pip install --force-reinstall llama-cpp-python --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX/cu122
I updated my PR to account for this. Hopefully I can merge it before the next release.
(base) bob@BobLinuxMint:~$ conda activate oitest
(oitest) bob@BobLinuxMint:~$ pip install --force-reinstall llama-cpp-python --prefer-binary --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX/cu122
Looking in indexes: https://pypi.org/simple, https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX/cu122
(oitest) bob@BobLinuxMint:~$ interpreter --local
Open Interpreter will use Code Llama for local execution. Use your arrow keys to ...
[?] Parameter count (smaller is faster, larger is more capable): 7B
[?] Quality (smaller is faster, larger is more capable): Small | Size: 2.6 GB, Estimated RAM usage: 5.1 GB
[?] Use GPU? (Large models might crash on GPU, but will run more quickly) (...: Y
Model found at /home/bob/.local/share/Open
Requirement already satisfied: llama-cpp-python in ./anaconda3/envs/oitest/lib/python3.11/site-packages (0.1.85+cu122)
During handling of the above exception, another exception occurred:
Traceback (most recent call last): ...
During handling of the above exception, another exception occurred:
Traceback (most recent call last): ...
During handling of the above exception, another exception occurred:
Traceback (most recent call last): ...
▌ Failed to install TheBloke/CodeLlama-7B-Instruct-GGUF.
Common Fixes: You can follow our simple setup docs at the link below to resolve ...
https://github.com/KillianLucas/open-interpreter/tree/main/docs
If you've tried that and you're still getting an error, we have likely not built (...
Running language models locally is a difficult task! If you have insight into ...
Press enter to switch to GPT-4 (recommended).
Thanks. In my case, with an FX-8350 CPU, which does NOT support AVX2, and an RTX 4090 GPU, running Linux Mint 21 (Ubuntu 22), these commands can be used to install open-interpreter to run on the GPU (note that I keep my conda env at a path rather than under the default name; most people probably don't want to do this):
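The commands themselves were lost in extraction. Going by the constraints described (no AVX2, a CUDA 12.x wheel, a conda env kept at a path) and the prompt shown in the reply below, they presumably looked something like this sketch; the env path is taken from that reply and everything else is an assumption.

```bash
# Hypothetical reconstruction; the env path matches the prompt in the reply below.
conda create -p /home/bob/conda-oitest python=3.11
conda activate /home/bob/conda-oitest
pip install open-interpreter
# AVX-only (no AVX2) prebuilt cuBLAS wheel for CUDA 12.2:
pip install --force-reinstall llama-cpp-python --prefer-binary \
  --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX/cu122
```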
Hey @iateadonut, it worked! Here's a little test run with a small model, TheBloke/CodeLlama-7B-Instruct-GGUF.
(/home/bob/conda-oitest) bob@BobLinuxMint:~$ interpreter --local
Open Interpreter will use Code Llama for local execution. Use your arrow keys to ...
[?] Parameter count (smaller is faster, larger is more capable): 7B
[?] Quality (smaller is faster, larger is more capable): Small | Size: 2.6 GB, Estimated RAM usage: 5.1 GB
[?] Use GPU? (Large models might crash on GPU, but will run more quickly) (...: Y
Model found at /home/bob/.local/share/Open
▌ Model set to TheBloke/CodeLlama-7B-Instruct-GGUF
Open Interpreter will require approval before running code. Use interpreter -y ...
Press CTRL-C to exit.
Plan:
1 Check if the user is in the correct directory.
Let me know if you have any questions or need clarification on this plan.
ls -l | grep "^d" | wc -l
Would you like to run this code? (y/n) y
ls -l | grep "^d" | wc -l
Traceback (most recent call last): ...
ls -l | grep "^d" | wc -l
Traceback (most recent call last): ...
ls -l | grep "^d" | wc -l
Traceback (most recent call last): ...
It looks like there was a syntax error in the code. The error message ...
To fix this issue, I would suggest using a different approach to achieve the ...
import os
# Get a list of all files on the desktop
files = os.listdir('/home/bob/Desktop')
# Filter the list to only include directories
directories = [file for file in files if ...]
print(len(directories))
Would you like to run this code? (y/n) y
import os
# Get a list of all files on the desktop
files = os.listdir('/home/bob/Desktop')
# Filter the list to only include directories
directories = [file for file in files if ...]
print(len(directories))
3
3 directories are found on the desktop.
ls -l | grep "^l" | wc -l
Would you like to run this code? (y/n) y
ls -l | grep "^l" | wc -l
Traceback (most recent call last): ...
ls -l | grep "^l" | wc -l
Traceback (most recent call last): ...
ls -l | grep "^l" | wc -l
Traceback (most recent call last): ...
3 directories are found on the desktop. However, there seems to be a syntax ...
To fix this issue, I would suggest using a different approach to achieve the ...
import os
# Get a list of all files on the desktop
files = os.listdir('/home/bob/Desktop')
# Filter the list to only include shortcuts
shortcuts = [file for file in files if ...]
print(len(shortcuts))
Would you like to run this code? (y/n) y
import os
# Get a list of all files on the desktop
files = os.listdir('/home/bob/Desktop')
# Filter the list to only include shortcuts
shortcuts = [file for file in files if ...]
print(len(shortcuts))
3
3 shortcuts are found on the desktop.
3 shortcuts are found on the desktop. Here they are:
import os
# Get a list of all files on the desktop
files = os.listdir('/home/bob/Desktop')
# Filter the list to only include shortcuts
shortcuts = [file for file in files if ...]
print(shortcuts)
Would you like to run this code? (y/n) y
import os
# Get a list of all files on the desktop
files = os.listdir('/home/bob/Desktop')
# Filter the list to only include shortcuts
shortcuts = [file for file in files if ...]
print(shortcuts)
['oobabooga_linux', 'Documents', 'AI']
3 shortcuts are found on the desktop. Here they are: ['oobabooga_linux', 'Documents', 'AI']
Installing the toolkit did it for me. Thanks @ahoepf, now the command python -c "from llama_cpp import GGML_USE_CUBLAS; print(GGML_USE_CUBLAS)" does return True. However, when I run interpreter -l I get to go through the options, finally choose the GPU, and then the interpreter just exits silently.
OK, after running ...
In Summary: ...
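The summary itself was truncated. Pieced together from the steps that worked earlier in the thread, it was roughly the following sketch (pick the AVX or AVX2 wheel to match your CPU):

```bash
# Recap of the working recipe from this thread (sketch, not the original summary).
sudo apt install nvidia-cuda-toolkit nvidia-cuda-toolkit-gcc
pip install --force-reinstall llama-cpp-python --prefer-binary \
  --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX/cu122
python -c "from llama_cpp import GGML_USE_CUBLAS; print(GGML_USE_CUBLAS)"  # should print True
interpreter --local
```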
Does anyone know if the same strategy will work with Intel GPUs? I.e. install Intel drivers and toolkit, then recompile llama.cpp.
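Not confirmed by anyone in this thread, but llama.cpp at that time also shipped an OpenCL (CLBlast) backend, so one untested possibility for Intel GPUs would be to rebuild llama-cpp-python against it:

```bash
# Untested sketch for Intel GPUs via llama.cpp's OpenCL/CLBlast backend of that era.
sudo apt install opencl-headers ocl-icd-opencl-dev libclblast-dev
CMAKE_ARGS="-DLLAMA_CLBLAST=on" FORCE_CMAKE=1 pip install --force-reinstall llama-cpp-python --no-cache-dir
```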
Describe the bug
Running CodeLlama .gguf models locally on Linux: the interpreter prompt asks whether to use the GPU (in my case an RTX 3060) but then doesn't use it, so it's really slow. :-(
Reproduce
N/A
Expected behavior
Expect the GPU to be used.
Screenshots
No response
Open Interpreter version
0.1.3
Python version
3.11.4
Operating System name and version
Linux Mint 21.2 Cinnamon 5.8.4 Kernel 5.15.0-83-generic
Additional context
All 8 cores in i7 running at very high utilization.
GPU dedicated memory usage: 12% of 12288 MB
GPU utilization: 4% to 30%