```
llama.cpp: can't use mmap because tensors are not aligned; convert to new format to avoid this
llama_model_load_internal: format = 'ggml' (old version with low tokenizer quality and no mmap support)
```
I am facing the same problem while loading the LLaMA embeddings. After some exploration, I found that a new quantization format (with speed improvements) was recently introduced in the supporting repository for llama-cpp-python. Refer to this pull request.
privateGPT and many other projects still use the old quantization format. As of now, there doesn't seem to be a way to convert old quantized models to the new format other than retrieving the source f16 models and re-quantizing them to the new format. A sketch of that workflow is below.
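If you do have access to the f16 source weights, re-quantizing with llama.cpp's tooling and then loading the result through llama-cpp-python should clear the warning. A minimal sketch follows; the model paths are placeholders, and the exact convert/quantize script names and arguments vary between llama.cpp versions, so treat the comments as an outline rather than exact invocations:

```python
from llama_cpp import Llama

# Outline of the re-quantization step (run from a llama.cpp checkout;
# script names and arguments differ across versions, so check your copy):
#   python convert.py /path/to/source-model --outtype f16
#   ./quantize ggml-model-f16.bin ggml-model-q4_0.bin q4_0

# Load the re-quantized model; the path here is a placeholder.
llm = Llama(model_path="./models/ggml-model-q4_0.bin")

# A model in the new format loads without the
# "can't use mmap because tensors are not aligned" warning.
out = llm("Q: Name the planets in the solar system. A:", max_tokens=32)
print(out["choices"][0]["text"])
```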