```
llama.cpp: can't use mmap because tensors are not aligned; convert to new format to avoid this
llama_model_load_internal: format = 'ggml' (old version with low tokenizer quality and no mmap support)
```
I am facing the same problem while loading the LLaMA embeddings. After some exploration, I found that a new quantization format (with speed improvements) was recently introduced in the supporting repository for llama-cpp-python. Refer to this pull request.
privateGPT and many other projects still use the old quantization format. As of now, there doesn't seem to be a way to convert old quantized models to the new format other than retrieving the source f16 models and re-quantizing them to the new format. A sketch of that workflow is below.
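If you do have access to the f16 source weights, re-quantizing with llama.cpp's tooling and then loading the result through llama-cpp-python should clear the warning. A minimal sketch follows; the model paths are placeholders, and the exact convert/quantize script names and arguments vary between llama.cpp versions, so treat the comments as an outline rather than exact invocations:

```python
from llama_cpp import Llama

# Outline of the re-quantization step (run from a llama.cpp checkout;
# script names and arguments differ across versions, so check your copy):
#   python convert.py /path/to/source-model --outtype f16
#   ./quantize ggml-model-f16.bin ggml-model-q4_0.bin q4_0

# Load the re-quantized model; the path here is a placeholder.
llm = Llama(model_path="./models/ggml-model-q4_0.bin")

# A model in the new format loads without the
# "can't use mmap because tensors are not aligned" warning.
out = llm("Q: Name the planets in the solar system. A:", max_tokens=32)
print(out["choices"][0]["text"])
```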