New ggml llamacpp file format support #4
Hi. Support for the new quantization format was discussed previously here. Since it is a breaking change, I haven't updated yet. There are a few more features I'm adding to the Python library, after which I will update the C backend and look into adding it.

Thanks for the model link. Do you have a link to a smaller LLaMA-7B model quantized in the latest format that I can use for testing? My machine doesn't have enough RAM to run larger models.
Of course. TheBloke's Hugging Face models are well organized; just search for GGML models on his space and you can find anything you want. The best one is WizardLM-7B-uncensored-GGML, which you can find here:
https://huggingface.co/TheBloke/WizardLM-7B-uncensored-GGML
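For anyone following along, here is a minimal sketch of fetching a single quantized GGML file from that repo with `huggingface_hub`. The `filename` below is a placeholder, so check the repo's file list for the exact name and quantization level you want:

```python
from huggingface_hub import hf_hub_download

# Download a single quantized GGML file instead of cloning the whole repo.
# The filename is a placeholder -- pick the exact file from the repo's
# "Files and versions" tab (names encode the quantization, e.g. q4_0).
model_path = hf_hub_download(
    repo_id="TheBloke/WizardLM-7B-uncensored-GGML",
    filename="WizardLM-7B-uncensored.ggmlv3.q4_0.bin",  # placeholder
)
print(model_path)  # local path to the cached model file
```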
This is released in the latest version, 0.2.0. It now supports LLaMA and MPT models, and it also includes the most recent breaking change: ggerganov/llama.cpp#1508
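The comment doesn't spell out the Python API, but the 0.2.0 description (LLaMA and MPT support) matches a ctransformers-style interface; a minimal sketch under that assumption, with a placeholder model path:

```python
from ctransformers import AutoModelForCausalLM

# Load a local GGML-format model; model_type selects the architecture
# ("llama" or "mpt", the two backends mentioned in the comment above).
llm = AutoModelForCausalLM.from_pretrained(
    "./models/WizardLM-7B-uncensored.ggmlv3.q4_0.bin",  # placeholder path
    model_type="llama",
)

# The loaded model is callable for simple text completion.
print(llm("AI is going to"))
```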
Hi, and thanks for this beautiful work.
Are you planning on supporting version 2 of the llamacpp file format? I want to add the OpenAssistant model to my GPT4ALL-ui and can't find a Python binding that supports it.
Here is TheBloke's version of the model:
https://huggingface.co/TheBloke/OpenAssistant-SFT-7-Llama-30B-GGML/tree/main
Best regards