feat: IQ quants support #2631
Comments
Hi @mr-september, many thanks!
Thank you, @mr-september.
Hi @mr-september, using Jan v0.4.10-368 ✅, the Starling_Monarch_Westlake_Garten-7B-v0.1-IQ4_XS.gguf model is able to generate responses. Would you like to try it as well? Thank you.
Beautiful, it's working flawlessly! Very impressive turnaround!
Hi, I think the latest nightly (-376) broke support again. It was a prompted update at startup. Rolling back to -368 still works.
Hi @mr-september, sorry for the inconvenience. The Nitro version that supports IQ quants is currently facing several issues, so we had to temporarily revert it. We are working on a fix at the moment.
Hi @mr-september, the latest nightly build, Jan v0.4.11-386, resolves the issue with IQ quants. Thanks.
Problem
GGUF models quantized with IQ quants fail to load.
Success Criteria
IQ-quantized models load and run as usual.
Additional context
IQ quants: ggerganov/llama.cpp#4773
Example model with both traditional Q and new IQ quants: https://huggingface.co/bartowski/Starling_Monarch_Westlake_Garten-7B-v0.1-GGUF
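As a reference point for reproducing or triaging this, a loader could detect IQ-quantized files up front from the quant suffix in the filename instead of failing mid-load. Below is a minimal Python sketch; the suffix list is assumed from the quant type names used in llama.cpp and may be incomplete, and the function itself is hypothetical, not part of Jan or Nitro:

```python
import re

# IQ ("importance-matrix") quant suffixes as named in llama.cpp.
# Assumed list for illustration; may not be exhaustive.
IQ_QUANT_SUFFIXES = (
    "IQ1_S", "IQ2_XXS", "IQ2_XS", "IQ2_S",
    "IQ3_XXS", "IQ3_S", "IQ4_NL", "IQ4_XS",
)

def uses_iq_quant(gguf_filename: str) -> bool:
    """Return True if the GGUF filename advertises an IQ quantization type."""
    upper = gguf_filename.upper()
    # Word boundaries keep e.g. "IQ2_XS" from matching inside a longer token.
    return any(re.search(rf"\b{suffix}\b", upper) for suffix in IQ_QUANT_SUFFIXES)
```

With the example model above, `uses_iq_quant("Starling_Monarch_Westlake_Garten-7B-v0.1-IQ4_XS.gguf")` returns `True`, while a traditional quant such as a `Q4_K_M` file returns `False`, so a frontend could surface a clear "IQ quants unsupported" message on older backends rather than a generic load failure.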