Broken: GPTQ-for-LLaMa
Working: AutoGPTQ
Could someone help me with how to quantize my own model with GPTQ-for-LLaMa?
See the screenshot of the output I am getting 😢
Original full model: https://huggingface.co/Glavin001/startup-interviews-13b-int4-2epochs-1
Working quantized model with AutoGPTQ (screenshots): https://huggingface.co/Glavin001/startup-interviews-13b-2epochs-4bit-2
Dataset: https://huggingface.co/datasets/Glavin001/startup-interviews
Command I used in my attempt to quantize (a representative invocation is shown below this list): https://github.com/qwopqwop200/GPTQ-for-LLaMa
Resulting (broken) quantized model (screenshots): Glavin001/startup-interviews-llama7b-v0.1-GPTQ (https://huggingface.co/Glavin001/startup-interviews-llama7b-v0.1-GPTQ/tree/main)
Tested with (and reproducible via) TheBloke's Runpod template: https://github.com/TheBlokeAI/dockerLLM/
Model loader: with both AutoGPTQ and ExLlama, the output looks like gibberish/garbage.
Example prompt:
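As for the command itself: I followed the llama.py usage documented in the GPTQ-for-LLaMa README, roughly like this (the model path is a placeholder, and I may not have used exactly these flags):

```bash
# Pattern from the GPTQ-for-LLaMa README; the model path is a placeholder.
CUDA_VISIBLE_DEVICES=0 python llama.py /path/to/startup-interviews-13b c4 \
  --wbits 4 --true-sequential --act-order --groupsize 128 \
  --save startup-interviews-13b-4bit-128g.pt
```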
Possible problems:
I'm still learning about quantization. I notice there is a dataset field, set to the c4 dataset, but the dataset and prompt style for this model are different. I'm not sure how to customize this, though; maybe I need a custom Python script instead of using the llama.py CLI? A sketch of what I have in mind is below.
It took an hour or so to generate this, so I'd like to get it right next time 😂
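Here's a rough sketch of the kind of script I'm imagining, using AutoGPTQ (the path that already works for me) with my own calibration data instead of c4. The split name, text column, sample count, and output path are all assumptions on my part, not something I've verified:

```python
# Rough sketch (unverified): quantize with AutoGPTQ using custom
# calibration data instead of c4. Split/column names are assumptions.
from datasets import load_dataset
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

base_model = "Glavin001/startup-interviews-13b-int4-2epochs-1"
out_dir = "startup-interviews-13b-4bit-custom"  # hypothetical output path

tokenizer = AutoTokenizer.from_pretrained(base_model, use_fast=True)

# Assumption: a "train" split with a "text" column already formatted
# in the model's own prompt style.
dataset = load_dataset("Glavin001/startup-interviews", split="train")
texts = dataset["text"][:128]  # ~128 calibration samples as a starting point

# AutoGPTQ takes a list of tokenized examples for calibration.
examples = [tokenizer(t, truncation=True, max_length=2048) for t in texts]

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)
model = AutoGPTQForCausalLM.from_pretrained(base_model, quantize_config)

model.quantize(examples)  # GPTQ calibration on the custom data
model.save_quantized(out_dir)
tokenizer.save_pretrained(out_dir)
```

Does calibrating on prompt-style-matched text like this actually matter much, or is c4 usually fine in practice?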
Any advice would be greatly appreciated! Thanks in advance!