
Help: Quantized llama-7b model with custom prompt format produces only gibberish #276

Glavin001 opened this issue Jul 15, 2023 · 1 comment


@Glavin001

Could someone help me with how to quantize my own model with GPTQ-for-LLaMa?
See the screenshots of the output I am getting 😢

Original full model: https://huggingface.co/Glavin001/startup-interviews-13b-int4-2epochs-1
Working quantized model with AutoGPTQ (screenshots): https://huggingface.co/Glavin001/startup-interviews-13b-2epochs-4bit-2
Dataset: https://huggingface.co/datasets/Glavin001/startup-interviews
Command I used in my attempt to quantize, using https://github.com/qwopqwop200/GPTQ-for-LLaMa :

CUDA_VISIBLE_DEVICES=0 python3 llama.py /workspace/text-generation-webui/models/Glavin001_startup-interviews-13b-int4-2epochs-1/ c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors startup-interviews-llama7b-4bit-128g.safetensors
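(Here the positional c4 argument selects the calibration dataset used during quantization; llama.py also accepts wikitext2 and ptb for that argument.)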

Quantized model (screenshots): Glavin001/startup-interviews-llama7b-v0.1-GPTQ ( https://huggingface.co/Glavin001/startup-interviews-llama7b-v0.1-GPTQ/tree/main )
Tested with (and reproducible via) TheBloke's Runpod template: https://github.com/TheBlokeAI/dockerLLM/
Model loader: both AutoGPTQ and ExLlama produce gibberish/garbage output.
Example prompt:

<|prompt|>What is a MVP?</s><|answer|>

Possible problems:
I'm still learning about quantization. I notice there is a dataset argument, currently set to the c4 dataset, but the dataset and prompt style for this model are different. I'm not sure how to customize this; maybe I need a custom Python script instead of the llama.py CLI? (A sketch of that approach follows below.)
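One possible approach, sketched here but not tested, is to skip llama.py and quantize with AutoGPTQ directly, passing calibration examples built in the model's own prompt format instead of c4. This follows AutoGPTQ's basic-usage API (BaseQuantizeConfig, quantize(), save_quantized()); the "question" and "answer" column names are assumptions and would need to match the real schema of the startup-interviews dataset:

from datasets import load_dataset
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_dir = "/workspace/text-generation-webui/models/Glavin001_startup-interviews-13b-int4-2epochs-1/"

tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True)

# Build calibration examples in the model's own prompt format instead of c4.
# "question" and "answer" are hypothetical column names; adjust to the dataset's actual schema.
rows = load_dataset("Glavin001/startup-interviews", split="train").select(range(128))
examples = [
    tokenizer(f"<|prompt|>{row['question']}</s><|answer|>{row['answer']}")
    for row in rows
]

# desc_act=False corresponds to omitting --act-order in llama.py.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)
model = AutoGPTQForCausalLM.from_pretrained(model_dir, quantize_config)
model.quantize(examples)
model.save_quantized("startup-interviews-13b-4bit-128g", use_safetensors=True)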

It took an hour or so to generate this, so I'd like to get it right next time 😂

Any advice would be greatly appreciated! Thanks in advance!

[Screenshots: broken gibberish output from the GPTQ-for-LLaMa quantization vs. working output from the AutoGPTQ quantization]
@CheshireAI

Try without --act-order.
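For reference, that would be the original command with the flag dropped:

CUDA_VISIBLE_DEVICES=0 python3 llama.py /workspace/text-generation-webui/models/Glavin001_startup-interviews-13b-int4-2epochs-1/ c4 --wbits 4 --true-sequential --groupsize 128 --save_safetensors startup-interviews-llama7b-4bit-128g.safetensors

Some older GPTQ kernels reportedly mishandle --act-order combined with --groupsize, which can show up as exactly this kind of garbage output.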
