After fine-tuning, the model repeat the answer and not stop #467

Open
abdoelsayed2016 opened this issue May 19, 2023 · 13 comments

Comments

@abdoelsayed2016

abdoelsayed2016 commented May 19, 2023

After fine-tuning, I have encountered an unexpected behavior where the model repeatedly generates the same answer without stopping.

I tried different LLaMA base models:

decapoda-research/llama-7b-hf
yahma/llama-7b-hf

Here are some examples:

Tell me about alpacas.


1. Alpacas are from South America.
2. Alpacas are from South America.
3. Alpacas are from South America.
4. Alpacas are from South America.
5. Alpacas are from South America.
6. Alpacas are from South America.
7. Alpacas are from South America.
8. Alpacas are from South America.
9. Alpacas are from South America.
10. Alpacas are from South America.
11. Alpac

who played charlie bucket in the original charlie and the chocolate factory

\begin{itemize}
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\item Freddie Highmore
\

Also, the adapter_model.bin file is only 1 KB.
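A quick way to catch this symptom before running inference is to check the size of the saved adapter file. The following is a minimal sketch, not part of the repository: the helper name `adapter_looks_empty` and the 1 MB threshold are assumptions chosen for illustration, since a file of about 1 KB cannot hold trained LoRA weights.

```python
import os


def adapter_looks_empty(path, min_bytes=1_000_000):
    """Return True if the adapter file is suspiciously small.

    min_bytes is an assumed threshold (1 MB), not a value taken from
    the repository or from peft.
    """
    size = os.path.getsize(path)  # size of the file on disk, in bytes
    return size < min_bytes


if __name__ == "__main__":
    # Hypothetical output path; adjust to your own training output directory.
    adapter_path = os.path.join("lora-alpaca", "adapter_model.bin")
    if os.path.exists(adapter_path) and adapter_looks_empty(adapter_path):
        print(f"{adapter_path} is very small; the adapter weights were probably not saved.")
```

If the check fires, the repeated output is likely a consequence of loading an adapter without trained weights rather than a generation problem.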

@abdoelsayed2016 abdoelsayed2016 changed the title after finetue the model repeat the answer and not stop After fine-tuning, the model repeat the answer and not stop May 19, 2023
@aryanq123

aryanq123 commented May 20, 2023

Can someone assign this task to me? I would like to work on it.

Please also explain it briefly.

@Qubitium

@abdoelsayed2016 You are using an old model that is no longer recommended, as it was converted to HF format using the wrong tokenizer parameters. Check the repo README for valid 7B HF models.

@abdoelsayed2016
Author

abdoelsayed2016 commented May 20, 2023

@diegomontoya I have already read the README many times. I see that, for fine-tuning, the base model is decapoda-research/llama-7b-hf.
I have two questions. First, why does the model repeat the answer? The model doesn't seem to know when it should stop.
image

The second question: the adapter weight file is only 1 KB. Is that normal?

image

@abdoelsayed2016
Author

I also tried huggyllama/llama-7b.

@abdoelsayed2016
Author

I see a lot of issues discussing the adapter size. I tried all the suggested solutions, but the adapter is still 1 KB.

Has anyone else faced the same issue? Training is costly.

@asenmitrev

asenmitrev commented May 20, 2023

I was able to solve this issue by following this comment: uninstall peft and install it from a specific GitHub commit.

I was getting desperate with the repetitions for both models (yahma and decapoda-research), but I think the root cause is the empty adapter_model.bin.

If you want to do a quick test without retraining, take pytorch_model.bin from the last checkpoint and use it in place of adapter_model.bin, renaming it accordingly. This allowed me to test the model, and the repetition was fixed.
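The quick test described above can be scripted. This is a minimal sketch under assumed paths: the directory names in the usage note and the helper name `use_checkpoint_as_adapter` are hypothetical, and the function backs up the existing adapter file before overwriting it so the swap can be undone.

```python
import shutil
from pathlib import Path


def use_checkpoint_as_adapter(checkpoint_dir, adapter_dir):
    """Copy <checkpoint_dir>/pytorch_model.bin to <adapter_dir>/adapter_model.bin.

    Any existing adapter_model.bin is first saved as adapter_model.bin.bak.
    Returns the path of the new adapter file.
    """
    src = Path(checkpoint_dir) / "pytorch_model.bin"
    dst = Path(adapter_dir) / "adapter_model.bin"
    if not src.is_file():
        raise FileNotFoundError(f"checkpoint weights not found: {src}")
    if dst.exists():
        # Keep the original (possibly empty) adapter so the swap is reversible.
        shutil.copy2(dst, dst.with_name(dst.name + ".bak"))
    shutil.copy2(src, dst)
    return dst
```

For example, `use_checkpoint_as_adapter("lora-alpaca/checkpoint-200", "lora-alpaca")` would apply the swap for a hypothetical output directory named `lora-alpaca`; substitute your own checkpoint folder.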

@abdoelsayed2016
Author

abdoelsayed2016 commented May 20, 2023

@asenmitrev

I tried this solution yesterday, but adapter_model.bin is still 1 KB:


pip uninstall peft -y
pip install git+https://github.com/huggingface/peft.git@e536616888d51b453ed354a6f1e243fecb02ea08

I tried again after uninstalling peft. Here is the output:

image

@abdoelsayed2016
Author

abdoelsayed2016 commented May 21, 2023

I also tried the solutions from:
huggingface/peft#286
#37 (comment)

The adapter size is now 7 GB, and the model is back to repeating its answers.

image

I also renamed the checkpoint file from pytorch_model.bin to adapter_model.bin, with the same output.

@asenmitrev

I tried decapoda-research and the answer is repeated. yahma is fine.

@abdoelsayed2016
Author

abdoelsayed2016 commented May 21, 2023

@asenmitrev yahma with which version of peft?
I tried it and got the same problem, so I am confused.
The maintainers should add a note to the README or pin the requirements.

@asenmitrev

The one quoted in my first answer.

@SimpleConjugate

SimpleConjugate commented May 24, 2023

Try adding the repetition_penalty keyword argument to the generation config in the evaluate function. A value of repetition_penalty > 1 should do it. I fine-tuned a model and used repetition_penalty=2 to resolve the problem for myself.
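To see why a value above 1 discourages loops, here is a minimal pure-Python sketch of the usual repetition-penalty rule: the score of every token already generated is divided by the penalty if it is positive and multiplied by it if it is negative. The function name and the toy logits are illustrative assumptions; the real adjustment happens inside the generation library on tensors, not lists.

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    """Return a copy of logits with already-generated tokens penalized.

    logits: list of floats, one score per vocabulary token.
    generated_ids: token ids produced so far.
    penalty: values > 1 make repeated tokens less likely; 1.0 is a no-op.
    """
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # Positive scores shrink, negative scores become more negative.
        adjusted[token_id] = score * penalty if score < 0 else score / penalty
    return adjusted


# Toy vocabulary of three tokens; tokens 0 and 2 were already generated.
scores = apply_repetition_penalty([2.0, 1.5, -1.0], generated_ids=[0, 2], penalty=2.0)
best = max(range(len(scores)), key=scores.__getitem__)
```

In this toy case token 0 had the highest raw score but was already emitted, so after the penalty the unused token 1 ranks first. Note that this only masks the symptom at decoding time; it does not repair an adapter that was saved without weights.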

@SuperBruceJia

SuperBruceJia commented Dec 15, 2024

Issue

We also encountered this repeated-response issue while training a LoRA adapter on GPTQ-quantized models, specifically GPTQ-quantized Command R Plus (our model) and GPTQ-quantized LLaMA 3.3 70B (our model). After GPTQ quantization, the model performs well without noticeable repetition. However, after we fine-tuned the LoRA adapter, the repetition appeared. We traced the problem to an increasing training loss and an increasing gradient norm.

Training Loss Before Using Our Solution

Please check our training loss and gradient norm figures.
image

Repeated Response Examples

Here are the responses from the quantized model + trained LoRA:
image
image
image
image

The Reason for This Issue and Our Solution

We believe the increasing training loss was caused by exploding gradients, so we changed our hyperparameter settings. Please also check our model loader if you are interested. With these settings, we solved the issue and achieved stable training:

Training Loss After Using Our Solution

quantized_model_training_progress
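The specific hyperparameter changes are not reproduced in this comment. As a generic illustration only, and not the settings used above, one common mitigation for exploding gradients is clipping by global norm: if the overall gradient norm exceeds a limit, every gradient is rescaled by the same factor. The function name and values below are assumptions for the sketch.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale grads so their L2 norm does not exceed max_norm.

    grads: flat list of gradient values (floats).
    Returns (clipped_grads, original_norm).
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm == 0.0 or total_norm <= max_norm:
        return list(grads), total_norm  # already within the limit
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm


# Gradient with norm 5.0 clipped to norm 1.0; direction is preserved.
clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
```

Training frameworks expose the same idea as a setting (for example, a maximum gradient norm), usually combined with a lower learning rate or a warmup schedule.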

Model Inference

Based on our model's responses, we believe we have fundamentally solved the issue of repeated responses!
image
image
image

Best regards,

Shuyue
Dec. 15th, 2024


6 participants