After fine-tuning, the model repeats the answer and does not stop #467
Comments
Can someone assign me this task? I want to work on it; please explain it briefly.
@abdoelsayed2016 You are using an old model that is no longer recommended, as it was converted to HF with the wrong tokenizer params. Check the repo README for valid 7B HF models.
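For context, a minimal sketch of loading one of the newer 7B HF conversions with its matching tokenizer (assuming yahma/llama-7b-hf, which a later comment in this thread reports working; substitute whichever model the README currently recommends):

```python
# Sketch: load a recommended 7B HF conversion instead of the old decapoda-research
# one, whose tokenizer config is known to be broken.
from transformers import LlamaForCausalLM, LlamaTokenizer

base_model = "yahma/llama-7b-hf"  # assumption: any currently recommended 7B HF conversion

tokenizer = LlamaTokenizer.from_pretrained(base_model)
model = LlamaForCausalLM.from_pretrained(base_model, device_map="auto")

# Sanity-check the special tokens the old conversion got wrong.
print(tokenizer.bos_token_id, tokenizer.eos_token_id, tokenizer.pad_token_id)
```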
@diegomontoya I have already read the readme many times. I still see this while fine-tuning the base model. The second question is whether it is normal that the adapter weight file is only 1 KB.
Also, I tried
I see a lot of issues talking about the size of the adapter. I tried all the suggested solutions, but the adapter is still only 1 KB. Has anyone faced the same issue? Training is costly.
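For what it's worth, a quick way to check whether the adapter file actually contains LoRA weights (a sketch; the path is a placeholder, not from this thread). A ~1 KB adapter_model.bin usually means an empty state dict was saved:

```python
# Sketch: inspect adapter_model.bin to see whether any LoRA tensors were written.
import torch

adapter_path = "lora-alpaca/adapter_model.bin"  # placeholder; point this at your output dir
state_dict = torch.load(adapter_path, map_location="cpu")

print(f"{len(state_dict)} tensors in the adapter")
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
# Zero tensors confirms the known "empty adapter" problem.
```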
I was able to solve this issue with this comment: uninstall peft and install it from the GitHub commit. I was desperate about the repetitions with both models (yahma and decapoda-research), but I think the root cause is the empty adapter_model.bin. If you want to do a quick test without training, take the last checkpoint.
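A sketch of that "quick test without training": load the base model and attach the LoRA weights from the last training checkpoint. The model name, checkpoint path, and prompt below are placeholders, not values from the linked comment:

```python
# Sketch: attach the last checkpoint's LoRA weights to the base model and generate once.
from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

base_model = "yahma/llama-7b-hf"            # placeholder base model
checkpoint_dir = "lora-alpaca/checkpoint-200"  # last checkpoint written by the trainer

tokenizer = LlamaTokenizer.from_pretrained(base_model)
model = LlamaForCausalLM.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, checkpoint_dir)
model.eval()

prompt = "### Instruction:\nWhat is the capital of France?\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```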
Yesterday I tried this solution, but the adapter_model.bin size is still 1 KB.
I tried again after uninstalling peft. Here is the output:
I also tried this solution. The adapter size is now 7 GB, and the model is back to repeating the answers. I also renamed the checkpoint file from pytorch_model.bin to adapter_model.bin; same output.
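A ~7 GB "adapter" usually means the full base-model state dict was saved rather than just the LoRA weights. The workaround commonly cited for this (and, if I recall correctly, present in the repo's finetune script, though newer peft versions may not need it) is to patch state_dict so only the PEFT weights are written. A sketch, assuming `model` is the model already wrapped by get_peft_model:

```python
# Sketch: make the Trainer/save path emit only LoRA weights, not the full model.
from peft import get_peft_model_state_dict

old_state_dict = model.state_dict
model.state_dict = (
    lambda self, *_, **__: get_peft_model_state_dict(self, old_state_dict())
).__get__(model, type(model))

# After training: this should write a small adapter_config.json + adapter_model.bin.
model.save_pretrained("lora-alpaca")
```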
I tried decapoda-research and the answer is repeated; yahma is fine.
@asenmitrev yahma with which version of peft?
The one quoted in my first answer. |
You should try adding
Issue
We also encountered this recurring-response issue while training GPTQ-quantized models with a LoRA adapter. We hit the same problem while fine-tuning the LoRA adapter with GPTQ-quantized Command R Plus (our model) and GPTQ-quantized LLaMA 3.3 70B (our model). After quantizing the model with the GPTQ algorithm, it performs quite well, with no noticeable recurring responses. However, after we fine-tuned the LoRA adapter, things got tricky. We found that the problem came from an increasing training loss and an increasing gradient norm.

Training Loss Before Using Our Solution
Please check our training loss and gradient norm figures.

Repeated Response Examples
Here are the responses from the quantized model + trained LoRA:

The Reason for This Issue and Our Solution
We believe the increasing training loss was caused by exploding gradients, so we changed our hyperparameter settings. Also, please check our model loader if you are interested. Based on these settings, we solved the issue and achieved stable training.

Training Loss After Using Our Solution

Model Inference
Based on our model's responses, we believe we have fundamentally solved the issue of repeated responses!

Best regards,
Shuyue
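The exact hyperparameters from the comment above are not reproduced in this thread; purely as an illustration, a gradient-clipping setup along those lines with transformers TrainingArguments might look like this (all values are placeholders, not the ones the commenter used):

```python
# Sketch: hyperparameters aimed at taming exploding gradients during LoRA fine-tuning.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lora-out",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=1e-4,         # a lower LR reduces the chance of divergence
    max_grad_norm=1.0,          # gradient clipping keeps the grad norm bounded
    warmup_ratio=0.03,          # warmup avoids large early updates
    lr_scheduler_type="cosine",
    num_train_epochs=3,
    logging_steps=10,
)
```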
After fine-tuning, I encountered unexpected behavior: the model repeatedly generates the same answer without stopping.
I tried different LLaMA models:
decapoda-research/llama-7b-hf
yahma/llama-7b-hf
Here are some examples.
Also, the adapter_model.bin size is 1 KB.