I tried to fine-tune LLaMA with LoRA on my dataset. At first it learned well and the loss decreased, but after a few hundred steps the loss started increasing and eventually became NaN.

Replies: 1 comment

- This would usually point to either a noisy dataset or too high a learning rate. Make sure the various settings, such as warmup and batch size, are consistent with your dataset size.
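As a rough illustration of the kind of conservative settings the reply suggests, here is a minimal sketch using the Hugging Face transformers and peft libraries. The model name, hyperparameter values, and target modules are illustrative assumptions, not the poster's actual configuration:

```python
# Minimal sketch: LoRA fine-tuning with conservative hyperparameters.
# Model id, LoRA settings, and training values below are placeholders.
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")  # placeholder model id

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # attention projections only (assumed choice)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="lora-out",
    learning_rate=1e-4,             # LoRA tolerates higher LRs than full fine-tuning,
                                    # but too high a value often diverges after some steps
    warmup_steps=100,               # scale with dataset size; too little warmup can spike the loss
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # effective batch of 16; tiny batches add gradient noise
    max_grad_norm=1.0,              # gradient clipping helps keep the loss from blowing up to NaN
    num_train_epochs=3,
    logging_steps=10,
)
```

If the loss still climbs and turns NaN with settings like these, lowering the learning rate further and watching the gradient norm in the logs are reasonable next steps, in line with the reply's suggestion that the schedule should match the dataset size.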