Bug with saving LoRA (adapter_model.bin) on latest peft from git #317
Comments
Update: so, after some testing: that both (A) hopefully helps narrow down where the issue may lie, and (B) gives a functional workaround until it's properly fixed.
Hello @mcmonkey4eva, I am using the latest main branch and running the following example: https://github.com/huggingface/peft/blob/main/examples/int8_training/Finetune_opt_bnb_peft.ipynb
Could you please share minimal code that we can run to reproduce the above issue?
As per tloen/alpaca-lora#293, uninstalling and reinstalling seems to have fixed this.
Please also see #286.
The above PR should have the fixes. Note that there were no issues with PEFT itself; they were related to alpaca-lora.
That's perfect, that fixed it. Thank you so much for taking the time to investigate and get it fixed for everyone!
Setup
Using `get_peft_model` with `CAUSAL_LM`, `transformers.Trainer(...)`, then `lora_model.save_pretrained(lora_file_path)` to train a LoRA on LLaMA (int8).
The Issue
When saving at the end, `adapter_model.bin` is an empty pickle (443 bytes, containing only a 6-byte `data` entry). This is, of course, wrong; there should be data in there. Prior versions of peft saved complete files with actual content in them.
Checkpoints saved midway through by `save_steps` in the transformers trainer seem to contain full valid data (but in a different format).
See oobabooga/text-generation-webui#1098 (comment) for more external discussion of the issue.
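Until the fix lands, a quick post-save sanity check can catch the symptom described above (a ~443-byte `adapter_model.bin` instead of a file holding real weights). This is a stdlib-only sketch; the function name and size threshold are my own assumptions, not part of peft:

```python
import os

def check_adapter_file(path, min_bytes=10_000):
    """Fail fast if a saved adapter looks like the empty-pickle symptom.

    A healthy LoRA adapter file is typically megabytes; the broken saves
    reported here are only a few hundred bytes. The threshold is a
    heuristic, not an exact cutoff.
    """
    size = os.path.getsize(path)
    if size < min_bytes:
        raise RuntimeError(
            f"{path} is only {size} bytes; the LoRA weights were "
            "probably not written out"
        )
    return size
```

Calling this on `adapter_model.bin` right after `lora_model.save_pretrained(...)` turns the silent failure into a loud one.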
Relevant technical details
`pip install git+https://github.com/huggingface/peft` has the error (at time of writing, that's commit b21559e), but `pip install peft==0.2.0` does not, indicating the error likely stems from a recent change.
Relevant source code replicating this issue: https://github.com/mcmonkey4eva/text-generation-webui/blob/lora-trainer-improvements-3/modules/training.py
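Since the bug appears on the git build but not on the 0.2.0 release, it helps a report to record exactly which peft build was in use. A small stdlib sketch (the helper name is mine, not a peft API):

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

def installed_build(package: str = "peft") -> Optional[str]:
    """Return the installed version string for a package, or None if absent.

    Release installs report e.g. "0.2.0"; git installs report whatever
    dev version string the checkout carries.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None
```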
Side Note
Likely a separate topic, but users have said that on `peft==0.2.0` they're seeing huge VRAM spikes when `save_pretrained` is run. I haven't seen this myself yet, but it may indicate a broader need for more thorough validation of the saving code.
EDIT: Yeah, okay, this is indeed a separate topic; another user on GitHub reported that the VRAM spike is actually caused by bitsandbytes rather than peft.
(Also, it's a bit strange that pickles are being used at all. That's a separate project entirely, but those should definitely be replaced with `safetensors` files.)