Initial PEFT support #612
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
tests/test_peft.py (Outdated)

    orig_peft_model.save_pretrained(orig_model_path.as_posix())

    # PEFT model saved using `NeuronPeftModel`.
    seed_patcher = create_static_seed_patcher(LlamaForCausalLM, 42)
Do you really need to recreate this?
Yes, otherwise it fails (I tried that). Maybe we can make it "reusable", because recreating it is annoying, but I did not want to spend too much time on that.
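For context, the static seed patcher exists so that the model saved through the stock PEFT path and the one saved through `NeuronPeftModel` start from identical weights and can be compared parameter by parameter. The sketch below only illustrates that idea, assuming the general mechanism of re-seeding parameter initialization; the real `create_static_seed_patcher` in this PR may be implemented differently, and `static_seed_patcher` here is a hypothetical name.

```python
# Illustrative sketch only, not the helper used in this PR: re-seed torch before
# every call to the model class's `_init_weights` hook so that two models created
# under this context end up with identical random parameters.
from contextlib import contextmanager
from unittest.mock import patch

import torch


@contextmanager
def static_seed_patcher(model_class, seed):
    original_init_weights = model_class._init_weights  # transformers PreTrainedModel hook

    def seeded_init_weights(self, module):
        torch.manual_seed(seed)
        return original_init_weights(self, module)

    with patch.object(model_class, "_init_weights", seeded_init_weights):
        yield

# Usage (hypothetical): both models get the same initial weights.
# with static_seed_patcher(LlamaForCausalLM, 42):
#     model_a = LlamaForCausalLM(config)
# with static_seed_patcher(LlamaForCausalLM, 42):
#     model_b = LlamaForCausalLM(config)
```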
    class NeuronPeftModel(PeftModel):
        @requires_neuronx_distributed
        def save_pretrained(
So this is the only method that needs to be implemented differently for Neuron models, right?
Is it because everything else is already handled by the generic code in accelerator.prepare_model?
Yes. Basically, for now it is only about saving things in the most efficient way; accelerator.prepare_model is indeed more general. We might need to override other methods in the future, but for now this is all that is needed.
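To make the point concrete, here is a minimal sketch of the pattern being described: a `PeftModel` subclass whose only Neuron-specific piece is saving, with device placement and DDP wrapping left to `accelerator.prepare_model`. This is not the implementation in the PR; `consolidate_adapter_state_dict_on_cpu` is a hypothetical placeholder for whatever the real code does to bring parameters back from the XLA devices.

```python
# Sketch only, assuming the split of responsibilities described above.
from peft import PeftModel


class NeuronPeftModel(PeftModel):
    def save_pretrained(self, save_directory, **kwargs):
        # On Neuron, parameters live on XLA devices (and may be sharded), so they
        # are consolidated on CPU before serialization to keep saving efficient.
        state_dict = kwargs.pop("state_dict", None)
        if state_dict is None:
            state_dict = consolidate_adapter_state_dict_on_cpu(self)  # hypothetical helper
        # PeftModel.save_pretrained accepts a precomputed state_dict via kwargs.
        return super().save_pretrained(save_directory, state_dict=state_dict, **kwargs)
```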
Thank you for the pull request. It looks good to me; however, unless I am mistaken, there is no test verifying that training works, only tests verifying that the model can be PEFT-ed, prepared, and saved correctly (but maybe I missed something).
I can add that now!
LGTM! Just as David mentioned, a training test would be necessary, and if possible maybe move all PEFT tests under a tests/peft folder? Then I could put the SD PEFT-related stuff under it as well.
I moved the tests to tests/peft. I wrote a "basic" test that runs training. It does not check the loss or anything, because it is not trivial to overfit a random model by fine-tuning only the LoRA adapters. For now this test at least guarantees that the code runs.
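For reference, such a smoke test generally has the following shape. This CPU-only sketch is not the test added in the PR (which goes through the Neuron training stack); it just illustrates what "the code runs at least" means: a LoRA-wrapped model goes through a few forward/backward/optimizer steps with no assertion on the loss.

```python
# Minimal "training runs" smoke test for a LoRA-wrapped model (illustrative sketch).
import torch
from peft import LoraConfig, get_peft_model
from transformers import LlamaConfig, LlamaForCausalLM


def test_lora_training_runs():
    config = LlamaConfig(
        vocab_size=128, hidden_size=32, intermediate_size=64,
        num_hidden_layers=2, num_attention_heads=2,
    )
    model = get_peft_model(
        LlamaForCausalLM(config),
        LoraConfig(r=8, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"),
    )
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

    input_ids = torch.randint(0, config.vocab_size, (2, 16))
    for _ in range(3):
        loss = model(input_ids=input_ids, labels=input_ids).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    # No check on the loss value: with a random model and only the LoRA adapters
    # trainable, overfitting is not trivial, so this only verifies that training runs.
```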
Merging; the issue on TGI is going to be handled in #620.
What does this PR do?
Adds support for PEFT, tested with LoRA. For now, support is limited to DDP; support for TP will come in another PR.
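To illustrate the scope, here is a hedged sketch of the intended usage: a model is wrapped with a `peft` LoRA config and trained as usual, with DDP as the only supported distribution mode for now. `NeuronTrainer`/`NeuronTrainingArguments` are used here as the optimum-neuron drop-in replacements for the transformers Trainer classes; the model name and toy dataset are placeholders.

```python
# Sketch of LoRA fine-tuning with this PEFT support (DDP only); names are illustrative.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

from optimum.neuron import NeuronTrainer, NeuronTrainingArguments

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model = get_peft_model(
    model,
    LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"),
)

# Toy dataset standing in for a real tokenized dataset.
train_dataset = [{"input_ids": [1, 2, 3, 4], "labels": [1, 2, 3, 4]}] * 8

args = NeuronTrainingArguments(output_dir="llama-lora", per_device_train_batch_size=1)
trainer = NeuronTrainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
trainer.save_model("llama-lora")  # with a PEFT model, this typically writes only the adapter weights
```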