
Add Special Optimizer for LoRA training #1803

Closed
fangzhaozhang opened this issue May 27, 2024 · 3 comments


@fangzhaozhang

Feature request

Our research group has studied speeding up LoRA training with new optimizer features, and we would like to contribute by integrating our proposed method into the current Hugging Face libraries. The key idea is a special gradient preconditioner, which extensive experiments have shown to significantly improve LoRA training quality. Our paper is available at https://arxiv.org/pdf/2402.02347 and was recently accepted to ICML 2024.

We haven't contributed to the HF libraries before and are not sure whether it's possible to integrate our method into the existing code base. If so, what steps should we follow?

Motivation

Our feature request is not related to a problem. We empirically verified that a new optimizer variant helps LoRA training significantly, and we feel that integrating it into the existing HF trainer setup would benefit users who want to try it out.

Your contribution

We are happy to implement all the code and merge it into the existing HF code base, though we are not familiar with the steps required.

@BenjaminBossan
Member

Thanks for suggesting that we add this new optimization method to PEFT. From a quick glance at the paper, I think PEFT would be a good place for it.

If I understand this right, the new method amounts to a new PyTorch optimizer class that works especially well with LoRA. At the moment, we don't have optimizers in PEFT (or training code, for that matter), but there is a PR to add LoRA+ (#1509), so you could check that PR out for inspiration (e.g. for the module structure).
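
For illustration only, here is a bare-bones sketch of what such a `torch.optim.Optimizer` subclass could look like. The class name, the placeholder scaling rule, and the usage snippet are all hypothetical; they are not the preconditioner from the paper and not part of PEFT's API.

```python
import torch
from torch.optim import Optimizer


class PreconditionedLoRAOptimizer(Optimizer):
    """Illustrative SGD-style optimizer that rescales LoRA gradients.

    The scaling below is only a placeholder; a real implementation would
    substitute the preconditioner proposed in the paper.
    """

    def __init__(self, params, lr=1e-3, eps=1e-8):
        defaults = dict(lr=lr, eps=eps)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = closure() if closure is not None else None
        for group in self.param_groups:
            lr, eps = group["lr"], group["eps"]
            for p in group["params"]:
                if p.grad is None:
                    continue
                grad = p.grad
                # Placeholder "preconditioning": scale by the inverse gradient norm.
                scale = 1.0 / (grad.norm().item() + eps)
                p.add_(grad, alpha=-lr * scale)
        return loss


# Hypothetical usage: collect the trainable LoRA parameters of a PEFT model
# (their names contain "lora_") and pass the optimizer to the HF Trainer.
# lora_params = [p for n, p in model.named_parameters()
#                if "lora_" in n and p.requires_grad]
# optimizer = PreconditionedLoRAOptimizer(lora_params, lr=2e-4)
# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=train_ds, optimizers=(optimizer, None))
```

Since PEFT itself doesn't ship training code, an optimizer like this would typically be handed to `transformers.Trainer` via its `optimizers=(optimizer, None)` argument or used in a custom training loop.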

If you already have a repo with code, feel free to share it. Otherwise, just open a PR. You can create a draft PR and share your work even before it's finished to get some early feedback.

@fangzhaozhang
Author

Thanks for the reply. We have opened a draft PR (#1807) and pushed some initial code for our method plus a simple test, following the LoRA+ implementation (#1509).

Any feedback would be appreciated.


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
