[Doc] Add how to use Lion optimizer #152

Merged (11 commits) on Feb 21, 2023

@younesbelkada younesbelkada commented Feb 16, 2023

What does this PR do?

This PR adds the Lion optimizer from https://arxiv.org/abs/2302.06675 (source code: https://github.com/google/automl/blob/master/lion/lion_pytorch.py).

For now it does not seem to improve convergence in gpt-sentiment, but I thought it would be nice to showcase that it can be used with trl out of the box.

EDIT: it seems to converge nicely after dividing the learning rate by 3, as suggested by the paper
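
For reference, here is a minimal pure-Python sketch of the Lion update rule as I understand it from the linked `lion_pytorch.py`, together with the learning-rate scaling mentioned above. The function name `lion_step`, the list-of-floats representation, and the `adamw_lr` baseline value are assumptions made for illustration; this is not the code added to trl.

```python
def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def lion_step(params, grads, momentum, lr=1e-4, beta1=0.9, beta2=0.99, weight_decay=0.0):
    """Apply one Lion update to a list of scalar parameters (illustrative only)."""
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        # The update direction is the sign of an interpolation between momentum and gradient.
        update = sign(beta1 * m + (1 - beta1) * g)
        # Decoupled weight decay, then a fixed-magnitude step of size lr.
        p = p * (1 - lr * weight_decay) - lr * update
        # The momentum is refreshed with a second, slower interpolation factor.
        m = beta2 * m + (1 - beta2) * g
        new_params.append(p)
        new_momentum.append(m)
    return new_params, new_momentum


# Hypothetical baseline: take the learning rate that worked with AdamW and divide it by 3.
adamw_lr = 3e-5
lion_lr = adamw_lr / 3

params, momentum = [1.0, -2.0], [0.0, 0.0]
grads = [0.5, -0.25]
params, momentum = lion_step(params, grads, momentum, lr=lion_lr)
```

Because every coordinate moves by exactly `lr` regardless of gradient magnitude, the effective step is larger than with AdamW at the same `lr`, which is consistent with needing a smaller learning rate here.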

cc @lvwerra @kashif

I think this is nice to have in this lib, as it noticeably reduces the memory footprint of training (I got a 10% reduction).
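
A rough back-of-the-envelope sketch of where the saving likely comes from: AdamW keeps two state buffers per parameter (first and second moments), while Lion keeps only the momentum. The model size and the fp32 state assumption below are hypothetical, not measurements from this PR.

```python
def optimizer_state_bytes(num_params, buffers_per_param, bytes_per_value=4):
    """Estimate optimizer state size, assuming one fp32 value per parameter per buffer."""
    return num_params * buffers_per_param * bytes_per_value


num_params = 124_000_000  # hypothetical model size, not taken from the PR

adamw_bytes = optimizer_state_bytes(num_params, buffers_per_param=2)  # exp_avg + exp_avg_sq
lion_bytes = optimizer_state_bytes(num_params, buffers_per_param=1)   # exp_avg only
saved_bytes = adamw_bytes - lion_bytes

print(f"AdamW state: {adamw_bytes / 1e6:.0f} MB, Lion state: {lion_bytes / 1e6:.0f} MB")
```

Weights, gradients, and activations are unchanged, so the reduction in total training memory is smaller than the 50% reduction in optimizer state.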


@younesbelkada younesbelkada changed the title Add Lion optimizer [Doc] Add how to use Lion optimizer Feb 16, 2023

@lvwerra lvwerra left a comment


Just two minor comments, otherwise looks good! ❤️

Two review comments on docs/source/customization.mdx (outdated, resolved)
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
@younesbelkada younesbelkada merged commit 9eaea2e into huggingface:main Feb 21, 2023
@younesbelkada younesbelkada deleted the lion-optimizer branch February 21, 2023 19:08