[docs] APIs #1075
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hey @Titus-von-Koeller, can you check the docstring descriptions for AdaGrad and the base optimizer classes? Since most of the docstring descriptions are the same, if these look good to you, I can easily copy most of them over to the remaining optimizers.
Hey @stevhliu, I'll take a look now. Please be sure to run
Everything looks really good, thanks! I think the docstring for the optimizers and base class is good the way you did it.
@@ -1,17 +1,18 @@
 # AdaGrad

-[AdaGrad (Adaptive Gradient)](https://jmlr.org/papers/v12/duchi11a.html) is an optimizer that adaptively adjusts the learning rate for each parameter based on its historical gradients.
+[AdaGrad (Adaptive Gradient)](https://jmlr.org/papers/v12/duchi11a.html) is an adaptive learning rate optimizer. AdaGrad stores a sum of the squared past gradients for each parameter and uses it to scale that parameter's learning rate. This allows the learning rate to be automatically lower or higher depending on the magnitude of the gradient, eliminating the need to manually tune the learning rate.
nice summary, much cleaner now!
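For readers of the rendered docs, a minimal sketch of the update rule the new description refers to may help. This is plain NumPy with illustrative names (`adagrad_step`, `params`, `grads`, `state`), not the bitsandbytes API:

```python
import numpy as np

def adagrad_step(params, grads, state, lr=0.01, eps=1e-10):
    """One AdaGrad update, applied in place.

    `state` accumulates the sum of squared gradients per parameter,
    so parameters with a large gradient history take smaller steps.
    """
    for name, grad in grads.items():
        # Accumulate the squared gradient history for this parameter.
        state[name] = state.get(name, np.zeros_like(grad)) + grad**2
        # Scale the step by the inverse square root of that history.
        params[name] -= lr * grad / (np.sqrt(state[name]) + eps)
    return params, state
```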
Merged commit ac5d6ee into bitsandbytes-foundation:main
This PR expands the API docs to showcase high-level classes and core lower-level building blocks, starting with the optimizers.