
[RLlib] - Add example for PyTorch lr schedulers. #47454

Merged

Conversation


@simonsays1980 simonsays1980 commented Sep 2, 2024

Why are these changes needed?

This PR adds an example to the rllib/examples/learners/ that shows how to use PyTorch's learning rate schedulers to assemble a complex learning rate schedule for RL training.
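For orientation, here is a rough sketch of the kind of configuration the new example demonstrates, based on the snippet discussed in the review below. The concrete scheduler classes, their arguments, the use of functools.partial, and the PPO/CartPole setup are illustrative assumptions, not the exact contents of the example script.

```python
# Rough sketch (not the exact example script) of configuring PyTorch LR
# schedulers through the experimental setting added in this PR. The scheduler
# classes, their arguments, and the PPO/CartPole setup are assumptions.
from functools import partial

from torch.optim.lr_scheduler import ConstantLR, ExponentialLR

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")
    .training(lr=0.0003)
    .experimental(
        # Schedulers are applied in sequence to the Learner's optimizer.
        _torch_lr_scheduler_classes=[
            # Multiplies the learning rate by 0.1 for the first 10 steps.
            partial(ConstantLR, factor=0.1, total_iters=10),
            # Decays the (already scaled) learning rate by 0.3 per step.
            partial(ExponentialLR, gamma=0.3),
        ],
    )
)
```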

Related issue number

#47453

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

…hms. Either a list of schedulers applied sequentially, or a dictionary mapping module IDs to their respective lists of schedulers.

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
…dulers with RLlib.

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
@simonsays1980 simonsays1980 marked this pull request as ready for review September 2, 2024 16:11
@simonsays1980 simonsays1980 added rllib RLlib related issues rllib-torch rllib-docs-or-examples Issues related to RLlib documentation or rllib/examples labels Sep 2, 2024
.experimental(
    # Add two learning rate schedulers to be applied in sequence.
    _torch_lr_scheduler_classes=[
        # Multiplies the learning rate by a factor of 0.1 for 10 iterations.
Contributor commented:

As a user of this script (who is too lazy to read through all the torch docs :D) I have a few questions that we should briefly answer here:

  • What is the actual resulting total schedule here if the user configured config.training(lr=L)? I'm assuming: for lr_const_iters iterations, use L * 0.1; after that, jump back up to L, then decay L by lr_exp_decay each iter? So L *= 0.3 per iter?
  • What is an iter here? It's not necessarily the same as RLlib algorithm iters, but actually refers to Learner.update_from... calls, correct?

simonsays1980 (Collaborator, Author) replied:

Great questions! I'll try to answer them in the following:

  • Any list of learning rate schedulers is chained, i.e., we apply the first and then the second in each iteration. Your assumption is (almost) correct: the first scheduler multiplies the base learning rate L by 0.1 until we have stepped 10 times (after that it leaves the rate at L), giving L * 0.1. The second scheduler then takes this rate (L * 0.1) and decays it by a factor of 0.3, i.e., L = (L * 0.1) * 0.3. Because of this assignment, the effective learning rate after the first iteration is (L * 0.1) * 0.3.
  • In the second iteration we therefore have (L * 0.1) * 0.3^2, in the third (L * 0.1) * 0.3^3, and so on.
  • In the 10th iteration, however, the ConstantLR factor expires and the learning rate at that point is multiplied by the inverse factor, i.e., (L * 0.1) * 0.3^10 * 1/0.1.

Yes, it is complex, but this is what we want to offer users. How torch.optim.lr_scheduler instances work together is a PyTorch matter that users have to figure out themselves; we just apply them.
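To make the chaining behavior concrete, here is a small standalone PyTorch sketch (plain torch.optim.lr_scheduler, not the RLlib Learner path) that reproduces the schedule described above; the base learning rate of 1.0 and the dummy parameter are placeholders.

```python
# Standalone PyTorch sketch (independent of RLlib) of how chaining
# ConstantLR and ExponentialLR yields the schedule described above.
# The base lr, the factors, and the dummy parameter are illustrative.
import torch
from torch.optim.lr_scheduler import ChainedScheduler, ConstantLR, ExponentialLR

params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.SGD(params, lr=1.0)  # base learning rate L = 1.0

scheduler = ChainedScheduler(
    [
        # Multiplies the learning rate by 0.1 for the first 10 steps.
        ConstantLR(optimizer, factor=0.1, total_iters=10),
        # Decays the learning rate by a factor of 0.3 every step.
        ExponentialLR(optimizer, gamma=0.3),
    ]
)

for step in range(12):
    optimizer.step()
    scheduler.step()
    print(step + 1, optimizer.param_groups[0]["lr"])

# Early on the rate follows L * 0.1 * 0.3^k; once the ConstantLR window of
# 10 steps ends, the 0.1 factor is removed again (a jump by 1/0.1) and the
# exponential decay by 0.3 per step continues from there.
```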

@sven1977 sven1977 (Contributor) left a comment

Just some questions, nits, and comment requests.

Awesome PR @simonsays1980 . Thanks for the example, this helped a lot visualizing how this would look in action.

simonsays1980 and others added 4 commits September 3, 2024 14:36
Co-authored-by: Sven Mika <sven@anyscale.io>
Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Co-authored-by: Sven Mika <sven@anyscale.io>
Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
@sven1977 sven1977 enabled auto-merge (squash) September 4, 2024 09:14
@github-actions github-actions bot added the go add ONLY when ready to merge, run all tests label Sep 4, 2024
@sven1977 sven1977 merged commit 158a75f into ray-project:master Sep 4, 2024
7 checks passed
ujjawal-khare pushed commits to ujjawal-khare-27/ray that referenced this pull request Oct 15, 2024
Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>