
[RLlib] Cleanup examples folder (vol 24): Mixed-precision training (and float16 inference) through new example script. #47116

Merged: 24 commits merged into ray-project:master on Aug 29, 2024

Conversation

@sven1977 (Contributor) commented on Aug 13, 2024:

This PR adds a new example script demonstrating:

  • how to write a custom RLlib callback that converts the RLModules to float16 precision on the EnvRunners only (not on the Learners).
  • how to write a custom env-to-module ConnectorV2 piece that adds float16 observations to the action-computing forward batch on the EnvRunners, but does NOT permanently write these changes into the episodes, so that on the Learner side the original float32 observations are used (for the mixed-precision forward_train and loss computations).
  • how to plug in torch's built-in GradScaler class, used by the TorchLearner to scale losses and unscale gradients for more stability when training with mixed precision.
  • how to write a custom TorchLearner that runs the update step (overriding `_update()`) within a `torch.amp.autocast()` context.
  • how to plug all of the above custom components into an AlgorithmConfig instance and start training with mixed precision while performing inference on the EnvRunners in float16 precision (see the condensed sketch after this list).
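The condensed sketch below shows how these pieces could fit together. It is not the merged example script: the callback hook (`on_algorithm_init`), the EnvRunner traversal (`env_runner_group.foreach_worker`), the ConnectorV2 helpers (`single_agent_episode_iterator`, `add_batch_item`), and the `_torch_grad_scaler_class` experimental setting are assumptions based on RLlib's new API stack around this release and may differ in detail from the actual code in rllib/examples/gpus/float16_training_and_inference.py.

import numpy as np
import torch

from ray.rllib.algorithms.callbacks import DefaultCallbacks
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.algorithms.ppo.torch.ppo_torch_learner import PPOTorchLearner
from ray.rllib.connectors.connector_v2 import ConnectorV2


class Float16InferenceCallback(DefaultCallbacks):
    """Casts the RLModules to float16 on the EnvRunners only (not the Learners)."""

    def on_algorithm_init(self, *, algorithm, **kwargs):
        # Each EnvRunner holds its RLModule as a torch.nn.Module, so .half()
        # converts all of its weights to float16 in place.
        algorithm.env_runner_group.foreach_worker(
            lambda env_runner: env_runner.module.half()
        )


class Float16Connector(ConnectorV2):
    """Env-to-module connector feeding float16 observations to the RLModule.

    Only the forward batch is cast; the episodes keep their original float32
    observations, which the Learner later uses for mixed-precision training.
    """

    def __call__(self, *, rl_module, data, episodes, **kwargs):
        for sa_episode in self.single_agent_episode_iterator(episodes):
            obs = sa_episode.get_observations(-1)
            self.add_batch_item(data, "obs", obs.astype(np.float16), sa_episode)
        return data


class PPOTorchMixedPrecisionLearner(PPOTorchLearner):
    """Runs the whole update step inside a torch autocast (mixed-precision) context."""

    def _update(self, *args, **kwargs):
        with torch.cuda.amp.autocast():
            return super()._update(*args, **kwargs)


config = (
    PPOConfig()
    .api_stack(
        enable_rl_module_and_learner=True,
        enable_env_runner_and_connector_v2=True,
    )
    .environment("CartPole-v1")
    .callbacks(Float16InferenceCallback)
    .env_runners(env_to_module_connector=lambda env: Float16Connector())
    .training(learner_class=PPOTorchMixedPrecisionLearner)
    # torch's built-in GradScaler scales losses and unscales gradients.
    .experimental(_torch_grad_scaler_class=torch.cuda.amp.GradScaler)
)

In this setup the EnvRunners compute actions entirely in float16, while the Learner keeps float32 weights and relies on autocast plus the GradScaler for numerically stable mixed-precision updates.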

Why are these changes needed?

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@can-anyscale (Collaborator) left a comment:

stamp the doc changes

On Aug 28, 2024, @sven1977 changed the title from "[RLlib] Add experimental setting for mixed-precision learning (new API stack only)." to "[RLlib] Cleanup examples folder (vol 24): Mixed-precision training (and float16 inference) through new example script."
@sven1977 enabled auto-merge (squash) on August 29, 2024, 06:17.
The github-actions bot added the "go (add ONLY when ready to merge, run all tests)" label on Aug 29, 2024.
@simonsays1980 (Collaborator) left a comment:

LGTM. Again, such a great example!

On the following excerpt from the example's docstring:

    only `compute_loss_for_module()` should be overridden instead. If the algorithm
    uses independent multi-agent learning (default behavior for RLlib's multi-agent
    setups), also only `compute_loss_for_module()` should be overridden, but it will
    be called for each individual RLModule inside the MultiRLModule.
A collaborator commented:

Maybe add a point for when to specifically override `compute_losses` instead of `compute_loss_for_module`.
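To illustrate the distinction the reviewer asks about: `compute_losses()` computes the losses for all modules at once, so it is the natural hook to override when a single loss term couples several RLModules; for everything else, including RLlib's default independent multi-agent learning, the per-module hook suffices. A minimal hypothetical sketch (the class name and penalty placement are invented; the signature follows the new API stack):

from ray.rllib.algorithms.ppo.torch.ppo_torch_learner import PPOTorchLearner


class PenalizedPPOLearner(PPOTorchLearner):
    def compute_loss_for_module(self, *, module_id, config, batch, fwd_out):
        # Under independent multi-agent learning, RLlib calls this hook once
        # per RLModule inside the MultiRLModule; return that module's total loss.
        loss = super().compute_loss_for_module(
            module_id=module_id, config=config, batch=batch, fwd_out=fwd_out
        )
        # A (hypothetical) extra penalty term would be added to `loss` here.
        return loss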

On the following excerpt from the example's docstring:

    TorchLearner to scale losses and unscale gradients in order to gain more stability
    when training with mixed precision.
    - shows how to write a custom TorchLearner to run the update step (overrides
    `_update()`) within a `torch.amp.autocast()` context. This makes sure that .
A collaborator commented:
Sentence has no end: "This makes sure that ...?"


On this code from the example script (the excerpt was cut off; completed here so it runs):

class PPOTorchMixedPrecisionLearner(PPOTorchLearner):
    def _update(self, *args, **kwargs):
        # Run the whole update step under autocast so forward and loss
        # computations use mixed precision.
        with torch.cuda.amp.autocast():
            return super()._update(*args, **kwargs)
A collaborator commented:
Nice! I learned something new! Thanks!
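A side note on the snippet above: newer PyTorch releases deprecate torch.cuda.amp.autocast() in favor of torch.amp.autocast("cuda"); both enter the same mixed-precision autocast context on CUDA devices.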

@sven1977 merged commit 751dbb1 into ray-project:master on Aug 29, 2024, with 6 of 7 checks passed.
@sven1977 deleted the float16_precision branch on August 30, 2024, 11:10.
ujjawal-khare pushed commits referencing this pull request to ujjawal-khare-27/ray on Oct 12 and Oct 15, 2024:

…nd float16 inference) through new example script. (ray-project#47116)

Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
Labels: go (add ONLY when ready to merge, run all tests)
Projects: none yet
3 participants