-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RLlib] New API stack: (Multi)RLModule overhaul vol 02 (VPG RLModule, Algo, and Learner example classes). #47885
[RLlib] New API stack: (Multi)RLModule overhaul vol 02 (VPG RLModule, Algo, and Learner example classes). #47885
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just a small nit and a deisgn question.
) | ||
|
||
# Reduce and return collected metrics. | ||
return self.metrics.reduce() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's so simple now :)
|
||
@override(TorchRLModule) | ||
def _forward_inference(self, batch, **kwargs): | ||
with torch.no_grad(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
def _common_forward(self, batch): | ||
# Features can be found in the batch under the "encoder_features" key. | ||
features = batch["encoder_features"] | ||
logits = self.policy_head(features) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self._policy_head(features)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
) | ||
|
||
@override(MultiRLModule) | ||
def _run_forward_pass(self, forward_fn_name, batch, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still a strange form - passing in the forward_fn_name
. Why is this needed? To have a single method to be overridden in inheritance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be gone in another PR that's already in (your) review :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question!
…odule_do_over_bc_default_module_02_pg_algo
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: Tina Wu <j6vupz97@gmail.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
… Algo, and Learner example classes). (ray-project#47885) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
New API stack: (Multi)RLModule overhaul vol 02 (VPG RLModule, Algo, and Learner example classes).
These new classes will replace the
core/testing/bc_...
classes and serve as:Why are these changes needed?
Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.