[RLlib] Add dist_inputs to action sampler fn returns in TorchPolicyV2 #33795

ArturNiederfahrenhorst
Contributor

@ArturNiederfahrenhorst ArturNiederfahrenhorst commented Mar 28, 2023

Why are these changes needed?

This PR adds the action distribution inputs to the values returned by self.action_sampler_fn in TorchPolicyV2.
Before, there were not enough values to unpack when action_sampler_fn was implemented correctly.
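
For illustration, here is a minimal sketch of the unpacking contract described above. It is plain Python and does not import RLlib. The function names and the toy return values are hypothetical, and the order (actions, logp, dist_inputs, state_out) is an assumption based on this PR's description rather than a copy of the TorchPolicyV2 code. It only shows that a caller unpacking four values needs a sampler that returns four, and that any mismatch surfaces as a ValueError.

```python
from typing import Any, Callable, List, Tuple


def four_value_sampler(obs_batch: List[float]) -> Tuple[Any, Any, Any, Any]:
    """Hypothetical sampler returning (actions, logp, dist_inputs, state_out)."""
    actions = [0 for _ in obs_batch]               # one dummy action per observation
    logp = [0.0 for _ in obs_batch]                # dummy log-probabilities
    dist_inputs = [[0.0, 0.0] for _ in obs_batch]  # dummy distribution inputs
    state_out: List[Any] = []                      # no recurrent state in this toy case
    return actions, logp, dist_inputs, state_out


def three_value_sampler(obs_batch: List[float]) -> Tuple[Any, Any, Any]:
    """Hypothetical sampler that omits dist_inputs: (actions, logp, state_out)."""
    actions = [0 for _ in obs_batch]
    logp = [0.0 for _ in obs_batch]
    return actions, logp, []


def unpack_four(sampler: Callable[[List[float]], tuple], obs_batch: List[float]):
    # A caller that expects exactly four return values.
    actions, logp, dist_inputs, state_out = sampler(obs_batch)
    return actions, logp, dist_inputs, state_out


def try_unpack(sampler: Callable[[List[float]], tuple], obs_batch: List[float]) -> str:
    """Return "ok" if the return count matches, else the ValueError text."""
    try:
        unpack_four(sampler, obs_batch)
        return "ok"
    except ValueError as err:
        return f"ValueError: {err}"


if __name__ == "__main__":
    batch = [0.1, 0.2, 0.3]
    print(try_unpack(four_value_sampler, batch))
    print(try_unpack(three_value_sampler, batch))
```

The same kind of mismatch, whichever side has the extra value, is what makes the number of unpacked variables and the number of returned values need to agree.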

I also explored unifying all action_sampler_fn occurrences in RLlib to four return values, but I don't think that makes sense at this point. Most Policy classes will soon be deprecated anyway, and we would be hard-deprecating something (action_sampler_fn with three returns) that simply works for the time being.

Fixes #33716

Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>
@gjoliver gjoliver merged commit bf02571 into ray-project:master Mar 30, 2023
can-anyscale pushed a commit that referenced this pull request Mar 30, 2023
…#33795)

Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>
elliottower pushed a commit to elliottower/ray that referenced this pull request Apr 22, 2023
…ray-project#33795)

Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>
Signed-off-by: elliottower <elliot@elliottower.com>
ProjectsByJackHe pushed a commit to ProjectsByJackHe/ray that referenced this pull request May 4, 2023
…ray-project#33795)

Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>
Signed-off-by: Jack He <jackhe2345@gmail.com>
Development

Successfully merging this pull request may close these issues.

[RLlib] TorchPolicyV2 number of variables should match the number of the return from action_sampler_fn
4 participants