[RLlib] Add dist_inputs to action sampler fn returns in TorchPolicyV2 #33795

ArturNiederfahrenhorst · 2023-03-28T06:27:46Z

Why are these changes needed?

This PR adds the action distribution inputs to the values returned by self.action_sampler_fn in TorchPolicyV2.
Before, there were not enough values to unpack when action_sampler_fn was implemented correctly.
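
To illustrate the mismatch, here is a minimal sketch in plain Python; it is not RLlib code. `toy_action_sampler_fn` and `compute_action` are hypothetical stand-ins, and the return order (action, log-probability, dist_inputs, state) is an assumption made for illustration. The point is only that the caller has to unpack as many values as the sampler returns, with the distribution inputs among them.

```python
import math
import random
from typing import Dict, List, Tuple


def toy_action_sampler_fn(
    obs: List[float], state: List[float]
) -> Tuple[int, float, List[float], List[float]]:
    """Hypothetical stand-in for a custom action sampler.

    Returns four values. The order is an assumption for this sketch:
    (action, logp, dist_inputs, state_out).
    """
    # Toy "distribution inputs": two logits derived from the observation.
    logits = [sum(obs), -sum(obs)]

    # Numerically stable softmax over the logits.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Sample an action index and compute its log-probability.
    action = random.choices(range(len(probs)), weights=probs)[0]
    logp = math.log(probs[action])

    # The recurrent state is passed through unchanged in this toy example.
    return action, logp, logits, state


def compute_action(obs: List[float], state: List[float]) -> Dict[str, object]:
    """Hypothetical caller that unpacks all four return values."""
    action, logp, dist_inputs, state_out = toy_action_sampler_fn(obs, state)
    return {
        "action": action,
        "action_logp": logp,
        "action_dist_inputs": dist_inputs,
        "state_out": state_out,
    }
```

If the caller and the sampler disagree on the number of values, Python raises a `ValueError` at the unpacking line, which is the class of failure described above.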

I also explored unifying all action_sampler_fn occurrences in RLlib to four return values, but I don't think that makes sense at this point. Most Policy classes will soon be deprecated anyway, and we would be hard-deprecating something (action_sampler_fn with three returns) that simply works for the time being.

Fixes #33716

Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>

Add dist_inputs to action sampler fn returns

a78ce45

Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>

ArturNiederfahrenhorst assigned sven1977 Mar 28, 2023

ArturNiederfahrenhorst requested review from sven1977, gjoliver, avnishn, smorad, maxpumperla, kouroshHakha and krfricke as code owners March 28, 2023 06:27

Fix dreamer action sampler fn

9825a1a

Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>

avnishn approved these changes Mar 29, 2023

gjoliver approved these changes Mar 30, 2023

gjoliver merged commit bf02571 into ray-project:master Mar 30, 2023

can-anyscale pushed a commit that referenced this pull request Mar 30, 2023

[RLlib] Add dist_inputs to action sampler fn returns in TorchPolicyV2 (…

731b988

…#33795) Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>

can-anyscale pushed a commit that referenced this pull request Mar 30, 2023

[RLlib] Add dist_inputs to action sampler fn returns in TorchPolicyV2 (…

4a39387

…#33795) Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>
