
[RLlib] SAC/DQN activate multi-agent learning tests and small bug fix in MultiAgentEpisode. #45542

Merged: 25 commits merged into ray-project:master on Jun 23, 2024

Conversation

sven1977 (Contributor) commented on May 24, 2024

  • SAC/DQN: activates multi-agent learning tests for CartPole (DQN) and Pendulum (SAC).
  • Small bug fix in MultiAgentEpisode.concat(): self.env_t_to_agent_t was not properly built in the resulting episode object.

Why are these changes needed?

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

simonsays1980 (Collaborator) left a comment:
LGTM. Awesome work! Off-policy now learns!

rllib/BUILD Outdated
@@ -303,6 +292,15 @@ py_test(
args = ["--as-test", "--enable-new-api-stack"]
)

py_test(

simonsays1980 (Collaborator): Yey! And we are there! :)

rllib/BUILD Outdated
@@ -452,6 +450,15 @@ py_test(
args = ["--as-test", "--enable-new-api-stack"]
)

py_test(

simonsays1980 (Collaborator): Next step: synchronized sampling.

EPISODE_RETURN_MIN,
EPISODE_RETURN_MAX,
]:
if must_have not in results[ENV_RUNNER_RESULTS]:

simonsays1980 (Collaborator): Yup, understood. But this means that if a user collects custom metrics via a callback and uses Tune on top of them ... an error could result due to those metrics not being available in some iterations. We should document this somewhere - maybe at the MetricsLogger API ...

sven1977 (Contributor, Author): Correct, that's why I put the comment there. We should actually fix Tune; there is no other solution to this problem, imo.

Tune probably didn't account for this b/c in SL, you don't have "strangely behaving" episodes that sometimes don't deliver data.
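Until such a fix lands, a user-side workaround is to look up possibly-missing metrics defensively. The sketch below is a hedged illustration, not RLlib/Tune code; the metric name "my_custom_metric" and the "env_runners" sub-dict key are assumptions for the example.

    # Hedged sketch: a user-side trial stopper that tolerates metrics which are
    # not reported in every iteration.
    def stop_fn(trial_id: str, result: dict) -> bool:
        # Custom metrics logged via a callback may be missing in iterations that
        # finish no episodes, so look them up with .get() instead of indexing.
        value = result.get("env_runners", {}).get("my_custom_metric")
        if value is None:
            return False  # Metric not reported this iteration; keep the trial running.
        return value >= 100.0

Such a callable can then be passed as the stop criterion of a Tune run (e.g., stop=stop_fn).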

@@ -873,7 +873,12 @@ def concat_episode(self, other: "MultiAgentEpisode") -> None:
)

# Concatenate the env- to agent-timestep mappings.
self.env_t_to_agent_t[agent_id].extend(other.env_t_to_agent_t[agent_id])
j = self.env_t
for i, val in enumerate(other.env_t_to_agent_t[agent_id][1:]):

simonsays1980 (Collaborator): This is one of the nasty parts of the MAE. Awesome catch!
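For intuition, here is a purely hypothetical, self-contained sketch of the idea behind the fix (not RLlib's actual implementation): when concatenating two env-step-to-agent-step mappings, the agent-step indices coming from the second episode chunk must be shifted by the number of agent steps already recorded in the first chunk, rather than appended verbatim.

    SKIP = None  # Sentinel for env steps in which this agent did not act.

    def concat_env_t_to_agent_t(first, second):
        # Count how many agent steps the first chunk already contains.
        offset = sum(1 for v in first if v is not SKIP)
        out = list(first)
        # Re-offset the second chunk's agent-step indices instead of copying them.
        for v in second:
            out.append(SKIP if v is SKIP else v + offset)
        return out

    # The agent acted at env steps 0 and 2 of the first chunk ...
    first = [0, SKIP, 1]
    # ... and at the first two env steps of the second chunk.
    second = [0, 1, SKIP]
    assert concat_env_t_to_agent_t(first, second) == [0, SKIP, 1, 2, 3, SKIP]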

.rl_module(
# Settings identical to old stack.
model_config_dict={
"fcnet_hiddens": [256],
"fcnet_activation": "relu",
"fcnet_activation": "tanh",

simonsays1980 (Collaborator): beasty

model_config_dict={
"fcnet_hiddens": [256, 256],
"fcnet_activation": "relu",
# "post_fcnet_hiddens": [],

simonsays1980 (Collaborator): This part is the old-stack equivalent.

sven1977 (Contributor, Author): got it!
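For context, the model settings shown in the snippets above are passed on the new API stack via .rl_module(). The sketch below is illustrative only (not the exact tuned example in this PR); the environment name and the surrounding config are assumptions.

    from ray.rllib.algorithms.sac import SACConfig

    # Mirror the old-stack model settings on the new API stack.
    config = (
        SACConfig()
        .environment("Pendulum-v1")
        .rl_module(
            model_config_dict={
                "fcnet_hiddens": [256, 256],
                "fcnet_activation": "relu",
            }
        )
    )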

@@ -131,7 +131,7 @@ def add(
"""
episodes: List["MultiAgentEpisode"] = force_list(episodes)

new_episode_ids: List[str] = {eps.id_ for eps in episodes}
new_episode_ids: Set[str] = {eps.id_ for eps in episodes}

simonsays1980 (Collaborator): Great catch!
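A small illustration of why the annotation fix above is correct: a brace comprehension builds a set (deduplicated), not a list, so Set[str] is the accurate type hint for new_episode_ids.

    from typing import Set

    episode_ids = ["a", "b", "b"]
    new_episode_ids: Set[str] = {eid for eid in episode_ids}
    assert new_episode_ids == {"a", "b"}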

# TODO (sven, simon): Is there always a mapping? What if not?
# Is then module_id == agent_id?
module_id = ma_episode._agent_to_module_mapping[agent_id]
module_id = ma_episode.module_for(agent_id)

simonsays1980 (Collaborator): Sweet, this method is just clean!
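Regarding the TODO above ("Is there always a mapping? What if not?"), here is a purely hypothetical sketch of a defensive agent-to-module lookup, not RLlib's implementation: if no explicit mapping exists, it falls back to using the agent ID itself as the module ID.

    def module_for(agent_to_module: dict, agent_id: str) -> str:
        # Fall back to the agent ID when the agent is not explicitly mapped.
        return agent_to_module.get(agent_id, agent_id)

    assert module_for({"agent_1": "shared_policy"}, "agent_1") == "shared_policy"
    assert module_for({}, "agent_2") == "agent_2"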

(Subsequent commits, all signed off by sven1977 <svenmika1977@gmail.com>: master was merged into the sac_dqn_multi_agent_debugging branch twice, resolving conflicts first in rllib/utils/metrics/stats.py and later in rllib/BUILD, rllib/algorithms/sac/torch/sac_torch_learner.py, rllib/tuned_examples/sac/multi_agent_pendulum_sac.py, and rllib/utils/metrics/stats.py.)
sven1977 enabled auto-merge (squash) on June 22, 2024, 09:29
github-actions bot added the "go" label (add ONLY when ready to merge, run all tests) on Jun 22, 2024
github-actions bot disabled auto-merge on June 22, 2024, 12:24
sven1977 enabled auto-merge (squash) on June 23, 2024, 10:20
sven1977 merged commit c942d60 into ray-project:master on Jun 23, 2024
7 checks passed
sven1977 deleted the sac_dqn_multi_agent_debugging branch on June 23, 2024, 13:39