Multiagent CPP API #584

jjshoots · 2024-12-14T09:24:40Z

This updates the CPP interface to support 2-player games. Support for 4 players is left to a future PR because it is considerably more complicated.

AFAICT, these are the 2 player games available:

Air Raid
Combat
Double Dunk
Human Cannonball
Ice Hockey
Joust
Maze Craze
Surround
Tennis
Video Checkers
Video Chess

And these are games with 4 players:

Warlords
Flag Capture

I believe we should be able to just copy the games that support multiplayer from the MALE repo since they already have the modifications required. I have modified Surround to support 2 player mode for the testing script below.

The more difficult question is how the Python interface for this should look. The gymnasium API is single-agent (one action in, one reward out per step), so I presume it is not sufficient.
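To make the question concrete, here is a rough sketch of one possible shape: a PettingZoo-style parallel interface where `reset` and `step` exchange per-agent dicts. Everything in it is hypothetical. `TwoPlayerEnvSketch` is a stand-in that does not touch the emulator, and the agent names only mirror the `first_0` / `second_0` convention used by the PettingZoo Atari environments. It is not a proposal for what this PR should ship.

```python
import random
from typing import Dict, Optional, Tuple

# Agent names follow the PettingZoo Atari convention (an assumption here).
AGENTS = ("first_0", "second_0")


class TwoPlayerEnvSketch:
    """Hypothetical stand-in for a two-player env; not part of ale_py."""

    def __init__(self, num_actions: int = 5, max_steps: int = 300) -> None:
        self.num_actions = num_actions
        self.max_steps = max_steps
        self._t = 0
        self._rng = random.Random()

    def reset(self, seed: Optional[int] = None) -> Tuple[Dict, Dict]:
        # One observation and one info dict per agent.
        self._rng = random.Random(seed)
        self._t = 0
        observations = {agent: 0 for agent in AGENTS}
        infos = {agent: {} for agent in AGENTS}
        return observations, infos

    def sample_actions(self) -> Dict[str, int]:
        # Both players draw from the same shared action set.
        return {agent: self._rng.randrange(self.num_actions) for agent in AGENTS}

    def step(self, actions: Dict[str, int]):
        missing = [agent for agent in AGENTS if agent not in actions]
        if missing:
            raise KeyError(f"missing actions for: {missing}")
        self._t += 1
        # A real env would forward both actions to the emulator here.
        observations = {agent: self._t for agent in AGENTS}
        rewards = {agent: 0.0 for agent in AGENTS}
        terminations = {agent: False for agent in AGENTS}
        truncations = {agent: self._t >= self.max_steps for agent in AGENTS}
        infos = {agent: {} for agent in AGENTS}
        return observations, rewards, terminations, truncations, infos
```

The loop from the testing script would then pass `env.sample_actions()` to `env.step(...)` and stop once every entry in `terminations` or `truncations` is true.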

Testing

```python
import gymnasium as gym
import ale_py

gym.register_envs(ale_py)

# Initialise the environment
# mode here controls multiplayer. I believe mode 2 is single player
env = gym.make("ALE/Surround-v5", render_mode="human", mode=4)

# Reset the environment to generate the first observation
observation, info = env.reset(seed=42)
for _ in range(300):
    # this is where you would insert your policy
    action = env.action_space.sample()

    # step (transition) through the environment with the action
    # receiving the next observation, reward and if the episode has terminated or truncated
    observation, reward, terminated, truncated, info = env.step(action)

    # If the episode has ended then we can reset to start a new episode
    if terminated or truncated:
        observation, info = env.reset()

env.close()
```

Expected Behaviour:
Because the gymnasium API sends NOOP for player B by default, with mode=4 player B never acts.
Conversely, with mode=2 player B is controlled by the emulator and therefore moves in random directions.

pseudo-rnd-thoughts

Is it necessary to change the actions? I don't see where that is needed.
Could you add tests that PettingZoo can work with this?

jjshoots · 2024-12-15T02:34:21Z

@pseudo-rnd-thoughts Yeah we do, both players use different action indices and the game will throw an error here. Although now that you mention it, maybe that's not needed after all and we can just do a remapping within the C stack. I'll work on that.
Updated to use just one set of actions.

Roger on the PZ tests.
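For reference, the remapping idea mentioned above can be sketched in a few lines. This is written in Python for illustration only; the PR does the remapping in the C++ code, which is not shown here. The sketch assumes the ALE action enum layout where player A's actions occupy 0 to 17 and player B's occupy 18 to 35. `PLAYER_B_OFFSET` and `to_player_b` are hypothetical names.

```python
# Assumed layout of the ALE action enum: player A uses 0..17, player B uses 18..35.
NUM_PLAYER_ACTIONS = 18
PLAYER_B_OFFSET = 18  # hypothetical constant name


def to_player_b(action: int) -> int:
    """Map an index from the shared (player A) action set to player B's index."""
    if not 0 <= action < NUM_PLAYER_ACTIONS:
        raise ValueError(f"action {action} is outside the shared action set")
    return action + PLAYER_B_OFFSET


def joint_action(action_a: int, action_b: int) -> tuple:
    """Both callers use the same indices; only player B's is shifted internally."""
    return action_a, to_player_b(action_b)
```

With this, both players can sample from one action space and the caller never has to know about the second block of indices.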

jjshoots added 6 commits December 14, 2024 17:23

stash (385ae24)
first iteration (7b365c1)
revert surround (77e8362)
black (b6040b4)
add comment (a8490fc)
revert change (77fe8ef)

jjshoots marked this pull request as ready for review December 14, 2024 11:02

pseudo-rnd-thoughts reviewed Dec 14, 2024

jjshoots added 2 commits December 15, 2024 11:03

reduce to minimal action set (f0b83e3)
fix interface (7855d6f)