[RLlib] Upgrade to gymnasium 1.0.0 (ale_py 0.10.1, mujoco 3.2.4, pettingzoo 1.24.3, supersuit 3.9.3). #45328

Merged 45 commits on Oct 28, 2024
Changes from 28 commits
Commits
9cb5160
wip
sven1977 May 14, 2024
5678569
fixes
sven1977 May 14, 2024
f750ac3
fixes
sven1977 May 14, 2024
d2a36b3
fixes
sven1977 May 14, 2024
3c90cc7
fix
sven1977 May 14, 2024
2bb745a
Merge branch 'master' of https://github.com/ray-project/ray into upgr…
sven1977 May 14, 2024
7921430
LINT
sven1977 May 14, 2024
639698a
LINT
sven1977 May 14, 2024
bdda97c
Apply suggestions from code review
sven1977 May 14, 2024
cf8f554
fixes
sven1977 May 14, 2024
a67302a
Merge remote-tracking branch 'origin/upgrade_gymnasium_to_1_0_0a1' in…
sven1977 May 14, 2024
82b1638
Merge branch 'master' of https://github.com/ray-project/ray into upgr…
sven1977 May 14, 2024
4d36a44
wip
sven1977 May 14, 2024
773dbdf
fixes
sven1977 May 14, 2024
45dcf93
Merge branch 'master' of https://github.com/ray-project/ray into upgr…
sven1977 May 31, 2024
7f6fd9d
Update to gymnasium 1.0.0a2 and new ale_py (which now supports Atari)
sven1977 May 31, 2024
8946a00
Update to gymnasium 1.0.0a2 and new ale_py (which now supports Atari)
sven1977 May 31, 2024
cb7461b
wip
sven1977 Jun 1, 2024
14cdea4
wip
sven1977 Jun 1, 2024
91958e0
wip
sven1977 Jun 4, 2024
e2cb0c8
merge
sven1977 Jun 4, 2024
ca82ec5
wip
sven1977 Jun 9, 2024
f5ab020
Merge branch 'master' of https://github.com/ray-project/ray into upgr…
sven1977 Jun 12, 2024
9245215
wip
sven1977 Jun 12, 2024
bb294a7
Merge branch 'master' of https://github.com/ray-project/ray into upgr…
sven1977 Jun 12, 2024
a399230
wip
sven1977 Jun 12, 2024
64d1f9a
wip
sven1977 Jun 12, 2024
0d09dd8
wip
sven1977 Jun 13, 2024
afe5cb4
wip
sven1977 Jun 13, 2024
506b053
Merge branch 'master' of https://github.com/ray-project/ray into upgr…
sven1977 Jun 17, 2024
9514d42
merge
sven1977 Oct 25, 2024
b8376b1
wip
sven1977 Oct 25, 2024
afdcc05
wip
sven1977 Oct 25, 2024
8a13830
wip
sven1977 Oct 25, 2024
fd3e427
wip
sven1977 Oct 25, 2024
e9f0c00
wip
sven1977 Oct 25, 2024
16302a2
wip
sven1977 Oct 25, 2024
6b78348
fix
sven1977 Oct 25, 2024
95d5f5a
fix
sven1977 Oct 25, 2024
3f2ccb1
fix
sven1977 Oct 25, 2024
a9ce50b
fixes
sven1977 Oct 25, 2024
00c4b64
wip
sven1977 Oct 26, 2024
4c82e1a
wip
sven1977 Oct 26, 2024
26f1649
wip
sven1977 Oct 27, 2024
7dae36a
wip
sven1977 Oct 27, 2024
2 changes: 1 addition & 1 deletion doc/source/rllib/doc_code/training.py
@@ -4,7 +4,7 @@
try:
import gymnasium as gym

-env = gym.make("ALE/Pong-v5")
+env = gym.make("ale_py:ALE/Pong-v5")
obs, infos = env.reset()
except Exception:
import gym
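
The `ale_py:` prefix in the new ID uses gymnasium's `module:env_id` form: the part before the colon names a Python module that is imported first (so that its environments get registered), and the rest is the ID that is then looked up. The snippet below is only a sketch of how such a string splits apart. `split_env_id` is a hypothetical helper written for illustration, not a gymnasium or RLlib function.

```python
from typing import Optional, Tuple


def split_env_id(env_id: str) -> Tuple[Optional[str], str]:
    """Split "module:Name-v0" into (module, "Name-v0").

    Hypothetical helper for illustration only. The module part is None
    when the ID carries no "module:" prefix.
    """
    module, sep, name = env_id.partition(":")
    if not sep:
        return None, env_id
    return module, name


new_style = split_env_id("ale_py:ALE/Pong-v5")
old_style = split_env_id("ALE/Pong-v5")
print(new_style)
print(old_style)
```

With the prefixed form, the caller does not need a separate `import ale_py` line before calling `gym.make`.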
2 changes: 1 addition & 1 deletion doc/source/rllib/rllib-examples.rst
@@ -285,7 +285,7 @@ in roughly 5min. It can be run like this on a single g5.24xlarge (or g6.24xlarge
.. code-block:: bash

$ cd ray/rllib/tuned_examples/ppo
-$ python atari_ppo.py --env ALE/Pong-v5 --num-gpus=4 --num-env-runners=95
+$ python atari_ppo.py --env=ale_py:ALE/Pong-v5 --num-gpus=4 --num-env-runners=95

Note that some of the files in this folder are used for RLlib's daily or weekly
release tests as well.
2 changes: 1 addition & 1 deletion python/requirements.txt
@@ -37,7 +37,7 @@ colorful
rich
opentelemetry-sdk
fastapi
-gymnasium==0.28.1
+gymnasium==1.0.0a2
virtualenv!=20.21.1,>=20.0.24
opentelemetry-api
opencensus
16 changes: 3 additions & 13 deletions python/requirements/ml/rllib-test-requirements.txt
@@ -3,22 +3,18 @@
# Environment adapters.
# ---------------------
# Atari
-gymnasium==0.28.1

Review comment (PR author): Since gymnasium is already part of the main Ray requirements.txt file, we won't need this here anymore.

imageio==2.31.1
-ale_py==0.8.1
+ale_py==0.9.0
# For testing MuJoCo envs with gymnasium.
mujoco==2.3.6
dm_control==1.0.12

# For tests on PettingZoo's multi-agent envs.
-pettingzoo==1.23.1
-# When installing pettingzoo, chess is missing, even though its a dependancy
-# TODO: remove if a future pettingzoo and/or ray version fixes this dependancy issue.
-chess==1.7.0
+pettingzoo==1.24.3
pymunk==6.2.1
-supersuit==3.8.0
tinyscaler==1.2.6
shimmy
+supersuit==3.9.0

# Kaggle envs.
kaggle_environments==1.7.11
@@ -29,12 +25,6 @@ mlagents_envs==0.28.0

# For tests on minigrid.
minigrid
-# For tests on RecSim and Kaggle envs.
-# Explicitly depends on `tensorflow` and doesn't accept `tensorflow-macos`
-recsim==0.2.4; (sys_platform != 'darwin' or platform_machine != 'arm64')
-# recsim depends on dopamine-rl, but dopamine-rl pins gym <= 0.25.2, which break some envs
-dopamine-rl==4.0.5; (sys_platform != 'darwin' or platform_machine != 'arm64')
-tensorflow_estimator
# DeepMind's OpenSpiel
open-spiel==1.4
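
Because several pins move together in this diff, it can help to compare a local environment against them before running the RLlib tests. The following is a minimal sketch using only the standard library. `PINS`, `installed_version`, and `find_mismatches` are names invented for this example, and the versions are the ones visible in this 28-commit view (later commits in the PR may pin different ones).

```python
from importlib import metadata
from typing import Callable, Dict, Tuple

# Pins as they appear in this diff view (assumption: later commits may differ).
PINS: Dict[str, str] = {
    "gymnasium": "1.0.0a2",
    "ale-py": "0.9.0",
    "pettingzoo": "1.24.3",
    "supersuit": "3.9.0",
}


def installed_version(name: str) -> str:
    """Return the installed version of a distribution, or "not installed"."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], str] = installed_version,
) -> Dict[str, Tuple[str, str]]:
    """Map each mismatching package to (wanted, found)."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# A fake lookup keeps the example independent of the local environment.
fake_installed = {
    "gymnasium": "0.28.1",
    "ale-py": "0.9.0",
    "pettingzoo": "1.24.3",
    "supersuit": "3.9.0",
}
report = find_mismatches(
    PINS, lookup=lambda name: fake_installed.get(name, "not installed")
)
print(report)
```

Calling `find_mismatches(PINS)` without a `lookup` argument checks the real environment instead.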

23 changes: 5 additions & 18 deletions python/requirements_compiled.txt
@@ -75,10 +75,10 @@ aiosqlite==0.19.0
# via ypy-websocket
alabaster==0.7.13
# via sphinx
-ale-py==0.8.1
+ale-py==0.9.0
# via
# -r /ray/ci/../python/requirements/ml/rllib-test-requirements.txt
-# gym
+# gymnasium
alembic==1.12.1
# via
# aim
@@ -274,8 +274,6 @@ charset-normalizer==3.3.2
# via
# aiohttp
# requests
-chess==1.7.0
-# via -r /ray/ci/../python/requirements/ml/rllib-test-requirements.txt
chex==0.1.7
# via optax
clang-format==12.0.1
@@ -306,7 +304,6 @@ cloudpickle==2.2.0
# -r /ray/ci/../python/requirements/test-requirements.txt
# dask
# distributed
-# gym
# gymnasium
# hyperopt
# mlagents-envs
@@ -701,13 +698,7 @@ gsutil==5.27
# via -r /ray/ci/../python/requirements/docker/ray-docker-requirements.txt
gunicorn==20.1.0
# via mlflow
-gym==0.26.2
-# via
-# dopamine-rl
-# recsim
-gym-notices==0.0.8
-# via gym
-gymnasium==0.28.1
+gymnasium==1.0.0a2
# via
# -r /ray/ci/../python/requirements.txt
# -r /ray/ci/../python/requirements/ml/rllib-test-requirements.txt
@@ -1256,7 +1247,6 @@ numpy==1.24.4
# flax
# gpy
# gradio
-# gym
# gymnasium
# h5py
# hebo
@@ -1302,7 +1292,6 @@ numpy==1.24.4
# pytorch-lightning
# pywavelets
# raydp
-# recsim
# scikit-image
# scikit-learn
# scipy
@@ -1501,7 +1490,7 @@ pbr==6.0.0
# sarif-om
peewee==3.17.0
# via semgrep
-pettingzoo==1.23.1
+pettingzoo==1.24.3
# via -r /ray/ci/../python/requirements/ml/rllib-test-requirements.txt
pexpect==4.8.0
# via
@@ -1871,8 +1860,6 @@ querystring-parser==1.2.4
# tune-sklearn
raydp==1.7.0b20231020.dev0
# via -r /ray/ci/../python/requirements/ml/data-test-requirements.txt
-recsim==0.2.4 ; sys_platform != "darwin" or platform_machine != "arm64"
-# via -r /ray/ci/../python/requirements/ml/rllib-test-requirements.txt
redis==4.4.2
# via -r /ray/ci/../python/requirements/test-requirements.txt
regex==2023.10.3
@@ -2177,7 +2164,7 @@ statsmodels==0.14.0
# via
# hpbandster
# statsforecast
-supersuit==3.8.0
+supersuit==3.9.0
# via -r /ray/ci/../python/requirements/ml/rllib-test-requirements.txt
sympy==1.12
# via
2 changes: 1 addition & 1 deletion python/setup.py
@@ -295,7 +295,7 @@ def get_packages(self):

setup_spec.extras["rllib"] = setup_spec.extras["tune"] + [
"dm_tree",
-"gymnasium==0.28.1",
+"gymnasium==1.0.0a2",
"lz4",
"scikit-image",
"pyyaml",
14 changes: 1 addition & 13 deletions release/ray_release/byod/requirements_byod_3.9.txt
@@ -113,7 +113,7 @@ aiosignal==1.3.1 \
# via
# -c release/ray_release/byod/requirements_compiled.txt
# aiohttp
-ale-py==0.8.1 \
+ale-py==0.9.0 \
--hash=sha256:0006d80dfe7745eb5a93444492337203c8bc7eb594a2c24c6a651c5c5b0eaf09 \
--hash=sha256:0856ca777473ec4ae8a59f3af9580259adb0fd4a47d586a125a440c62e82fc10 \
--hash=sha256:0ffecb5c956749596030e464827642945162170a132d093c3d4fa2d7e5725c18 \
@@ -1231,17 +1231,6 @@ gsutil==5.27 \
# via
# -c release/ray_release/byod/requirements_compiled.txt
# -r release/ray_release/byod/requirements_byod_3.9.in
-gym[atari]==0.26.2 \
---hash=sha256:e0d882f4b54f0c65f203104c24ab8a38b039f1289986803c7d02cdbe214fbcc4
-# via
-# -c release/ray_release/byod/requirements_compiled.txt
-# -r release/ray_release/byod/requirements_byod_3.9.in
-gym-notices==0.0.8 \
---hash=sha256:ad25e200487cafa369728625fe064e88ada1346618526102659b4640f2b4b911 \
---hash=sha256:e5f82e00823a166747b4c2a07de63b6560b1acb880638547e0cabf825a01e463
-# via
-# -c release/ray_release/byod/requirements_compiled.txt
-# gym
h5py==3.10.0 \
--hash=sha256:012ab448590e3c4f5a8dd0f3533255bc57f80629bf7c5054cf4c87b30085063c \
--hash=sha256:212bb997a91e6a895ce5e2f365ba764debeaef5d2dca5c6fb7098d66607adf99 \
@@ -1707,7 +1696,6 @@ numpy==1.24.4 \
# via
# -c release/ray_release/byod/requirements_compiled.txt
# ale-py
-# gym
# h5py
# lightgbm
# ml-dtypes
2 changes: 1 addition & 1 deletion rllib/algorithms/algorithm_config.py
@@ -3110,7 +3110,7 @@ def is_atari(self) -> bool:
# Not yet determined, try to figure this out.
if self._is_atari is None:
# Atari envs are usually specified via a string like "PongNoFrameskip-v4"
-# or "ALE/Breakout-v5".
+# or "ale_py:ALE/Breakout-v5".
# We do NOT attempt to auto-detect Atari env for other specified types like
# a callable, to avoid running heavy logics in validate().
# For these cases, users can explicitly set `environment(atari=True)`.
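
The comments in this hunk describe a lazily determined flag: `None` means "not decided yet", string env specifiers are inspected, and callables are never instantiated just to find out. A rough sketch of that pattern follows. `EnvSettings` is a hypothetical stand-in class, not RLlib's `AlgorithmConfig`, and the string test here only looks for `ALE/`, so it would miss older IDs such as `PongNoFrameskip-v4`.

```python
from typing import Any, Optional


class EnvSettings:
    """Hypothetical stand-in for a config object (not RLlib's AlgorithmConfig)."""

    def __init__(self, env: Any, atari: Optional[bool] = None) -> None:
        self.env = env
        # None means "not yet determined"; True/False is an explicit
        # user setting or a cached answer.
        self._is_atari = atari

    @property
    def is_atari(self) -> bool:
        if self._is_atari is None:
            # Only plain strings are inspected. Callables and classes are
            # not invoked here, which keeps the check cheap.
            self._is_atari = isinstance(self.env, str) and "ALE/" in self.env
        return self._is_atari


from_string = EnvSettings("ale_py:ALE/Breakout-v5").is_atari
from_callable = EnvSettings(lambda ctx: None).is_atari
from_override = EnvSettings(lambda ctx: None, atari=True).is_atari
print(from_string, from_callable, from_override)
```

The explicit `atari=True` override mirrors the escape hatch that the source comment mentions for env specifiers that are not strings.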
2 changes: 1 addition & 1 deletion rllib/algorithms/dreamerv3/tests/test_dreamerv3.py
@@ -64,7 +64,7 @@ def test_dreamerv3_compilation(self):
for env in [
"FrozenLake-v1",
"CartPole-v1",
-"ALE/MsPacman-v5",
+"ale_py:ALE/MsPacman-v5",
"Pendulum-v1",
]:
print("Env={}".format(env))
17 changes: 11 additions & 6 deletions rllib/algorithms/dreamerv3/utils/env_runner.py
@@ -12,6 +12,7 @@
from typing import List, Tuple

import gymnasium as gym
+from gymnasium.wrappers.vector import DictInfoToList
import numpy as np
import tree # pip install dm_tree

@@ -73,7 +74,7 @@ def __init__(

# Create the gym.vector.Env object.
# Atari env.
-if self.config.env.startswith("ALE/"):
+if "ALE/" in self.config.env:
# TODO (sven): This import currently causes a Tune test to fail. Either way,
# we need to figure out how to properly setup the CI environment with
# the correct versions of all gymnasium-related packages.
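
The switch from `startswith("ALE/")` to a substring test matters because the new IDs carry the `ale_py:` module prefix, so `ALE/` is no longer at the start of the string. A small sketch of the difference, with helper names made up for this example:

```python
def prefix_check(env_id: str) -> bool:
    """Old-style test: only matches IDs that begin with "ALE/"."""
    return env_id.startswith("ALE/")


def substring_check(env_id: str) -> bool:
    """New-style test: matches "ALE/" anywhere in the ID."""
    return "ALE/" in env_id


ids = ["ALE/MsPacman-v5", "ale_py:ALE/MsPacman-v5", "CartPole-v1"]
results = {env_id: (prefix_check(env_id), substring_check(env_id)) for env_id in ids}
for env_id, (by_prefix, by_substring) in results.items():
    print(env_id, by_prefix, by_substring)
```

Only the substring test recognizes both the old and the prefixed spelling, which is why the hunk changes the condition together with the env IDs.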
@@ -160,11 +161,15 @@ def _entry_point():
env_descriptor=self.config.env,
),
)
-# Create the vectorized gymnasium env.
-self.env = gym.vector.make(
-"dreamerv3-custom-env-v0",
-num_envs=self.config.num_envs_per_env_runner,
-asynchronous=False, # self.config.remote_worker_envs,
+# Wrap into `DictInfoToList` wrapper to get infos as lists.
+self.env = DictInfoToList(
+gym.make_vec(
+"dreamerv3-custom-env-v0",
+num_envs=self.config.num_envs_per_env_runner,
+vectorization_mode=(
+"async" if self.config.remote_worker_envs else "sync"
+),
+)
)
self.num_envs = self.env.num_envs
assert self.num_envs == self.config.num_envs_per_env_runner
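
gymnasium vector environments report `infos` as a single dict whose values hold one entry per sub-environment, with a `_key` boolean mask next to each key marking which sub-environments actually reported it. `DictInfoToList` turns that into a list with one dict per sub-environment, which is the shape the surrounding env-runner code expects. The function below is a simplified pure-Python illustration of that conversion. It is not the library's implementation, it uses plain lists instead of NumPy arrays, and it ignores nested info dicts.

```python
from typing import Any, Dict, List


def dict_info_to_list(infos: Dict[str, Any], num_envs: int) -> List[Dict[str, Any]]:
    """Convert vector-style infos into one info dict per sub-environment.

    Illustration only: keys starting with "_" are treated as boolean masks
    that say which sub-environments reported the corresponding key.
    """
    per_env: List[Dict[str, Any]] = [{} for _ in range(num_envs)]
    for key, values in infos.items():
        if key.startswith("_"):
            continue  # masks are consumed below, not copied into the output
        mask = infos.get("_" + key, [True] * num_envs)
        for i in range(num_envs):
            if mask[i]:
                per_env[i][key] = values[i]
    return per_env


vector_infos = {"lives": [3, 0, 2], "_lives": [True, False, True]}
result = dict_info_to_list(vector_infos, num_envs=3)
print(result)
```

In the hunk above the same wrapper is applied to the env returned by `gym.make_vec`, with `vectorization_mode` chosen as `"async"` or `"sync"` from `remote_worker_envs`.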
4 changes: 2 additions & 2 deletions rllib/algorithms/ppo/tests/test_ppo.py
@@ -160,7 +160,7 @@ def test_ppo_compilation_w_connectors(self):
num_iterations = 2

for fw in framework_iterator(config):
-for env in ["FrozenLake-v1", "ALE/MsPacman-v5"]:
+for env in ["FrozenLake-v1", "ale_py:ALE/MsPacman-v5"]:
print("Env={}".format(env))
for lstm in [False, True]:
print("LSTM={}".format(lstm))
@@ -226,7 +226,7 @@ def test_ppo_compilation_and_schedule_mixins(self):
num_iterations = 2

for fw in framework_iterator(config):
-for env in ["FrozenLake-v1", "ALE/MsPacman-v5"]:
+for env in ["FrozenLake-v1", "ale_py:ALE/MsPacman-v5"]:
print("Env={}".format(env))
for lstm in [False, True]:
print("LSTM={}".format(lstm))
4 changes: 2 additions & 2 deletions rllib/algorithms/ppo/tests/test_ppo_rl_module.py
@@ -140,7 +140,7 @@ def tearDownClass(cls):
def test_rollouts(self):
# TODO: Add FrozenLake-v1 to cover LSTM case.
frameworks = ["torch", "tf2"]
-env_names = ["CartPole-v1", "Pendulum-v1", "ALE/Breakout-v5"]
+env_names = ["CartPole-v1", "Pendulum-v1", "ale_py:ALE/Breakout-v5"]
fwd_fns = ["forward_exploration", "forward_inference"]
lstm = [True, False]
config_combinations = [frameworks, env_names, fwd_fns, lstm]
@@ -181,7 +181,7 @@ def test_rollouts(self):
def test_forward_train(self):
# TODO: Add FrozenLake-v1 to cover LSTM case.
frameworks = ["tf2", "torch"]
-env_names = ["CartPole-v1", "Pendulum-v1", "ALE/Breakout-v5"]
+env_names = ["CartPole-v1", "Pendulum-v1", "ale_py:ALE/Breakout-v5"]
lstm = [False, True]
config_combinations = [frameworks, env_names, lstm]
for config in itertools.product(*config_combinations):
2 changes: 1 addition & 1 deletion rllib/algorithms/ppo/tests/test_ppo_with_env_runner.py
@@ -103,7 +103,7 @@ def test_ppo_compilation_and_schedule_mixins(self):
# "CliffWalking-v0",
"CartPole-v1",
"Pendulum-v1",
-]: # "ALE/Breakout-v5"]:
+]: # "ale_py:ALE/Breakout-v5"]:
print("Env={}".format(env))
for lstm in [False]:
print("LSTM={}".format(lstm))
2 changes: 1 addition & 1 deletion rllib/algorithms/ppo/tests/test_ppo_with_rl_module.py
@@ -99,7 +99,7 @@ def test_ppo_compilation_and_schedule_mixins(self):

for fw in framework_iterator(config, frameworks=("tf2", "torch")):
# TODO (Kourosh) Bring back "FrozenLake-v1"
-for env in ["CartPole-v1", "Pendulum-v1", "ALE/Breakout-v5"]:
+for env in ["CartPole-v1", "Pendulum-v1", "ale_py:ALE/Breakout-v5"]:
print("Env={}".format(env))
for lstm in [False]:
print("LSTM={}".format(lstm))
6 changes: 3 additions & 3 deletions rllib/algorithms/tests/test_algorithm_config.py
@@ -145,11 +145,11 @@ def test_rollout_fragment_length(self):
def test_detect_atari_env(self):
"""Tests that we can properly detect Atari envs."""
config = AlgorithmConfig().environment(
-env="ALE/Breakout-v5", env_config={"frameskip": 1}
+env="ale_py:ALE/Breakout-v5", env_config={"frameskip": 1}
)
self.assertTrue(config.is_atari)

-config = AlgorithmConfig().environment(env="ALE/Pong-v5")
+config = AlgorithmConfig().environment(env="ale_py:ALE/Pong-v5")
self.assertTrue(config.is_atari)

config = AlgorithmConfig().environment(env="CartPole-v1")
@@ -158,7 +158,7 @@ def test_detect_atari_env(self):

config = AlgorithmConfig().environment(
env=lambda ctx: gym.make(
-"ALE/Breakout-v5",
+"ale_py:ALE/Breakout-v5",
frameskip=1,
)
)