ARLBench is a benchmark for hyperparameter optimization (HPO) in reinforcement learning (RL): evaluate your HPO methods quickly and on a representative set of environments! For more information, see our documentation. The dataset is available on Hugging Face.
- Lightning-fast JAX-based implementations of DQN, PPO, and SAC
- Compatible with many different environment domains via Gymnax, XLand and EnvPool
- Representative benchmark set of HPO settings
There are currently two different ways to install ARLBench. Whichever you choose, we recommend creating a virtual environment for the installation:
conda create -n arlbench python=3.10
conda activate arlbench
The instructions below will help you install the default version of ARLBench with the CPU version of JAX. If you want to run ARLBench on a GPU, we recommend checking the JAX installation guide to see how to install the correct version for your GPU setup before proceeding.
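Once JAX is installed, a quick way to confirm which backend it picked up is to query it directly. The snippet below is a minimal sketch, not part of ARLBench: the helper name `describe_jax_backend` is made up for illustration, and it only relies on `jax.default_backend()` and `jax.devices()`.

```python
def describe_jax_backend():
    """Return a one-line summary of the backend JAX will use."""
    try:
        import jax
    except ImportError:
        # JAX is missing, e.g. the wrong virtual environment is active.
        return "JAX is not installed in this environment"
    devices = jax.devices()  # list of devices visible to JAX
    # default_backend() is "cpu" for the default install described above.
    return f"backend={jax.default_backend()}, devices={len(devices)}"


print(describe_jax_backend())
```

If the reported backend is `cpu` on a machine with a GPU, the CPU-only JAX wheel is most likely the one installed.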
PyPI
You can install ARLBench using `pip`:
pip install arlbench
If you want to use envpool environments (not currently supported for Mac!), instead choose:
pip install arlbench[envpool]
From source: GitHub
First, you need to clone the ARLBench repository:
git clone git@github.com:automl/arlbench.git
cd arlbench
Then you can install the benchmark. For the base version, use:
make install
For the envpool functionality (not available on Mac!), instead use:
make install-envpool
Caution
Windows is currently neither supported nor tested. We recommend using the Windows Subsystem for Linux (WSL) if you're on a Windows machine.
Here are the two ways you can use ARLBench: via the command line or as an environment. To see them in action, take a look at our examples.
We provide a command line script for black-box configuration in ARLBench, which also saves the results in a 'results' directory. To execute one run of DQN on CartPole, simply run:
python run_arlbench.py
You can use Hydra's command line syntax to override parts of the configuration, for example to switch to PPO:
python run_arlbench.py algorithm=ppo
Or run several seeds one after another:
python run_arlbench.py -m autorl.seed=0,1,2,3,4
All hyperparameters to adapt are in the 'hp_config' and architecture settings in the 'nas_config', so to run a grid of different configurations for 5 seeds each, you can do this:
python run_arlbench.py -m autorl.seed=0,1,2,3,4 nas_config.hidden_size=8,16,32 hp_config.learning_rate=0.001,0.01
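To get a feel for how large such a sweep is, the sketch below expands the same comma-separated values into their Cartesian product, which is what Hydra's default sweeper does in multirun mode (`-m`). It is plain Python for illustration only and does not call Hydra or ARLBench; the helper `expand_grid` is a hypothetical name.

```python
from itertools import product

# The same overrides as in the command above, written as a dict.
sweep = {
    "autorl.seed": [0, 1, 2, 3, 4],
    "nas_config.hidden_size": [8, 16, 32],
    "hp_config.learning_rate": [0.001, 0.01],
}


def expand_grid(sweep):
    """Return one dict of overrides per combination of swept values."""
    keys = list(sweep)
    return [
        dict(zip(keys, values))
        for values in product(*(sweep[key] for key in keys))
    ]


runs = expand_grid(sweep)
print(len(runs))  # 5 seeds * 3 hidden sizes * 2 learning rates = 30
print(runs[0])
```

Each added value multiplies the number of jobs, so a grid like this one already launches 30 training runs.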
We recommend you create your own custom config files if using the CLI (for more information on this, check out Hydra's guide to config files). Our examples show what these can look like.
If you want to have specific control over the ARLBench loop, want to do dynamic configuration or learn based on the agent state, you should use the environment-like interface of ARLBench in your script.
To do so, import ARLBench and use the AutoRLEnv to run an RL agent:
from arlbench import AutoRLEnv
env = AutoRLEnv()
obs, info = env.reset()
action = env.config_space.sample_configuration()
obs, objectives, term, trunc, info = env.step(action)
Just like with RL agents, you can call 'step' multiple times until termination (which you define via the AutoRLEnv's config). For all configuration options, check out our documentation.
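Repeated stepping can be wrapped in a small loop. The following is a minimal sketch that uses only the calls shown above (`reset`, `config_space.sample_configuration`, `step`); the helper `run_until_done` and its `max_steps` safety cap are assumptions for illustration, not part of ARLBench.

```python
def run_until_done(env, max_steps=100):
    """Step an AutoRLEnv-style object with random configurations until it ends.

    Returns a list of (configuration, objectives) pairs, one per step.
    """
    obs, info = env.reset()
    history = []
    for _ in range(max_steps):
        # Random search: sample a fresh configuration for every step.
        action = env.config_space.sample_configuration()
        obs, objectives, term, trunc, info = env.step(action)
        history.append((action, objectives))
        if term or trunc:
            break
    return history


# Usage, assuming ARLBench is installed:
# from arlbench import AutoRLEnv
# history = run_until_done(AutoRLEnv())
```

Replacing the random sampling with your own HPO method only requires changing how `action` is chosen, for example based on `obs` or on the objectives collected so far.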
If you use ARLBench in your work, please cite us:
@misc{beckdierkes24,
title={ARLBench: Flexible and Efficient Benchmarking for Hyperparameter Optimization in Reinforcement Learning},
author={J. Becktepe and J. Dierkes and C. Benjamins and D. Salinas and A. Mohan and R. Rajan and F. Hutter and H. Hoos and M. Lindauer and T. Eimer},
year={2024},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2409.18827},
note={GitHub: https://github.com/automl/arlbench},
}