Reducing Maximization Bias and Risk in Hyperparameter Optimization for Reinforcement Learning and Learning-Based Control
This is the code for the paper entitled "Reducing Maximization Bias and Risk in Hyperparameter Optimization for Reinforcement Learning and Learning-Based Control". The implementation is adapted from Safe-Control-Gym.
Create and activate a Python 3.10 environment using `conda`, then install the package:

```bash
conda create -n pr-env python=3.10
conda activate pr-env
pip install --upgrade pip
pip install -e .
```
You may need to separately install `gmp`, a dependency of `pycddlib`:

```bash
conda install -c anaconda gmp
```
or

```bash
sudo apt-get install libgmp-dev
```
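To check that the installation succeeded, you can try importing the relevant modules from Python. A minimal sanity check, assuming the package keeps Safe-Control-Gym's module name `safe_control_gym` (pycddlib's import name is `cdd`):

```python
# Minimal sanity check for the editable install and the gmp-backed dependency.
# Assumes the package retains Safe-Control-Gym's module name, safe_control_gym.
import cdd               # pycddlib; may fail to import if gmp/libgmp is missing
import safe_control_gym  # installed above with `pip install -e .`

print("safe_control_gym located at:", safe_control_gym.__file__)
```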
To perform hyperparameter optimization, you may need a MySQL database:

```bash
sudo apt-get install mysql-server
```
To set it up, run the following commands sequentially:

```bash
sudo mysql
CREATE USER optuna@"%";
CREATE DATABASE {algo}_hpo;
GRANT ALL ON {algo}_hpo.* TO optuna@"%";
exit
```
You may replace `{algo}` with `gp_mpc`, `ppo`, `sac`, or `ddpg` to run the corresponding scripts.
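For reference, the HPO scripts use this MySQL database as an Optuna storage backend. Below is a minimal sketch of creating a study against it; the study name, storage URL, and objective are illustrative assumptions rather than the repository's exact configuration, and connecting requires a MySQL driver such as PyMySQL.

```python
import optuna

# Illustrative objective; the actual objective trains and evaluates the
# chosen controller (e.g. PPO on cartpole stabilization) and returns its score.
def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)  # hypothetical hyperparameter
    return lr  # placeholder value

study = optuna.create_study(
    study_name="ppo_hpo",                                # assumed study name
    storage="mysql+pymysql://optuna@localhost/ppo_hpo",  # the {algo}_hpo database created above
    sampler=optuna.samplers.TPESampler(),                # the sampler passed to main.sh
    direction="maximize",                                # assumed optimization direction
    load_if_exists=True,
)
study.optimize(objective, n_trials=5)
```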
The results for the toy examples in the paper can be reproduced in `toy_example.ipynb`.
To run hyperparameter optimization (HPO) for DDPG, run:

```bash
bash experiments/comparisons/rl/main.sh hostx TPESampler ddpg cartpole stab False
```

To run HPO for PPO, run:

```bash
bash experiments/comparisons/rl/main.sh hostx TPESampler ppo cartpole stab False
```

To run HPO for SAC, run:

```bash
bash experiments/comparisons/rl/main.sh hostx TPESampler sac cartpole stab False
```

To run HPO for GP-MPC, run:

```bash
bash experiments/comparisons/gpmpc/main.sh hostx TPESampler cartpole stab False
```
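After an HPO run finishes, the trials stored in MySQL can be inspected with Optuna. A small sketch, assuming the study name follows the `{algo}_hpo` convention used above and a MySQL driver such as PyMySQL is installed:

```python
import optuna

study = optuna.load_study(
    study_name="ppo_hpo",                                # assumed study name
    storage="mysql+pymysql://optuna@localhost/ppo_hpo",  # database created during setup
)
print("Best value:", study.best_value)
print("Best hyperparameters:", study.best_params)
print(study.trials_dataframe().head())                   # requires pandas
```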
You may need to adjust the path of `conda.sh` in the sub-scripts called by `main.sh`, such as `rl_hpo_strategy_eval.sh`.