PROTO: Iterative Policy Regularized Offline-to-Online Reinforcement Learning

How to run the code

Install dependencies

These are the same setup instructions as in Implicit Q-Learning.

pip install --upgrade pip

pip install -r requirements.txt

# Installs the wheel compatible with CUDA 11 and cuDNN 8.
pip install --upgrade "jax[cuda]>=0.2.27" -f https://storage.googleapis.com/jax-releases/jax_releases.html

For other CUDA configurations, see the JAX installation instructions.
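
After installing, it can help to confirm that JAX imports and to see which backend it selected. The snippet below is only a minimal sketch and is not part of this repository: `check_jax_backend` is a hypothetical helper name, and it assumes `python` on your PATH is the interpreter of the environment you just installed into.

```shell
# Hypothetical helper (not part of this repository): report the backend JAX
# selected, or say why it could not be queried.
check_jax_backend() {
  # Assumption: `python` is the interpreter of the environment used for pip above.
  if ! command -v python >/dev/null 2>&1; then
    echo "python not found on PATH"
    return 0
  fi
  # jax.default_backend() and jax.devices() are standard JAX calls.
  python -c 'import jax; print("backend:", jax.default_backend()); print("devices:", jax.devices())' 2>/dev/null \
    || echo "JAX could not be imported or initialized; recheck the install step"
}

check_jax_backend
```

If the reported backend is `cpu` on a machine with a GPU, the installed wheel probably does not match the local CUDA/cuDNN versions.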

Example training code

Locomotion

bash 2online_mujoco.sh
bash 2online_mujoco_td3.sh

AntMaze

bash 2online_antmaze.sh
bash 2online_antmaze_td3.sh

Adroit

bash 2online_adroit.sh
bash 2online_adroit_td3.sh
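
To launch several of the scripts above back to back and keep their output, a small wrapper along the following lines may be convenient. This is a sketch under stated assumptions rather than something the repository provides: `run_all` and the `logs` directory are hypothetical names, and the wrapper assumes it is called from the directory that contains the scripts.

```shell
# Hypothetical wrapper (not part of this repository): run each launch script
# listed above in turn and write one log file per script.
run_all() {
  log_dir="${1:-logs}"   # assumed log directory name
  mkdir -p "$log_dir"
  for domain in mujoco antmaze adroit; do
    for script in "2online_${domain}.sh" "2online_${domain}_td3.sh"; do
      if [ ! -f "$script" ]; then
        echo "skipping $script (not found in $(pwd))"
        continue
      fi
      echo "running $script"
      # Capture stdout and stderr; continue with the next script on failure.
      bash "$script" > "$log_dir/${script%.sh}.log" 2>&1 || echo "$script exited with an error"
    done
  done
}
```

Calling `run_all logs` from the repository root runs the six scripts sequentially; remove entries from the `domain` list to restrict the run to one benchmark.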
