This repository is the official implementation of Curiosity-Driven Exploration via Latent Bayesian Surprise.
If you find the code useful, please cite our work using:
title={Curiosity-Driven Exploration via Latent Bayesian Surprise},
journal={Proceedings of the AAAI Conference on Artificial Intelligence},
author={Mazzaglia, Pietro and Catal, Ozan and Verbelen, Tim and Dhoedt, Bart},
Create and activate a conda environment by running:
conda create -n lbs python=3.8
conda activate lbs
To install dependencies, run:
pip install -r requirements.txt
To run the experiments, use the following commands:
python --env-name "MountainCarSparse-v0" --algo ppo-lbs --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 --lr 3e-4 --entropy-coef 0.01 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 --gae-lambda 0.95 --num-env-steps 102400 --beta 0.1 --no-cuda --log-dir ./logs/mountaincarsparse/lbs-0 --seed 0
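As a rough guide to how the size-related flags in the command above interact, here is a minimal sketch. It assumes the usual PPO convention in which each update collects `--num-steps` transitions from each of `--num-processes` environments and then splits that batch into `--num-mini-batch` parts; the helper `ppo_schedule` is hypothetical and not part of this repository, so check the training script for the exact behaviour.

```python
def ppo_schedule(num_env_steps, num_steps, num_processes, num_mini_batch):
    """Derive run sizes from the CLI values (assumed semantics, not the repo's code)."""
    # Transitions collected before each policy update.
    batch_size = num_steps * num_processes
    # Number of policy updates over the whole run (integer division).
    num_updates = num_env_steps // batch_size
    # Transitions used in each gradient step.
    mini_batch_size = batch_size // num_mini_batch
    return num_updates, batch_size, mini_batch_size


# Values taken from the MountainCarSparse-v0 command above.
num_updates, batch_size, mini_batch_size = ppo_schedule(
    num_env_steps=102400, num_steps=2048, num_processes=1, num_mini_batch=32
)
print(num_updates, batch_size, mini_batch_size)  # 50 2048 64
```

Under the same assumption, the arcade commands (128 processes, 128 steps, 8 mini-batches) collect 16384 transitions per update.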
Make sure you have correctly installed and configured mujoco-py
python --env-name "MagellanAnt-v2" --algo ppo-lbs --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 --lr 3e-4 --entropy-coef 0.01 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 --gae-lambda 0.95 --num-env-steps 251904 --beta 0.1 --no-cuda --log-dir ./logs/antmaze/lbs-0 --seed 0
Make sure you have correctly installed and configured mujoco-py
python --env-name "HalfCheetahSparse-v3" --algo ppo-lbs --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 --lr 3e-4 --entropy-coef 0.01 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 --gae-lambda 0.95 --num-env-steps 501760 --beta 0.1 --no-cuda --log-dir ./logs/halfcheetahsparse/lbs-0 --seed 0
Make sure you have correctly installed and configured atari-py
(you may need to import the Atari ROMs).
python --env-name SpaceInvadersNoFrameskip-v4 --algo ppo-lbs --use-gae --lr 1e-4 --clip-param 0.1 --value-loss-coef 0.5 --num-processes 128 --num-steps 128 --num-mini-batch 8 --ppo-epoch 3 --log-interval 1 --entropy-coef 0.001 --num-env-steps 100000000 --log-dir ./logs/SpaceInvaders/lbs-5 --seed 1 --beta 0.01
Make sure you have correctly configured gym-retro
(you may need to import Mario's ROM).
python --env-name MarioBrosNoFrameskip-v4 --algo ppo-lbs --use-gae --lr 1e-4 --clip-param 0.1 --value-loss-coef 0.5 --num-processes 128 --num-steps 128 --num-mini-batch 8 --ppo-epoch 3 --log-interval 1 --entropy-coef 0.001 --num-env-steps 100000000 --log-dir ./logs/MarioBros/lbs-5 --seed 1 --beta 0.01
python --env-name "MountainCarStochastic-Frozen" --algo ppo-lbs --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 --lr 3e-4 --entropy-coef 0.01 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 --gae-lambda 0.95 --num-env-steps 102400 --beta 2. --no-cuda --log-dir ./logs/mountaincarstoch-frozen/lbs-0 --seed 0
We would like to thank the authors of the following repositories for their open source code:
PPO implementation [PPO training code]
Model-based active exploration [Ant Maze environment]
Large-Scale Study of Curiosity-Driven Learning [Arcade games experiments]