This repository contains the code for running the experiments in the paper "Reward Propagation Using Graph Convolutional Networks", which was presented as a spotlight at NeurIPS 2020. The implementation builds on three existing codebases: gym-miniworld, a PyTorch PPO implementation, and Thomas Kipf's PyTorch GCN implementation.
# PyTorch
conda install pytorch torchvision -c soumith
# Other requirements
pip install -r requirements.txt
pip install mujoco-py==2.0.2.2  # optional
# Install PyGCN
python setup_gcn.py install
To launch a run on one of the Atari games, use the following command:
python control/main.py --num-frames 10000000 --algo ppo --use-gae --lr 2.5e-4 --clip-param 0.1 --value-loss-coef 0.5 --num-processes 8 --num-steps 128 --num-mini-batch 4 --gcn_alpha 0.9 --log-interval 1 --env-name ZaxxonNoFrameskip-v4 --seed 0 --entropy-coef 0.01 --use-logger --folder results
To launch a run on one of the delayed MuJoCo environments, use the following command:
python control/main.py --num-frames 3000000 --algo ppo --use-gae --lr 3e-4 --clip-param 0.1 --value-loss-coef 0.5 --num-processes 1 --ppo-epoch 10 --num-steps 2048 --num-mini-batch 32 --gcn_alpha 0.6 --log-interval 1 --env-name Walker2d-v2 --seed 0 --entropy-coef 0.0 --use-logger --folder results --reward_freq 20
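The `--reward_freq 20` flag in the command above configures the delayed-reward setting. As a rough illustration, and assuming (this README does not state it) that "delayed" means per-step rewards are accumulated and only handed to the agent every `reward_freq` steps, the effect on a reward sequence could be sketched as below. The function `delay_rewards` is hypothetical and is not taken from this repository.

```python
def delay_rewards(rewards, reward_freq):
    """Accumulate per-step rewards and release the sum every `reward_freq` steps.

    ASSUMPTION: this mirrors one plausible reading of --reward_freq;
    the actual environment wrapper in the repository may differ.
    Any reward still held at the end of the episode is released on
    the final step so that the episode return is unchanged.
    """
    delayed = []
    accumulated = 0.0
    for step, reward in enumerate(rewards, start=1):
        accumulated += reward
        if step % reward_freq == 0 or step == len(rewards):
            delayed.append(accumulated)  # release the accumulated reward
            accumulated = 0.0
        else:
            delayed.append(0.0)  # the agent sees no reward on this step
    return delayed


print(delay_rewards([1.0, 1.0, 1.0, 1.0, 1.0], reward_freq=2))
# prints [0.0, 2.0, 0.0, 2.0, 1.0]
```

Under this reading the total return of an episode is preserved, but the learning signal becomes sparser as `reward_freq` grows.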
If you found our paper useful or interesting, please consider citing it:
@inproceedings{NEURIPS2020_97062741,
author = {Klissarov, Martin and Precup, Doina},
booktitle = {Advances in Neural Information Processing Systems},
editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
pages = {12895--12908},
publisher = {Curran Associates, Inc.},
title = {Reward Propagation Using Graph Convolutional Networks},
url = {https://proceedings.neurips.cc/paper/2020/file/970627414218ccff3497cb7a784288f5-Paper.pdf},
volume = {33},
year = {2020}
}