
Replay Memory as An Empirical MDP: Combining Conservative Estimation with Experience Replay

(Overview figure)

Overview

  • PyTorch implementation of Conservative Estimation with Experience Replay (CEER).

  • The method is tested on the Sokoban, MiniGrid, and MinAtar environments.

Installation

pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt
  • Tested with Python 3.7.11 and CUDA 11.4.
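
  • To confirm that the pinned CUDA 11.3 wheels installed correctly, a quick sanity check like the following can be run (the expected version string comes from the install command above; whether CUDA is available depends on your local driver):

import torch

# Verify the pinned PyTorch build and GPU visibility.
print(torch.__version__)          # expected: 1.11.0+cu113
print(torch.cuda.is_available())  # True if the CUDA 11.3 build matches the local driver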

Running Experiments

python main.py
  • Modify atari_name_list in ceer/arguments.py for different environments.

  • For example, 'atari_name_list': ['Sokoban-Push_5x5_1_120'].

  • Other hyperparameters, such as sample_method_para (alpha) and policy_loss_para (lambda), are also set in ceer/arguments.py; see the sketch after this list.
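
As a rough illustration only: the key names below are the ones mentioned in this README, but the exact dictionary layout of ceer/arguments.py and the numeric values are assumptions, not the repository's actual defaults.

# Sketch of the relevant entries in ceer/arguments.py (structure and values are placeholders)
args = {
    'atari_name_list': ['Sokoban-Push_5x5_1_120'],  # environment(s) to train on
    'sample_method_para': 0.5,                      # alpha: sampling-method hyperparameter
    'policy_loss_para': 0.5,                        # lambda: policy-loss weight
}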

Bibtex

@inproceedings{zhang2023replay,
  title={Replay Memory as An Empirical {MDP}: Combining Conservative Estimation with Experience Replay},
  author={Hongming Zhang and Chenjun Xiao and Han Wang and Jun Jin and Bo Xu and Martin M{\"u}ller},
  booktitle={The Eleventh International Conference on Learning Representations},
  year={2023},
  url={https://openreview.net/forum?id=SjzFVSJUt8S}
}

Acknowledgements
