DeepRL

If you have any questions or want to report a bug, please open an issue instead of emailing me directly.

Modularized implementation of popular deep RL algorithms in PyTorch.
Switch easily between toy tasks and challenging games.

Implemented algorithms:

(Double/Dueling/Prioritized) Deep Q-Learning (DQN)
Categorical DQN (C51)
Quantile Regression DQN (QR-DQN)
(Continuous/Discrete) Synchronous Advantage Actor Critic (A2C)
Synchronous N-Step Q-Learning (N-Step DQN)
Deep Deterministic Policy Gradient (DDPG)
Proximal Policy Optimization (PPO)
The Option-Critic Architecture (OC)
Twin Delayed DDPG (TD3)
Off-PAC-KL/TruncatedETD/DifferentialGQ/MVPI/ReverseRL/COF-PAC/GradientDICE/Bi-Res-DDPG/DAC/Geoff-PAC/QUOTA/ACE

The DQN agent, as well as C51 and QR-DQN, has an asynchronous actor for data generation and an asynchronous replay buffer for transferring data to GPU. Using 1 RTX 2080 Ti and 3 threads, the DQN agent runs for 10M steps (40M frames, 2.5M gradient updates) for Breakout within 6 hours.
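The step, frame, and update counts quoted above are related by simple arithmetic. The sketch below reproduces them under two assumptions that the README does not state explicitly but that match the quoted figures: each agent step covers 4 emulator frames, and one gradient update is performed every 4 agent steps. The function name `training_budget` is hypothetical and is not part of this codebase.

```python
# Assumptions (not stated in the README): 4 emulator frames per agent step
# and one gradient update every 4 agent steps.
FRAME_SKIP = 4
STEPS_PER_UPDATE = 4


def training_budget(agent_steps, hours):
    """Return (frames, gradient_updates, steps_per_second) for a run."""
    frames = agent_steps * FRAME_SKIP
    updates = agent_steps // STEPS_PER_UPDATE
    steps_per_second = agent_steps / (hours * 3600)
    return frames, updates, steps_per_second


frames, updates, sps = training_budget(10_000_000, 6)
print(frames, updates, round(sps))  # 40000000 2500000 463
```

Because the run finishes within 6 hours, roughly 463 agent steps per second is a lower bound on the throughput implied by these numbers.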

Dependency

PyTorch v1.5.1
See Dockerfile and requirements.txt for more details

Usage

examples.py contains examples for all the implemented algorithms.
Dockerfile contains the environment for generating the curves below.
Please use this BibTeX entry if you want to cite this repo:

@misc{deeprl,
  author = {Zhang, Shangtong},
  title = {Modularized Implementation of Deep RL Algorithms in PyTorch},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub Repository},
  howpublished = {\url{https://github.com/ShangtongZhang/DeepRL}},
}

Curves (commit `9e811e`)

BreakoutNoFrameskip-v4 (1 run)

Mujoco

DDPG/TD3 evaluation performance. (5 runs, mean + standard error)
PPO online performance. (5 runs, mean + standard error, smoothed by a window of size 10)
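For readers who want to aggregate their own runs in the same way, here is a minimal sketch of computing a per-step mean and standard error across runs, followed by moving-average smoothing. It is not the plotting code used for the curves above: the function names are hypothetical, the standard error uses the sample standard deviation, and the smoothing is a trailing window, all of which are assumptions.

```python
import math
import statistics


def mean_and_standard_error(runs):
    """runs: list of equal-length score lists, one per run.

    Returns (means, sems), one value per time step.
    Assumes at least two runs so the sample standard deviation is defined.
    """
    n = len(runs)
    means, sems = [], []
    for values in zip(*runs):
        means.append(statistics.mean(values))
        sems.append(statistics.stdev(values) / math.sqrt(n))
    return means, sems


def smooth(curve, window=10):
    """Trailing moving average; early points average over what is available."""
    out = []
    running = 0.0
    for i, x in enumerate(curve):
        running += x
        if i >= window:
            running -= curve[i - window]
        out.append(running / min(i + 1, window))
    return out


# Toy data: two runs, two time steps each.
means, sems = mean_and_standard_error([[1.0, 2.0], [3.0, 4.0]])
smoothed = smooth([1.0, 2.0, 3.0, 4.0], window=2)
```

With 5 runs, `runs` would hold five lists, and `smooth(means, window=10)` would give a curve smoothed by a window of size 10 in the sense assumed here.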

References

Code of My Papers

They are located in other branches of this repo and seem to be good examples of how to use this codebase.
