This repo serves as a companion to *An Overview of Reinforcement Learning*. It contains PyTorch implementations of the DQN, DDPG, and PPO (for continuous action spaces) algorithms.
The DQN and DDPG algorithms are implemented in parallel in the style of APE-X: multiple CPU actor processes asynchronously communicate with a single CPU replay buffer process and a single GPU learner process.
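The repo's actual process layout isn't reproduced here, but a minimal sketch of that APE-X-style division of labor might look like the following. The `actor`, `replay`, and `learner` functions, the queue names, and the step counts are illustrative assumptions, not the repo's code:

```python
import random
import torch.multiprocessing as mp

def actor(actor_id, transition_queue, num_steps=1000):
    # Stand-in for the environment loop: each actor would push
    # (s, a, r, s', done) transitions to the replay process.
    for step in range(num_steps):
        transition_queue.put((actor_id, step))

def replay(transition_queue, batch_queue, batch_size=32):
    # A single process owns the buffer: it ingests transitions and
    # serves uniform random batches to the learner.
    buffer = []
    while True:
        buffer.append(transition_queue.get())  # block for new data
        if len(buffer) >= batch_size:
            batch_queue.put(random.sample(buffer, batch_size))

def learner(batch_queue, num_updates=100):
    # The GPU learner consumes batches and performs gradient steps
    # (the network update itself is omitted).
    for _ in range(num_updates):
        batch = batch_queue.get()

if __name__ == "__main__":
    transition_queue = mp.Queue()
    # Bounding the batch queue makes the replay process block rather
    # than race ahead of the learner.
    batch_queue = mp.Queue(maxsize=8)
    actors = [mp.Process(target=actor, args=(i, transition_queue), daemon=True)
              for i in range(4)]
    buffer_proc = mp.Process(target=replay, args=(transition_queue, batch_queue),
                             daemon=True)
    for p in actors:
        p.start()
    buffer_proc.start()
    learner(batch_queue)  # run the learner in the main process
```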
To run any algorithm, run the command `python agent.py --env <ENV_ID>`. Run `python agent.py --help` for further information about command line arguments.
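For example, to train on a continuous-control task (assuming a Gym environment id such as `HalfCheetah-v2` is available; the ids supported depend on your installed Gym version):

```
python agent.py --env HalfCheetah-v2
```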
While it is straightforward to extend the PPO implementation to be parallel as well, decoupled actor processes turned out to perform poorly: each process maintained its own running mean and standard deviation of the observations and rewards, and those statistics drifted apart. Routing all actor processes through a single process that tracked the shared running statistics fixed the discrepancy, but introduced a communication bottleneck that negated the advantage of parallelism in the first place. OpenAI's Baselines implementation solves this with lower-level primitives that I chose to avoid for the sake of simplicity.
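For context, the kind of running normalizer in question is sketched below. It folds batches into a running mean and variance with the parallel-update (Chan et al.) formula, a common pattern in PPO implementations; the class name and the demonstration at the end are illustrative, not the repo's code:

```python
import numpy as np

class RunningMeanStd:
    # Tracks a running mean and variance of a data stream; whole
    # batches can be folded in at once via the parallel-update formula.
    def __init__(self, shape=()):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = 1e-4  # small prior count avoids division by zero

    def update(self, batch):
        batch = np.asarray(batch, dtype=np.float64)
        batch_mean = batch.mean(axis=0)
        batch_var = batch.var(axis=0)
        batch_count = batch.shape[0]

        delta = batch_mean - self.mean
        total = self.count + batch_count
        # Combine the two (mean, var, count) summaries into one.
        new_mean = self.mean + delta * batch_count / total
        m_a = self.var * self.count
        m_b = batch_var * batch_count
        m2 = m_a + m_b + delta ** 2 * self.count * batch_count / total
        self.mean, self.var, self.count = new_mean, m2 / total, total

    def normalize(self, x):
        return (x - self.mean) / np.sqrt(self.var + 1e-8)

# Two actors that see different slices of the data end up with
# different statistics -- the divergence described above.
rms_a, rms_b = RunningMeanStd(), RunningMeanStd()
rms_a.update(np.random.normal(0.0, 1.0, size=(1000, 1)))
rms_b.update(np.random.normal(5.0, 2.0, size=(1000, 1)))
print(rms_a.mean, rms_b.mean)  # the two normalizers disagree
```

Because the same observation normalized by two such objects yields different network inputs, policies trained against decoupled normalizers effectively see different state distributions, which is why the single shared tracker (and its attendant communication cost) was needed.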