Implementation of Distributed Reinforcement Learning with TensorFlow

Information

20 actors with 1 learner.
Implemented with distributed TensorFlow in a server-client architecture.
Recurrent Experience Replay in Distributed Reinforcement Learning (R2D2) is run on Breakout-Deterministic-v4 as a POMDP (the observation is withheld with 20% probability).
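The partial observability described above can be sketched as a thin wrapper that withholds each observation with a fixed probability. This is a minimal illustration, not the repository's code: the class name `FlickeringObservation`, the `blank_obs` argument, and the choice to also mask the observation returned by `reset()` are assumptions, and the wrapped environment is assumed to follow the classic Gym interface where `step` returns `(obs, reward, done, info)`.

```python
import random


class FlickeringObservation:
    """Withhold observations from an env-like object with a fixed probability.

    Hypothetical helper: `env` only needs reset() and step(action) in the
    classic Gym style, and `blank_obs` is whatever stands in for a missing
    frame (for Atari, typically an all-zero array of the frame's shape).
    """

    def __init__(self, env, blank_obs, drop_prob=0.2, seed=None):
        if not 0.0 <= drop_prob <= 1.0:
            raise ValueError("drop_prob must be in [0, 1]")
        self.env = env
        self.blank_obs = blank_obs
        self.drop_prob = drop_prob
        self.rng = random.Random(seed)

    def _maybe_blank(self, obs):
        # random() is uniform on [0, 1), so this fires with probability drop_prob.
        if self.rng.random() < self.drop_prob:
            return self.blank_obs
        return obs

    def reset(self):
        return self._maybe_blank(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # Only the observation is masked; reward and done pass through unchanged.
        return self._maybe_blank(obs), reward, done, info
```

Because the agent sometimes receives a blank frame, it has to carry information across time steps, which is the situation a recurrent agent such as R2D2 is designed for.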

Dependency

opencv-python
gym[atari]
tensorboardX
tensorflow==1.14.0

Implementation

Asynchronous Methods for Deep Reinforcement Learning
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
Distributed Prioritized Experience Replay
Recurrent Experience Replay in Distributed Reinforcement Learning

How to Run

A3C: Asynchronous Methods for Deep Reinforcement Learning

CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name learner --task 0

CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_a3c.py --job_name actor --task 19
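In distributed TensorFlow 1.x, a job name and a task index identify one process inside a cluster definition, which is presumably how the `--job_name` and `--task` flags are used here. The sketch below builds such a definition for 1 learner and 20 actors. The function name `build_cluster`, the host, and the port numbering are hypothetical and may not match what the training scripts actually use.

```python
def build_cluster(num_actors=20, host="localhost", base_port=8000):
    """Return a dict of job name -> list of "host:port" addresses.

    Hypothetical layout: the learner takes base_port and actor i takes
    base_port + 1 + i. The real scripts may use different addresses.
    """
    if num_actors < 1:
        raise ValueError("need at least one actor")
    learner = ["{}:{}".format(host, base_port)]
    actors = ["{}:{}".format(host, base_port + 1 + i) for i in range(num_actors)]
    return {"learner": learner, "actor": actors}


cluster = build_cluster()

# With tensorflow==1.14.0 installed, each process would then start its own
# server from the flags it was given, along these lines:
#   import tensorflow as tf
#   spec = tf.train.ClusterSpec(cluster)
#   server = tf.train.Server(spec, job_name="actor", task_index=0)
```

Every process must be given the same cluster definition. Only the job name and task index differ between the learner and the individual actors.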

Ape-X: Distributed Prioritized Experience Replay

python train_apex.py --job_name learner --task 0

CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_apex.py --job_name actor --task 19

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

python train_impala.py --job_name learner --task 0

CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_impala.py --job_name actor --task 19

R2D2: Recurrent Experience Replay in Distributed Reinforcement Learning

python train_r2d2.py --job_name learner --task 0

CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 0
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 1
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 2
...
CUDA_VISIBLE_DEVICES=-1 python train_r2d2.py --job_name actor --task 39
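Typing one command per actor gets tedious, so the launch pattern used in this section can be scripted. The following is a minimal sketch under stated assumptions: the helper names `build_commands` and `launch` are hypothetical, the learner inherits the parent environment unchanged, and `sys.executable` stands in for `python`. Only the `--job_name` and `--task` flags and the `CUDA_VISIBLE_DEVICES=-1` setting for actors come from the commands in this README.

```python
import os
import subprocess
import sys


def build_commands(script, num_actors):
    """Return (argv, extra_env) pairs: one learner first, then the actors."""
    commands = [([sys.executable, script, "--job_name", "learner", "--task", "0"], {})]
    for task in range(num_actors):
        argv = [sys.executable, script, "--job_name", "actor", "--task", str(task)]
        # Actors are kept off the GPU, matching the actor commands in this README.
        commands.append((argv, {"CUDA_VISIBLE_DEVICES": "-1"}))
    return commands


def launch(script, num_actors):
    """Start every process and return the Popen handles (learner first)."""
    procs = []
    for argv, extra_env in build_commands(script, num_actors):
        # Extend the current environment instead of replacing it.
        env = dict(os.environ, **extra_env)
        procs.append(subprocess.Popen(argv, env=env))
    return procs
```

For example, `launch("train_apex.py", 20)` would start the Ape-X learner plus actors 0 through 19, and calling `wait()` on each returned handle blocks until that process exits.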

Reference

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
Distributed Prioritized Experience Replay
Recurrent Experience Replay in Distributed Reinforcement Learning
deepmind/scalable_agent
google-research/seed-rl
Asynchronous_Advatnage_Actor_Critic
Relational_Deep_Reinforcement_Learning
Deep Recurrent Q-Learning for Partially Observable MDPs
