Algorithm | Authors | Publication | Code | Classification | Features | Details |
---|---|---|---|---|---|---|
Markov Decision Processes (MDP) | Puterman, M.L. | John Wiley & Sons 2014 | / | Model | / | TBD |
Temporal Difference (TD) Learning | Tesauro, G. | Communications of the ACM 1995 | / | Cornerstone | TBD | TBD |
Q-Learning | Watkins, C. J. et al. | Machine Learning 1992 | / | / | Q Table | TBD |
Deep Q-Networks (DQN) | Mnih, V. et al. | Nature 2015 | PyTorch | Q Networks | Deep network + Q-learning | TBD |
Deep Deterministic Policy Gradient (DDPG) | Lillicrap, T.P. et al. | arXiv 2015 | TBD | AC | Continuous control | TBD |
Trust Region Policy Optimization (TRPO) | Schulman, J. et al. | ICML 2015 | TBD | Policy | TBD | TBD |
Prioritized Experience Replay (PER) | Schaul, T. et al. | arXiv 2015 | TBD | Replay | TBD | TBD |
Deep Recurrent Q-Network (DRQN) | Hausknecht, M. et al. | AAAI 2015 | TBD | Q Networks | TBD | TBD |
Monte-Carlo Tree Search (MCTS) | Silver, D. et al. | Nature 2016 | TBD | TBD | TBD | TBD |
Double DQN | Van Hasselt, H. et al. | AAAI 2016 | TBD | Q Networks | TBD | TBD |
Dueling DQN | Wang, Z. et al. | ICML 2016 | TBD | Q Networks | TBD | TBD |
Asynchronous Advantage Actor-Critic (A3C) | Mnih, V. et al. | ICML 2016 | TBD | AC | TBD | TBD |
Noisy Networks | Fortunato, M. et al. | arXiv 2017 | TBD | Exploration | TBD | TBD |
Hindsight Experience Replay (HER) | Andrychowicz, M. et al. | NeurIPS 2017 | TBD | Replay | TBD | TBD |
Soft Q-Learning (SQL) | Haarnoja, T. et al. | ICML 2017 | TBD | TBD | TBD | TBD |
Distributional DQN | Bellemare, M.G. et al. | ICML 2017 | TBD | Q Networks | TBD | TBD |
Proximal Policy Optimization (PPO) | Schulman, J. et al. | arXiv 2017 | TBD | Policy | TBD | TBD |
Multi-Agent DDPG (MADDPG) | Lowe, R. et al. | NeurIPS 2017 | TBD | MADRL | TBD | TBD |
FeUdal Networks | Vezhnevets, A.S. et al. | ICML 2017 | TBD | HRL | TBD | TBD |
Twin Delayed DDPG (TD3) | Fujimoto, S. et al. | ICML 2018 | TBD | AC | TBD | TBD |
Soft Actor-Critic (SAC) | Haarnoja, T. et al. | ICML 2018 | TBD | AC | TBD | TBD |
----------- | ----------- | ----------- | TBD | TBD | TBD | TBD |
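
The Q-Learning row above lists a Q table as its defining feature. As a rough illustration of what that means, here is a minimal sketch of tabular Q-learning. It is not code from this repository: the 1-D chain environment, the function name `q_learning`, and all hyperparameter values are assumptions made up for the example.

```python
import random


def q_learning(n_states=5, n_actions=2, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    Assumed toy environment: states 0..n_states-1, the last state is the
    goal. Action 1 moves right, any other action moves left (clamped at 0).
    Reaching the goal gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    # The "Q table": one row per state, one column per action.
    q = [[0.0] * n_actions for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                best = max(q[s])
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])

            # Environment step for the assumed chain dynamics.
            s_next = min(s + 1, goal) if a == 1 else max(s - 1, 0)
            r = 1.0 if s_next == goal else 0.0

            # Q-learning update: bootstrap from the best next-state value,
            # except at the terminal state, where there is nothing to bootstrap.
            bootstrap = 0.0 if s_next == goal else gamma * max(q[s_next])
            q[s][a] += alpha * (r + bootstrap - q[s][a])
            s = s_next
    return q


if __name__ == "__main__":
    table = q_learning()
    # Greedy action per non-goal state, read directly from the table.
    print([row.index(max(row)) for row in table[:-1]])
```

The table is indexed by `(state, action)`, which only works when both are small and discrete. DQN (next row) replaces the table with a neural network so that the same update can be applied to large or continuous state spaces.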