Training agents in OpenAI Gym with Policy-Gradient methods

Training plots

Training Policy Gradient on the CartPole-v1 environment.

Training Policy Gradient on the LunarLander-v2 environment.
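The training code behind these plots is not reproduced here. As a rough sketch only, assuming the policy-gradient agent follows the Monte Carlo REINFORCE update described in the references, the per-episode quantities could be computed as below. The function names, the `gamma` default, and the use of plain Python floats are illustrative assumptions, not the repository's actual implementation.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, returns):
    """Negative of sum_t log pi(a_t | s_t) * G_t.

    Minimizing this value corresponds to gradient ascent on the
    REINFORCE objective. Here log_probs are plain floats; in a real
    training loop they would be differentiable tensors.
    """
    return -sum(lp * g for lp, g in zip(log_probs, returns))
```

In practice the returns are often standardized (mean subtracted, divided by the standard deviation) before the loss is formed, which tends to reduce gradient variance; whether this repository does so is not stated.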

Actor Critic plots for the LunarLander-v2 environment.
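For orientation, here is a minimal sketch of the one-step quantities an actor-critic update typically uses, assuming a TD(0) advantage estimate. The names `td_advantage` and `actor_critic_losses` and the `gamma` default are hypothetical; the repository's own actor-critic code may use a different estimator (for example n-step returns).

```python
def td_advantage(reward, value, next_value, done, gamma=0.99):
    """One-step advantage estimate: r + gamma * V(s') - V(s).

    The bootstrap term is dropped when the episode has terminated.
    """
    target = reward + (0.0 if done else gamma * next_value)
    return target - value


def actor_critic_losses(log_prob, advantage):
    """Return (actor_loss, critic_loss) for a single transition.

    The actor is pushed toward actions with positive advantage; the
    critic minimizes the squared TD error. In a real training loop the
    advantage would be detached from the graph for the actor term.
    """
    actor_loss = -log_prob * advantage
    critic_loss = advantage ** 2
    return actor_loss, critic_loss
```

Compared with the Monte Carlo return, the bootstrapped target trades some bias for lower variance, which is the usual motivation for adding a critic.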

Architectures

References

Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction. The MIT Press.
Graesser, L., & Keng, W. L. (2019). Foundations of Deep Reinforcement Learning: Theory and Practice in Python. Addison-Wesley Professional.
Yoon, C. (2018, December 30). Deriving Policy Gradients and Implementing REINFORCE.
Silver, D. (2015, December 21). RL Course by David Silver - Lecture 7: Policy Gradient Methods.
