Proximal Policy Optimization (PPO) with Intrinsic Curiosity Module (ICM)
Updated Jan 12, 2019 - Python
A collection of Reinforcement Learning implementations with PyTorch
Phasic-Policy-Gradient
An implementation of Proximal Policy Optimization (PPO), a state-of-the-art family of reinforcement learning algorithms, using normalized Generalized Advantage Estimation (GAE) and optional batch-mode training. The loss function includes an entropy bonus.
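As a rough illustration of the pieces named in that description, here is a minimal pure-Python sketch of GAE, advantage normalization, and a per-sample clipped PPO objective with an entropy bonus. It is not taken from the listed repository: the function names (`compute_gae`, `normalize`, `ppo_loss`) and the default hyperparameters (`gamma`, `lam`, `clip_eps`, `entropy_coef`) are assumptions chosen for the example.

```python
import math


def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one rollout.

    rewards, values, dones are equal-length lists; last_value is the
    value estimate for the state following the final step.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    # Walk backwards so each step can reuse the accumulated estimate.
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error; bootstrapping is cut at episode ends.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages


def normalize(xs, eps=1e-8):
    """Shift to zero mean and scale to (roughly) unit standard deviation."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    return [(x - mean) / (std + eps) for x in xs]


def ppo_loss(ratio, advantage, entropy, clip_eps=0.2, entropy_coef=0.01):
    """Clipped surrogate loss for one sample, minus an entropy bonus.

    ratio is pi_new(a|s) / pi_old(a|s); the value/critic term is omitted.
    """
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    surrogate = min(ratio * advantage, clipped_ratio * advantage)
    # Negate because optimizers minimize; entropy is rewarded.
    return -surrogate - entropy_coef * entropy


if __name__ == "__main__":
    adv = compute_gae([1.0, 1.0, 1.0], [0.0, 0.0, 0.0],
                      [False, False, True], last_value=0.0)
    print(normalize(adv))
```

In a real training loop these operations would typically run on batched tensors (for example in PyTorch) and the loss would be averaged over a minibatch; the scalar version here is only meant to make the arithmetic visible.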