==================================
https://github.com/openai/spinningup
$ cd DRL
$ pip install -e .
$ pip install ~/carla/PythonClient (optional)
$ pip install opencv-python
==================================
references:
sac is based on:
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
https://arxiv.org/abs/1801.01290
sac1 is added based on:
Soft Actor-Critic Algorithms and Applications
https://arxiv.org/abs/1812.05905
==================================
ddpg vs sac1
- gym env 'Pendulum-v0':(Minimum_Episode_Return)
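
As a rough illustration of the metric named above, the sketch below assumes that Minimum_Episode_Return means the smallest total reward among the episodes collected in one epoch (similar to the MinEpRet column logged by upstream Spinning Up). That reading is an assumption, the helper name `min_episode_return` is hypothetical, and the returns are made-up numbers, not results from this repo.

```python
# Hypothetical per-episode returns for one epoch (illustrative values only,
# not measured results from this repository).
episode_returns = [-1210.4, -980.7, -1502.3, -875.0]


def min_episode_return(returns):
    """Return the worst (smallest) total reward among the given episodes."""
    if not returns:
        raise ValueError("need at least one episode return")
    return min(returns)


print(min_episode_return(episode_returns))  # -1502.3
```

Tracking the minimum rather than the average highlights the worst episode in each epoch, so a rising curve suggests the policy has stopped failing badly, not just that it does well on average.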
sqn experiments on gym env 'LunarLander-v2':
Try the trained model on env 'Breakout-ram-v4':
$ python -m spinup.run test_policy ./saved_models/Breakout-ram-v4 -d -l 20000
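
In upstream Spinning Up, `-d` asks the policy for deterministic actions and `-l` caps the episode length (20000 steps here). Assuming this repo keeps those meanings, the effect of the two flags can be sketched as follows. `ToyEnv`, `policy`, and `run_episode` are hypothetical stand-ins written for this example, not part of the repo or of Spinning Up.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a gym environment (not part of this repo)."""

    def __init__(self, horizon=50):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= self.horizon
        return float(self.t), reward, done, {}


def policy(obs, deterministic):
    # Deterministic mode (like -d) always picks the preferred action;
    # stochastic mode samples an action at random.
    if deterministic:
        return 1
    return random.choice([0, 1])


def run_episode(env, max_ep_len, deterministic):
    """Roll out one episode, stopping at `done` or after max_ep_len steps."""
    obs, ep_ret, ep_len, done = env.reset(), 0.0, 0, False
    while not done and ep_len < max_ep_len:
        obs, reward, done, _ = env.step(policy(obs, deterministic))
        ep_ret += reward
        ep_len += 1
    return ep_ret, ep_len


# The length cap (like -l) ends the episode before the environment does.
print(run_episode(ToyEnv(horizon=50), max_ep_len=20, deterministic=True))  # (20.0, 20)
```

A large cap such as 20000 mostly matters for environments whose episodes can run very long; with a shorter episode the environment's own `done` signal ends the rollout first.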
More experiments:
- https://mp.weixin.qq.com/s/-ZWj-uw5wWWhGy3B08Xk3Q (sqn)
- https://mp.weixin.qq.com/s/8vgLGcpsWkF89ma7T2twRA ('BipedalWalkerHardcore-v2')
Learning Latent Dynamics for Planning from Pixels
InfoBot: Transfer and Exploration via the Information Bottleneck
- code: to be released.
Unsupervised Meta-Learning for Reinforcement Learning
Diversity is All You Need: Learning Skills without a Reward Function (DIAYN)
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks (MAML)