Release v4.0.1: Soft Actor-Critic · kengz/SLM-Lab

This release adds a new algorithm: Soft Actor-Critic (SAC).

Soft Actor-Critic

- implement the original paper: "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor" https://arxiv.org/abs/1801.01290 #398
- implement the improvements from the follow-up SAC paper: "Soft Actor-Critic Algorithms and Applications" https://arxiv.org/abs/1812.05905 #399
- extend SAC to work directly in discrete environments using a custom GumbelSoftmax distribution
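
For readers unfamiliar with the Gumbel-Softmax trick mentioned in the last item, the following is a minimal, standard-library-only sketch of the general idea. It is not the SLM-Lab implementation: the function names `gumbel_softmax_sample` and `to_discrete_action` and the `temperature` parameter are illustrative assumptions. It perturbs the action logits with Gumbel noise and applies a temperature-scaled softmax, which yields a relaxed, nearly one-hot sample. That relaxation is what would let a SAC-style policy with reparameterized sampling be applied to discrete actions.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample over len(logits) categories.

    Hypothetical helper for illustration; not taken from SLM-Lab.
    """
    eps = 1e-20  # guards against log(0)
    noisy = []
    for logit in logits:
        u = rng.random()  # u ~ Uniform[0, 1)
        # Gumbel(0, 1) noise via inverse transform sampling
        g = -math.log(-math.log(u + eps) + eps)
        noisy.append((logit + g) / temperature)
    # numerically stable softmax over the perturbed, scaled logits
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def to_discrete_action(relaxed_sample):
    """Pick the index with the largest weight as the environment action."""
    return max(range(len(relaxed_sample)), key=relaxed_sample.__getitem__)


if __name__ == "__main__":
    sample = gumbel_softmax_sample([1.0, 2.0, 0.5], temperature=0.5)
    print(sample, to_discrete_action(sample))
```

Lower temperatures push the sample closer to a hard one-hot vector, while higher temperatures make it smoother. In a PyTorch setting the same computation would be done on tensors so that gradients can flow through the sample.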