TD3_Pendulum-v0

This TD3 implementation is adapted from Aloïs Pourchot's CEM-RL git repository. Launch the algorithm with:
python3 td3_launcher_step_study.py
The default parameters are defined in the argument parser of td3_launcher_step_study.py.

TD3 achieved a mean performance of about -146.4 over 900 episodes of Pendulum-v0 (actor_2.pkl).

The resulting policy has high variance. Tuning TD3 parameters such as the discount value, the noise, or the batch size may reduce it. To reproduce the results, run evaluate_actor.py on an actor obtained with TD3.
Two actors are included in this repository: one scoring about -150 (actor_1.pkl) and one scoring about -146 (actor_2.pkl), both measured with evaluate_actor.py.
Run evaluate_actor.py as follows:
python3 evaluate_actor.py --env Pendulum-v0 --file actor_2
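
A mean performance figure like the ones above is typically the average of per-episode reward sums, and the spread across episodes gives a rough sense of the policy's variance. The following is a minimal sketch of that computation, not the code of evaluate_actor.py: the names `run_episode`, `evaluate`, and `ToyEnv` are hypothetical, and the environment is assumed to follow the older Gym interface used by Pendulum-v0, where `reset()` returns an observation and `step(action)` returns `(obs, reward, done, info)`.

```python
import statistics


def run_episode(env, policy, max_steps=200):
    """Return the undiscounted sum of rewards for one episode.

    max_steps=200 matches the Pendulum-v0 episode length limit.
    """
    obs = env.reset()
    total = 0.0
    for _ in range(max_steps):
        obs, reward, done, _info = env.step(policy(obs))
        total += reward
        if done:
            break
    return total


def evaluate(env, policy, n_episodes=900):
    """Return (mean, standard deviation) of the episode returns."""
    returns = [run_episode(env, policy) for _ in range(n_episodes)]
    mean = statistics.mean(returns)
    std = statistics.stdev(returns) if n_episodes > 1 else 0.0
    return mean, std


class ToyEnv:
    """Hypothetical stand-in environment so the sketch runs without Gym.

    Each step gives reward -|action|; an episode ends after 5 steps.
    """

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        return 0.0, -abs(action), self.t >= 5, {}


if __name__ == "__main__":
    mean, std = evaluate(ToyEnv(), policy=lambda obs: 1.0, n_episodes=10)
    print(f"mean return: {mean:.1f}, std: {std:.1f}")
```

With a real Pendulum-v0 environment and a loaded actor in place of `ToyEnv` and the lambda, reporting the standard deviation next to the mean would make the high variance noted above directly visible.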

These actors were obtained quickly from several short runs of TD3.
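
To make the tuning remark above more concrete, here is a rough sketch of where the discount value and the noise enter the standard TD3 critic target (clipped double-Q with target policy smoothing). It is a scalar illustration under stated assumptions, not this repository's code: the function name `td3_target`, its parameters, and the default values are hypothetical (0.99, 0.2 and 0.5 are common choices in TD3 implementations, and 2.0 is the Pendulum torque limit), so check the argument parser for the values actually used. Exploration noise added to actions during data collection is a separate parameter and is not shown.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               discount=0.99, policy_noise=0.2, noise_clip=0.5,
               max_action=2.0, rng=random):
    """Compute the TD3 target value for a single transition.

    All default values are illustrative assumptions, not the repo's defaults.
    """
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = clip(rng.gauss(0.0, policy_noise), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_state) + noise,
                       -max_action, max_action)
    # Clipped double-Q: take the smaller of the two target critics.
    q_next = min(target_q1(next_state, next_action),
                 target_q2(next_state, next_action))
    # No bootstrapping from terminal transitions.
    not_done = 0.0 if done else 1.0
    return reward + discount * not_done * q_next


if __name__ == "__main__":
    # Toy target networks (hypothetical) that ignore their inputs.
    y = td3_target(
        reward=-1.0, done=False, next_state=0.0,
        target_actor=lambda s: 1.0,
        target_q1=lambda s, a: 10.0,
        target_q2=lambda s, a: 8.0,
        discount=0.5,
    )
    print(y)
```

A lower discount shrinks the bootstrapped term, and a larger `policy_noise` smooths the target over a wider range of actions. Both change how noisy the learned value estimates are, which is one plausible route by which tuning could affect the variance of the final policy.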