Soft-Actor-Critic Deep RL for Simulated Humanoid Walking

This is an implementation of the reinforcement learning algorithm Soft Actor-Critic (SAC), as presented in the papers "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor" and "Soft Actor-Critic Algorithms and Applications". It was developed as the project for the Reinforcement Learning module of the Machine Learning course of the Master in Artificial Intelligence and Robotics (A.Y. 2019/20).

The implementation is written in Python 3 with the PyTorch library and has been tested on the OpenAI Gym environments MountainCarContinuous and Humanoid.

Prerequisites

Python 3, PyTorch, and Gym are required. I suggest installing them via Anaconda on an Ubuntu machine. MuJoCo is also needed to run the Gym environments that depend on it, such as Humanoid.

[Optional] Create an Anaconda environment (e.g. `conda create --name gym python=3.7`, then `conda activate gym`).
Install MuJoCo.
1. Install MuJoCo prerequisites.
```
sudo apt-get update -y
sudo apt-get install -y libgl1-mesa-dev libgl1-mesa-glx libglew-dev libosmesa6-dev \
                        software-properties-common net-tools patchelf
```
2. Download MuJoCo for Linux (a license is required; you can get a 30-day trial or a student license for free).
3. Unzip the downloaded file, create the directory ~/.mujoco/, and place both the unzipped folder and your license key (the mjkey.txt file from your email) in it.
4. Add the following lines to ~/.bashrc, replacing <username> with your username:
```
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/<username>/.mujoco/mujoco200/bin
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so
```
5. Install mujoco-py, the Python bindings for MuJoCo:
```
pip3 install -U 'mujoco-py<2.1,>=2.0'
```
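
Steps 3 and 4 are easy to get slightly wrong, so a quick sanity check of the directory layout can save time before importing mujoco-py. The following is a minimal sketch, not part of this repository: `check_mujoco_setup` is a hypothetical helper that only checks the paths named in the steps above (`~/.mujoco/mujoco200/bin`, `~/.mujoco/mjkey.txt`) and the `LD_LIBRARY_PATH` entry. It does not verify that the license is valid or that mujoco-py compiles.

```python
import os
from pathlib import Path


def check_mujoco_setup(home=None, environ=None):
    """Return a list of problems found; an empty list means the layout looks right.

    Hypothetical helper: it only inspects the paths mentioned in the
    installation steps, nothing else.
    """
    home = Path(home) if home is not None else Path.home()
    environ = os.environ if environ is None else environ
    problems = []

    mujoco_dir = home / ".mujoco"
    bin_dir = mujoco_dir / "mujoco200" / "bin"
    key_file = mujoco_dir / "mjkey.txt"

    if not bin_dir.is_dir():
        problems.append(f"missing directory: {bin_dir}")
    if not key_file.is_file():
        problems.append(f"missing license key: {key_file}")

    # LD_LIBRARY_PATH is a colon-separated list; compare normalized entries.
    raw_entries = environ.get("LD_LIBRARY_PATH", "").split(":")
    entries = [os.path.normpath(p) for p in raw_entries if p]
    if os.path.normpath(str(bin_dir)) not in entries:
        problems.append(f"LD_LIBRARY_PATH does not contain {bin_dir}")

    return problems


if __name__ == "__main__":
    issues = check_mujoco_setup()
    print("MuJoCo layout looks OK" if not issues else "\n".join(issues))
```

Run it in a new terminal (so that the updated ~/.bashrc has been sourced); any line it prints points at the step to revisit.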
Install Gym.

pip3 install gym[box2d]

Install ffmpeg, which is needed for rendering the environments.

sudo apt install ffmpeg

Install PyTorch.
Install additional Python3 packages.

pip3 install numpy matplotlib seaborn texttable

### Running the tests

Run `python3 main.py --test --render --plot` to run a random agent on the selected environment. The environment should be rendered, an online plot of the return per episode should be shown, and a per-episode summary should be printed on the terminal.
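
For readers unfamiliar with what a random-agent test does, here is a minimal sketch of such a loop. It is not the code in this repository. It assumes the classic Gym interface contemporary with this project (`reset()` returns an observation, `step(action)` returns `(observation, reward, done, info)`, and actions are drawn with `action_space.sample()`). `ToyEnv` and `run_random_agent` are hypothetical names, and `ToyEnv` is a stand-in so the example runs without Gym or MuJoCo installed.

```python
import random


class _ToyActionSpace:
    """Hypothetical stand-in for a Gym action space with a sample() method."""

    def sample(self):
        return random.uniform(-1.0, 1.0)


class ToyEnv:
    """Hypothetical stand-in for a Gym environment (classic reset/step interface)."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.action_space = _ToyActionSpace()
        self._t = 0

    def reset(self):
        self._t = 0
        return 0.0  # initial observation

    def step(self, action):
        self._t += 1
        reward = -abs(action)  # in [-1, 0]
        done = self._t >= self.horizon
        return float(self._t), reward, done, {}


def run_random_agent(env, episodes=3):
    """Roll out uniformly sampled actions and return the list of episode returns."""
    returns = []
    for episode in range(episodes):
        env.reset()
        done = False
        episode_return = 0.0
        steps = 0
        while not done:
            action = env.action_space.sample()  # no policy: purely random
            _, reward, done, _ = env.step(action)
            episode_return += reward
            steps += 1
        returns.append(episode_return)
        # Per-episode summary, similar in spirit to what the test prints.
        print(f"episode {episode}: steps={steps} return={episode_return:.3f}")
    return returns


if __name__ == "__main__":
    run_random_agent(ToyEnv(horizon=5), episodes=3)
```

With Gym installed, an environment created by `gym.make` could be passed in place of `ToyEnv`, provided the installed Gym version still uses the four-value `step` return assumed here.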

Usage

Run `python3 main.py --help` to see the list of command-line options and their meaning.
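
As an illustration of how boolean flags such as `--test`, `--render` and `--plot` are typically wired up with `argparse`, here is a hedged sketch. It is an assumption about the general pattern, not a copy of `main.py`: the real script may define these options differently and has further options not shown here, so `--help` remains the authoritative reference. `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Hypothetical parser covering only the three flags mentioned in this README."""
    parser = argparse.ArgumentParser(
        description="Sketch of a command line with on/off switches"
    )
    # store_true flags default to False and become True when present.
    parser.add_argument("--test", action="store_true",
                        help="run a random agent on the selected environment")
    parser.add_argument("--render", action="store_true",
                        help="render the environment")
    parser.add_argument("--plot", action="store_true",
                        help="show an online plot of the return per episode")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--test", "--render"])
print(args.test, args.render, args.plot)  # prints: True True False
```

Because each flag is an independent switch, any combination can be given, and omitting a flag simply leaves the corresponding behavior off.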

Resources used

Authors

Andrea Caciolai

