# PPO PyTorch C++

This is an implementation of the Proximal Policy Optimization (PPO) algorithm for the C++ API of PyTorch (libtorch). It uses a simple TestEnvironment to test the algorithm. Below is a small visualization of the environment the algorithm is tested in.

Fig. 1: The agent in testing mode.

## Build

You first need to install PyTorch. For a clean installation from Anaconda, check out this short tutorial, or this tutorial to install only the binaries.

Do

```shell
mkdir build
cd build
cmake -DCMAKE_PREFIX_PATH=/absolute/path/to/libtorch ..
make
```
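The `CMAKE_PREFIX_PATH` flag tells CMake where to find libtorch. If you are setting up a similar project from scratch, a minimal `CMakeLists.txt` looks roughly like this (the target and source file names here are assumptions for illustration, not the repository's actual files):

```cmake
cmake_minimum_required(VERSION 3.10)
project(ppo_pytorch_cpp)

# Resolved via -DCMAKE_PREFIX_PATH=/absolute/path/to/libtorch
find_package(Torch REQUIRED)

add_executable(train_ppo src/train_ppo.cpp)  # source path is illustrative
target_link_libraries(train_ppo "${TORCH_LIBRARIES}")
set_property(TARGET train_ppo PROPERTY CXX_STANDARD 17)
```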

## Run

Run the executable with

```shell
cd build
./train_ppo
```

To plot the results, run

```shell
cd ..
python plot.py --online_view --csv_file data/data.csv --epochs 1 10
```

It should produce something like the figure shown below.

Fig. 2: From left to right, the agent over successive training epochs as it takes actions in the environment to reach the goal.

Once trained, the algorithm can also be used in test mode. To do so, run

```shell
cd build
./test_ppo
```

To plot the results, run

```shell
cd ..
python plot.py --online_view --csv_file data/data_test.csv --epochs 1
```

## Visualization

The results are saved to data/data.csv and can be visualized by running python plot.py. Run

```shell
python plot.py --help
```

for help.