Baseline RL algorithms for OpenAI gym

OpenAI gym의 환경들을 이용해서 Spinning up의 Key Papers in Deep RL에 나오는 논문들을 최대한 많이 구현해보는 것이 목표인 프로젝트입니다. 또한 최대한 한 알고리즘이 다양한 환경에 적용될 수 있도록 실험자를 위한 프레임워크를 구축하는 것 또한 목표입니다.

또한 OpenAI의 multiprocessing_env를 활용하여 학습하고 여러 개 렌더링을 동시에 할 수 있습니다.(n_worker 파라미터 활용)

BipedalWalker-v2	CartPole-v0	Pendulum-v0

Acrobot-v1	LunarLander-v2

Getting started

Prerequisites

저의 실험환경. 그 외에는 돌아갈지 알 수 없음..

Ubuntu 18.04 : 실제로 저는 linux mint 19.1에서 했지만 우분투에서도 잘 돌아감을 확인했습니다.
Python 3.6.7

Package

sudo apt-get install python3-setuptools python-dev python3-dev python3-opencv swig ffmpeg build-essential libssl-dev libffi-dev libxml2-dev libxslt-dev libxslt1-dev zlib1g-dev qtbase5-dev libqt5opengl5-dev libassimp-dev cmake patchelf
python3 -m pip install virtualenv

Repository

git clone https://github.com/mclearning2/Baseline_RL.git

Python Libraries

python3 -m virtualenv .env --no-site-packages
source .env/bin/activate
python3 -m pip install -r requirements.txt

Wandb Init

Please refer to https://docs.wandb.com/docs/started.html

wandb init
wandb login

Roboschool

sudo apt-get install qtbase5 libqt5opengl5-dev libassimp-dev patchelf cmake
git clone https://github.com/openai/roboschool && cd roboschool
./install_boost.sh
./install_bullet.sh
source exports.sh
export CPLUS_INCLUDE_PATH=/usr/include/python3.6
pip3 install wheel
cd roboschool/cpp-household && make clean && make -j4 && cd ../.. && pip3 install -e .

Train

python3 main.py

Test

python3 main.py --test

Argument Parsing

Argument	Default	Description
--project	None	Restore from wandb by this project. and if None, this is replaced to the file name in {projects} folder you selected
--seed	1	Seed for reproduction
--test	-	If this is included, agent starts test not train
--restore	-	If this is included, hyperparameters from `./{report_dir}/model/{project}/hyperparams.pkl` is loaded and replaced by this in `hyperparams`
--render	-	If this is included, agent starts render
--record	-	If this is included, Interaction is recorded in `./{report_dir}/videos/{project}/`

Environment

Classic control

Box2D

Atari, MuJoCo, Roboschool

Comming soon...

Algorithms

Model-Free RL

(DQN) Playing Atari with Deep Reinforcement Learning, Mnih et al, 2013.
(Deep Recurrent Q-Learning) Deep Recurrent Q-Learning for Partially Observable MDPs, Hausknecht and Stone, 2015.
(Dueling DQN) Dueling Network Architectures for Deep Reinforcement Learning, Wang et al, 2015.
(Double DQN) Deep Reinforcement Learning with Double Q-learning, Hasselt et al 2015.
(PER) Prioritized Experience Replay, Schaul et al, 2015.
(Rainbow DQN) Rainbow: Combining Improvements in Deep Reinforcement Learning, Hessel et al, 2017.

Policy Gradients

Deterministic Policy Gradients

(DPG) Deterministic Policy Gradient Algorithms, Silver et al, 2014.
(DDPG) Continuous Control With Deep Reinforcement Learning, Lillicrap et al, 2015.
(TD3) Addressing Function Approximation Error in Actor-Critic Methods, Fujimoto et al, 2018.

Name		Name	Last commit message	Last commit date
Latest commit History 53 Commits
algorithms		algorithms
common		common
environments		environments
images		images
projects		projects
.gitignore		.gitignore
README.md		README.md
main.py		main.py
mypy.ini		mypy.ini
requirements.txt		requirements.txt
test.sh		test.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Baseline RL algorithms for OpenAI gym

Getting started

Prerequisites

Package

Repository

Python Libraries

Wandb Init

Roboschool

Train

Test

Argument Parsing

Environment

Classic control

Box2D

Atari, MuJoCo, Roboschool

Algorithms

Model-Free RL

Policy Gradients

Deterministic Policy Gradients

About

Releases

Packages

Languages

mclearning2/Baseline_RL

Folders and files

Latest commit

History

Repository files navigation

Baseline RL algorithms for OpenAI gym

Getting started

Prerequisites

Package

Repository

Python Libraries

Wandb Init

Roboschool

Train

Test

Argument Parsing

Environment

Classic control

Box2D

Atari, MuJoCo, Roboschool

Algorithms

Model-Free RL

Policy Gradients

Deterministic Policy Gradients

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages