Slither-DRL

In this repository, we implement several well-known deep reinforcement learning (DRL) algorithms for slither.io:

Policy Gradient (PG)
Deep-Q Network (DQN)
Actor-Critic (AC)
Advantage Actor-Critic (A2C)

Installation Instructions :

Install Docker for Ubuntu 16.04. Make sure to do step 2 as well.
Install Conda for Ubuntu 16.04
Create a Conda env

conda create --name slither python=3.5

Activate the Conda env

source activate slither

Install needed packages

sudo apt-get update
sudo apt-get install -y tmux htop cmake golang libjpeg-dev libgtk2.0-0 ffmpeg

Install PyTorch 1.1.0 for CUDA 10.0

pip3 install https://download.pytorch.org/whl/cu100/torch-1.1.0-cp35-cp35m-linux_x86_64.whl
pip3 install https://download.pytorch.org/whl/cu100/torchvision-0.3.0-cp35-cp35m-linux_x86_64.whl

Install the dependencies required by universe

pip install numpy
pip install gym==0.9.5

Install universe

git clone https://github.com/openai/universe.git
cd universe
pip install -e .

Install this repository

cd ..
git clone https://github.com/JuiHsiu/Slither-DRL.git
cd Slither-DRL
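
After the steps above, a quick check that the packages are visible from the active Conda env can save a failed first run. The following is a minimal sketch, not part of this repository: the helper name `check_install` and the package list are assumptions based on the installation steps, and it only checks that each package can be located, not that its version is correct.

```python
import importlib.util
import sys

# Packages the installation steps above are expected to provide
# (list assumed from this README, not read from the repository).
REQUIRED = ["numpy", "gym", "universe", "torch", "torchvision"]


def check_install(packages=REQUIRED):
    """Return {package_name: bool}, True if the package can be located."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in packages
    }


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, found in sorted(check_install().items()):
        print("{:12s} {}".format(name, "ok" if found else "MISSING"))
```

If any line reports MISSING, re-run the matching install step with the `slither` env activated.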

How to Run :

Training the agent

python main.py --train_[pg|dqn|ac|a2c]

For DQN, some optional improvements can be enabled:

python main.py --train_dqn [--dueling_dqn] [--prioritized_dqn]
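
For readers unfamiliar with these two options, the sketch below shows the standard formulations they are usually named after: the dueling aggregation Q(s, a) = V(s) + A(s, a) - mean(A), and proportional prioritized replay sampling. It is an illustration in plain Python under the assumption that the flags follow these textbook definitions. The function names and the `alpha` and `eps` values are hypothetical and are not taken from this repository's code.

```python
def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]


def priority_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization.

    Each transition gets priority p_i = |td_error_i| + eps and is sampled
    with probability p_i**alpha / sum_k p_k**alpha.
    alpha and eps are illustrative values, not the repository's settings.
    """
    priorities = [(abs(err) + eps) ** alpha for err in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


if __name__ == "__main__":
    print(dueling_q_values(1.0, [1.0, 2.0, 3.0]))
    print(priority_probabilities([0.1, 1.0, 2.0]))
```

Subtracting the mean advantage keeps the value and advantage streams identifiable. With `alpha = 0` the sampling becomes uniform, which recovers ordinary experience replay.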

Testing the agent

python main.py --test_[pg|dqn|ac|a2c]

If you want to watch your agent play the game, add --do_render:

python main.py --train_[pg|dqn|ac|a2c] --do_render

By default, testing the agent records the run as a video. You can set the directory where the video is saved with:

python main.py --test_[pg|dqn|ac|a2c] --video_dir [path_to_save_video]

Advanced Arguments :

Number of Environments

You can create more than one environment at the same time. However, you need to modify the code to perform batch learning.

python main.py --train_[pg|dqn|ac|a2c] --remotes [#_of_env]
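
As a starting point for the batch-learning change mentioned above, the sketch below groups the per-environment results of one step. It assumes the universe convention that a step returns one observation, reward, and done flag per remote, and that an observation can be `None` while a remote is not ready. The helper name `batch_step_results` is hypothetical and does not exist in this repository.

```python
def batch_step_results(observation_n, reward_n, done_n):
    """Group one step's results, keeping only remotes that returned a frame.

    Each argument is a list with one entry per remote environment.
    Remotes whose observation is None are skipped for this step.
    """
    ready = [i for i, obs in enumerate(observation_n) if obs is not None]
    return {
        "indices": ready,  # which remotes contributed to this batch
        "observations": [observation_n[i] for i in ready],
        "rewards": [reward_n[i] for i in ready],
        "dones": [done_n[i] for i in ready],
    }


if __name__ == "__main__":
    # Three remotes; the second one has not produced a frame yet.
    batch = batch_step_results(["frame0", None, "frame2"],
                               [1.0, 0.0, 2.0],
                               [False, False, True])
    print(batch["indices"], batch["rewards"])
```

The returned lists can then be stacked into one tensor per field before the forward pass, so that a single update uses experience from every ready remote.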

Action Space

The action space of our agent consists of 12 different mouse positions. If you want the agent to also be able to accelerate, set action_space to 24.

python main.py --train_[pg|dqn|ac|a2c] --action_space [12|24]
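
One way to picture this action space is as 12 evenly spaced mouse directions around the snake, with the optional second set of 12 repeating the same directions while the speed boost is held. The sketch below follows that reading, which is an assumption about the 24-action layout. The screen center, the radius, and the function name `action_to_pointer` are hypothetical values chosen for illustration, not the ones used in this repository.

```python
import math

# Hypothetical screen geometry; the real values depend on the game window.
CENTER_X, CENTER_Y = 270, 235
RADIUS = 30
NUM_DIRECTIONS = 12


def action_to_pointer(action, action_space=12):
    """Map a discrete action index to (x, y, accelerate).

    Actions 0..11 pick one of 12 directions spaced 30 degrees apart.
    With action_space=24, actions 12..23 are assumed to reuse the same
    directions with acceleration switched on.
    """
    if action_space not in (12, 24):
        raise ValueError("action_space must be 12 or 24")
    if not 0 <= action < action_space:
        raise ValueError("action out of range")
    direction = action % NUM_DIRECTIONS
    accelerate = action >= NUM_DIRECTIONS
    angle = 2 * math.pi * direction / NUM_DIRECTIONS
    x = CENTER_X + int(round(RADIUS * math.cos(angle)))
    y = CENTER_Y + int(round(RADIUS * math.sin(angle)))
    return x, y, accelerate


if __name__ == "__main__":
    for a in (0, 3, 12):
        print(a, action_to_pointer(a, action_space=24))
```

In a universe environment, the resulting coordinates would typically be sent as a pointer event, with the mouse button held down when `accelerate` is true.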

Demo :

Our best model is A2C, and you can see the pre-trained agent playing the game as follows:
