Snake-AI

This project uses deep reinforcement learning (DRL) to play the game Snake automatically. The core method is PPO for discrete action spaces, which performs well there just as it does in continuous action spaces. About half an hour of training is enough to produce a working snake agent.
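
For readers unfamiliar with PPO on discrete actions, the following is a minimal sketch of the clipped surrogate loss, written in plain Python for illustration only. It is not taken from this repository's train.py; the function names, the four-action layout, and the clip_eps value of 0.2 are all assumptions.

```python
import math


def softmax(logits):
    """Turn raw action scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ppo_clip_loss(new_logits, old_log_probs, actions, advantages, clip_eps=0.2):
    """Mean clipped surrogate loss (to be minimised) over a batch.

    new_logits:    per-step action scores from the current policy
    old_log_probs: log-probability of each taken action under the old policy
    actions:       index of the action taken at each step
    advantages:    advantage estimate for each step
    clip_eps:      clipping range (hypothetical default, not from the repo)
    """
    total = 0.0
    for logits, old_lp, action, adv in zip(new_logits, old_log_probs, actions, advantages):
        new_lp = math.log(softmax(logits)[action])
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped objectives.
        total += min(ratio * adv, clipped * adv)
    return -total / len(actions)


if __name__ == "__main__":
    # One step, four actions (assumed: up, down, left, right), uniform old policy.
    loss = ppo_clip_loss([[0.0, 0.0, 0.0, 0.0]], [math.log(0.25)], [1], [2.0])
    print(loss)
```

When the new and old policies agree, the ratio is 1 and the loss reduces to the negative mean advantage; the clipping only matters once the policy starts to move away from the one that collected the data.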

Requirements

conda create -n ppo --yes --file conda.txt
conda activate ppo
pip install -r requirements.txt

Usage

Train

python train.py # after training, the training curve of the current round is shown automatically
python snake.py # evaluate latest saved model

Evaluate a specific model

python evaluate.py --weight ./model/act-weight_round3_472_82.5.pkl

Plot a specific reward log

python plotter.py --history ./logs/reward_round3_82.5.csv

Experiments

Round	1	2	3
Training curve	(figure not shown)
Evaluation	(figure not shown)
Reward_eat	+2.0	+2.0	+2.0
Reward_hit	-0.5	-1.0	-1.5
Reward_bit	-0.8	-1.5	-2.0
Avg record	≈19	≈23	≈28
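
The three reward settings in the table can be written down as a small configuration, sketched below. This is an illustration, not the repository's code: the class and function names are hypothetical, and it assumes that Reward_hit applies when the snake hits a wall, that Reward_bit applies when it bites its own body, and that an ordinary move earns zero reward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardConfig:
    eat: float  # Reward_eat: reward for eating food
    hit: float  # Reward_hit: assumed to be the penalty for hitting a wall
    bit: float  # Reward_bit: assumed to be the penalty for biting the body


# Values copied from the experiment table, keyed by round number.
ROUNDS = {
    1: RewardConfig(eat=2.0, hit=-0.5, bit=-0.8),
    2: RewardConfig(eat=2.0, hit=-1.0, bit=-1.5),
    3: RewardConfig(eat=2.0, hit=-1.5, bit=-2.0),
}


def step_reward(event, cfg):
    """Return the reward for one step.

    event is one of "eat", "hit", "bit"; anything else (for example an
    ordinary move) is assumed to earn 0.0.
    """
    table = {"eat": cfg.eat, "hit": cfg.hit, "bit": cfg.bit}
    return table.get(event, 0.0)


if __name__ == "__main__":
    for round_id, cfg in ROUNDS.items():
        print(round_id, step_reward("eat", cfg), step_reward("hit", cfg), step_reward("bit", cfg))
```

Keeping the per-round values in one place makes it easy to see that only the two death penalties change between rounds while the food reward stays fixed.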

Conclusions

Increasing the death penalty leads to higher average records.
With a low death penalty, the training reward curve stays low, but the agent still performs well in the demo.
A particularly high reward for eating food can bring quick early success at the expense of long-term safety.

Future work

The training time is too short to show the advantages of DRL over a non-DRL method (Snaqe).
The snake's zigzag movement looks ugly; try adding a penalty to the reward for excessive zigzagging.
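
One possible way to penalise zigzagging is to count direction changes over the most recent moves and subtract a small amount for each turn beyond an allowance. The sketch below is only an idea for this future work; the function names and the window, free_turns, and penalty_per_turn values are hypothetical and have not been tuned or tested in training.

```python
def count_turns(directions):
    """Count how many times consecutive moves differ in direction."""
    return sum(1 for prev, cur in zip(directions, directions[1:]) if prev != cur)


def zigzag_penalty(directions, window=8, free_turns=2, penalty_per_turn=0.05):
    """Return a non-positive reward term for turning too often.

    directions:       move history, e.g. ["U", "U", "R", ...]
    window:           how many recent moves to inspect (hypothetical value)
    free_turns:       turns allowed inside the window without penalty
    penalty_per_turn: amount subtracted for each turn beyond the allowance
    """
    recent = directions[-window:]
    excess = max(0, count_turns(recent) - free_turns)
    return -penalty_per_turn * excess


if __name__ == "__main__":
    straight = ["U", "U", "U", "R", "R", "R"]
    zigzag = ["U", "R", "U", "R", "U"]
    print(zigzag_penalty(straight), zigzag_penalty(zigzag))
```

The penalty is kept small relative to the food reward so that it shapes the path without discouraging the turns the snake needs to reach food or avoid its body.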
