Snake-AI


This project uses deep reinforcement learning (DRL) to play the Snake game automatically. The core DRL method is PPO for discrete action spaces, which performs as well there as it does in continuous action spaces. About half an hour of training is enough for the snake agent to play effectively.
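To make the idea of PPO over a discrete action space concrete, here is a minimal sketch of the clipped surrogate loss in plain Python. It is not taken from this repository's `train.py`; the function names, the `clip_eps` value of 0.2, and the per-sample list layout are all illustrative assumptions.

```python
import math


def softmax(logits):
    """Turn raw action scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ppo_clip_loss(new_logits, old_log_probs, actions, advantages, clip_eps=0.2):
    """Clipped PPO surrogate loss (to be minimized) over a batch.

    new_logits:    per-sample action scores from the current policy
    old_log_probs: log-probability of each taken action under the old policy
    actions:       index of the action taken in each sample
    advantages:    advantage estimate for each sample
    clip_eps:      clipping range (0.2 is a common choice, assumed here)
    """
    losses = []
    for logits, old_lp, action, adv in zip(new_logits, old_log_probs, actions, advantages):
        new_lp = math.log(softmax(logits)[action])
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped objectives.
        losses.append(-min(ratio * adv, clipped * adv))
    return sum(losses) / len(losses)
```

The clipping keeps each policy update close to the policy that collected the data, which is what lets PPO reuse a batch for several gradient steps.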

Requirements

conda create -n ppo --yes --file conda.txt
conda activate ppo
pip install -r requirements.txt

Usage

Train

python train.py # after training, the training curve of the current round is shown automatically
python snake.py # evaluate the latest saved model

Evaluate a specified model

python evaluate.py --weight ./model/act-weight_round3_472_82.5.pkl

Plot a specified reward log

python plotter.py --history ./logs/reward_round3_82.5.csv

Experiments

| Round | 1 | 2 | 3 |
| --- | --- | --- | --- |
| Training curve | round1 | round2 | round3 |
| Evaluation | round1 | round2 | round3 |
| Reward_eat | +2.0 | +2.0 | +2.0 |
| Reward_hit | -0.5 | -1.0 | -1.5 |
| Reward_bit | -0.8 | -1.5 | -2.0 |
| Avg record | ≈19 | ≈23 | ≈28 |
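The reward settings above can be written down as a small lookup, sketched below. The numeric values come from the table; everything else is an assumption for illustration: the event names, the reading of `hit` as hitting a wall and `bit` as the snake biting itself, and the 0.0 reward for an ordinary step. The repository's own code may structure this differently.

```python
# Reward values per round, copied from the experiments table.
# Keys "eat", "hit", "bit" are hypothetical event names.
REWARDS = {
    1: {"eat": 2.0, "hit": -0.5, "bit": -0.8},
    2: {"eat": 2.0, "hit": -1.0, "bit": -1.5},
    3: {"eat": 2.0, "hit": -1.5, "bit": -2.0},
}


def step_reward(round_id, event=None):
    """Return the reward for one step.

    event is "eat", "hit", "bit", or None for an ordinary move.
    A reward of 0.0 for ordinary moves is an assumption, not a
    setting documented in this README.
    """
    return REWARDS[round_id].get(event, 0.0)


if __name__ == "__main__":
    for round_id in sorted(REWARDS):
        print(round_id, step_reward(round_id, "eat"), step_reward(round_id, "bit"))
```

Only the two death penalties change between rounds, so any difference in average record is attributable to them rather than to the food reward.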

Conclusions

  1. Increasing the penalty for death leads to higher average records (≈19, ≈23, and ≈28 for rounds 1, 2, and 3)
  2. The low-death-penalty strategy produces a lower reward curve during training, yet it still performs well in the demo
  3. A particularly high reward for eating food can push the agent toward quick gains at the expense of long-term safety

Future work

  1. The training time is too short to show the advantages of DRL over non-DRL methods (Snaqe)
  2. The zigzag movement of the snake's body looks ugly; try adding a penalty to the reward for excessive zigzagging
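One possible shape for the zigzag penalty is sketched below: count direction changes over a sliding window of recent moves and penalize only the turns above a threshold. This is a hypothetical design, not something implemented in this repository; the class name, the window size, the turn threshold, and the penalty value are all assumptions that would need tuning.

```python
from collections import deque


class ZigzagPenalty:
    """Penalize frequent direction changes within a sliding window of moves."""

    def __init__(self, window=8, max_turns=3, penalty=0.1):
        self.history = deque(maxlen=window)  # most recent move directions
        self.max_turns = max_turns           # turns allowed before penalizing
        self.penalty = penalty               # cost per excess turn

    def __call__(self, direction):
        """Record a move and return the (non-positive) shaping reward."""
        self.history.append(direction)
        moves = list(self.history)
        turns = sum(1 for prev, cur in zip(moves, moves[1:]) if prev != cur)
        excess = max(0, turns - self.max_turns)
        return -self.penalty * excess if excess else 0.0
```

The returned value would be added to the environment reward at each step. Allowing a few turns for free matters because the snake must turn to reach food; only sustained zigzagging is discouraged.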