Training parallel RL agents on Snake

This project implements the A2C algorithm from 2016, using TensorFlow v1. The environment is a basic Snake game with a parameterizable grid size.
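For readers new to A2C, the quantity that drives the policy update is the advantage: the discounted return observed from a state minus the critic's value estimate for that state. The sketch below shows one common way to compute it in plain Python. It is not taken from this repository's code; the function names, the `gamma` default and the use of a bootstrap value at the end of a rollout are assumptions made for illustration.

```python
def discounted_returns(rewards, dones, bootstrap_value, gamma=0.99):
    """Compute discounted returns for one rollout, walking backwards.

    bootstrap_value is the critic's estimate for the state that follows
    the last step; it is ignored across episode boundaries (done=True).
    """
    returns = []
    running = bootstrap_value
    for reward, done in zip(reversed(rewards), reversed(dones)):
        # Reset the running return when the episode ended at this step.
        running = reward + gamma * running * (0.0 if done else 1.0)
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage = observed return minus the critic's value estimate."""
    return [g - v for g, v in zip(returns, values)]
```

In A2C the actor's log-probabilities are weighted by these advantages, while the critic is regressed toward the returns.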

Multiprocessing

The main feature of this program is that it uses multiple processes. One process (the 'learner' or 'master') trains the network weights, while all the others (the 'actors') play the game with the latest weights to gather data. They exchange information through very basic messages sent over pipes. All of these processes are spawned at startup and controlled by the main, initial process (see launch.py).
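The learner/actor exchange described above can be sketched with Python's standard `multiprocessing` module. This is a simplified illustration, not the code from launch.py: the message format (`("weights", ...)`, `("batch", ...)`, `("stop", None)`), the function names and the fake transitions are all assumptions, and for brevity the learner loop runs in the main process here instead of in its own process.

```python
import multiprocessing as mp
import random


def actor(conn, actor_id, steps_per_batch=5):
    """Wait for weights, play a few steps, send the collected data back."""
    rng = random.Random(actor_id)
    while True:
        kind, payload = conn.recv()
        if kind == "stop":
            break
        weights_version = payload  # stand-in for real network weights
        # Stand-in for stepping the Snake environment: random actions 0..3.
        batch = [(actor_id, weights_version, rng.randrange(4))
                 for _ in range(steps_per_batch)]
        conn.send(("batch", batch))


def learner(conns, n_updates=3):
    """Broadcast weights, gather one batch per actor, then 'train'."""
    version, collected = 0, 0
    for _ in range(n_updates):
        for conn in conns:
            conn.send(("weights", version))
        for conn in conns:
            _kind, batch = conn.recv()
            collected += len(batch)
        version += 1  # stand-in for a gradient update
    for conn in conns:
        conn.send(("stop", None))
    return version, collected


def main(n_actors=2):
    pipes = [mp.Pipe() for _ in range(n_actors)]
    procs = [mp.Process(target=actor, args=(child, i))
             for i, (_parent, child) in enumerate(pipes)]
    for proc in procs:
        proc.start()
    result = learner([parent for parent, _child in pipes])
    for proc in procs:
        proc.join()
    return result


if __name__ == "__main__":
    print(main())
```

Each actor gets its own pipe, so the learner always knows which actor a batch came from and can shut every actor down with an explicit stop message.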

Usage Example

python3 launch.py

Scores are saved in a 'log_scores' file and also printed to the terminal. Model weights are saved regularly (after each epoch, consisting of 300 agent steps).

