A deep reinforcement learning algorithm that plays Connect 4, based on AlphaZero. I'm creating this because my chess algorithm learns too slowly, and I wanted to find out whether the problem is the amount of data needed or my implementation of the algorithm itself.
See https://zjeffer.github.io/connect4-deep-rl/ for Doxygen documentation.
- Connect 4 environment
- MCTS algorithm
- Neural network
- AlphaZero self-play
- Argument parsing
- Load settings from file
- Unit tests:
    - Horizontal win
    - Vertical win
    - Diagonal win
    - Easy puzzle
    - Harder puzzle
    - ...?
- Save played moves to memory, and memory to file
- AlphaZero training
- AlphaZero evaluation
- Automatic pipeline for self-play, training and evaluation
- Play against computer
- GUI?