Home


Michael Pang edited this page Dec 23, 2017 · 22 revisions

Welcome to the chess-alpha-zero wiki!

What I'm doing on this fork:

Model: Diagram

Input: 12 planes for pieces, 4 planes for castling rights, 1 plane for the 50-move rule and 1 plane for en passant (no history, flip-color transform). This is simple and, in theory, reduces overfitting.
Hidden: conv3-256 + 7 residual blocks, with batchnorms in between (15 conv layers in total)
Output: 1968-wide vector for policy, scalar for value
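
To make the input layout concrete, here is a minimal pure-Python sketch of an 18-plane encoder that works from a FEN string. The plane order, the names `fen_to_planes` and `flip_fen`, and the use of the raw halfmove clock for the 50-move plane are assumptions for illustration, not necessarily what this fork does. The flip-color transform is read here as mirroring the board and swapping colors whenever black is to move, so the network always sees the position from the mover's side.

```python
# Plane order is an assumption; the notes above only give the plane counts.
PIECE_ORDER = "KQRBNPkqrbnp"   # planes 0-11, one per piece type and color
CASTLING_ORDER = "KQkq"        # planes 12-15, one per castling right
FIFTY_MOVE_PLANE = 16
EN_PASSANT_PLANE = 17


def flip_fen(fen):
    """Mirror a FEN so the side to move is always presented as white."""
    board, turn, castling, ep, halfmove, fullmove = fen.split()
    # Reverse the rank order and swap piece colors (digits are unaffected).
    board = "/".join(row.swapcase() for row in reversed(board.split("/")))
    if castling != "-":
        castling = castling.swapcase()
    if ep != "-":
        ep = ep[0] + str(9 - int(ep[1]))  # rank 3 <-> rank 6
    turn = "w" if turn == "b" else "b"
    return " ".join([board, turn, castling, ep, halfmove, fullmove])


def fen_to_planes(fen):
    """Encode a single position (no history) as 18 planes of 8x8 floats."""
    if fen.split()[1] == "b":
        fen = flip_fen(fen)
    board, _, castling, ep, halfmove, _ = fen.split()
    planes = [[[0.0] * 8 for _ in range(8)] for _ in range(18)]

    # Piece planes: row 0 is rank 8, column 0 is the a-file.
    for row, rank in enumerate(board.split("/")):
        col = 0
        for ch in rank:
            if ch.isdigit():
                col += int(ch)
            else:
                planes[PIECE_ORDER.index(ch)][row][col] = 1.0
                col += 1

    # Castling planes: all ones if the right is still available.
    for i, flag in enumerate(CASTLING_ORDER):
        if flag in castling:
            planes[12 + i] = [[1.0] * 8 for _ in range(8)]

    # Constant plane holding the halfmove clock (raw count; scaling is an assumption).
    planes[FIFTY_MOVE_PLANE] = [[float(halfmove)] * 8 for _ in range(8)]

    # En passant plane: a single 1 on the target square, if any.
    if ep != "-":
        planes[EN_PASSANT_PLANE][8 - int(ep[1])][ord(ep[0]) - ord("a")] = 1.0
    return planes
```

Calling `fen_to_planes` on a position with black to move returns planes in which black's pieces occupy the uppercase (first six) piece planes, which is the point of the flip.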
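
The 1968 in the policy width can be reproduced by counting UCI move strings: every queen-like or knight-like move between two squares, plus pawn promotions for both colors. The sketch below is one way to enumerate them; the function name `build_move_labels` and the label ordering are assumptions, and only the total is taken from the notes above.

```python
FILES = "abcdefgh"
RANKS = "12345678"
KNIGHT_JUMPS = [(-2, -1), (-2, 1), (-1, -2), (-1, 2),
                (1, -2), (1, 2), (2, -1), (2, 1)]


def build_move_labels():
    """Enumerate every UCI move string the policy head could need."""
    labels = []
    for f1 in range(8):
        for r1 in range(8):
            targets = set()
            # Queen-like moves: along the rank, the file and both diagonals.
            for t in range(-7, 8):
                if t != 0:
                    targets.update([(f1 + t, r1), (f1, r1 + t),
                                    (f1 + t, r1 + t), (f1 + t, r1 - t)])
            # Knight jumps.
            targets.update((f1 + df, r1 + dr) for df, dr in KNIGHT_JUMPS)
            for f2, r2 in sorted(targets):
                if 0 <= f2 < 8 and 0 <= r2 < 8:  # drop off-board targets
                    labels.append(FILES[f1] + RANKS[r1] + FILES[f2] + RANKS[r2])
    # Promotions: straight pushes and diagonal captures, for both colors.
    for f1 in range(8):
        for f2 in (f1 - 1, f1, f1 + 1):
            if 0 <= f2 < 8:
                for piece in "qrbn":
                    labels.append(FILES[f1] + "7" + FILES[f2] + "8" + piece)
                    labels.append(FILES[f1] + "2" + FILES[f2] + "1" + piece)
    return labels
```

The count breaks down as 1456 queen-like moves, 336 knight moves and 176 promotions (22 pawn moves per color times 4 promotion pieces), which sums to 1968.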

Speed improvements:

All workers are multithreaded/multiprocess
SL (supervised learning) and opt are especially fast: they load thousands of games in minutes, which is great for collecting more data!
Self-play, eval and uci are also several times faster.

SL techniques:

Weight the policy target by the players' Elo ratings
Train on the material value of the position
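
As a rough illustration of these two ideas, the sketch below shows a clipped linear Elo weight that could scale each game's policy loss, and a material-based value target squashed into (-1, 1). Everything specific here is an assumption: the names `elo_policy_weight` and `material_value`, the 1500/2500 Elo bounds, the conventional 1/3/3/5/9 piece values, and the `tanh` scale are placeholders, not this fork's actual settings.

```python
import math

# Conventional piece values (an assumption; the fork may use different ones).
PIECE_VALUES = {"p": 1.0, "n": 3.0, "b": 3.0, "r": 5.0, "q": 9.0, "k": 0.0}


def elo_policy_weight(elo, min_elo=1500.0, max_elo=2500.0):
    """Linear ramp from 0 at min_elo to 1 at max_elo, clipped to [0, 1].

    The bounds are hypothetical defaults chosen for illustration.
    """
    return min(1.0, max(0.0, (elo - min_elo) / (max_elo - min_elo)))


def material_value(fen):
    """Material balance from the side to move's view, squashed into (-1, 1)."""
    board, turn = fen.split()[:2]
    balance = 0.0
    for ch in board:
        if ch.isalpha():
            value = PIECE_VALUES[ch.lower()]
            # Uppercase letters are white pieces, lowercase are black.
            balance += value if ch.isupper() else -value
    if turn == "b":
        balance = -balance
    return math.tanh(balance / 10.0)  # the divisor is an arbitrary scale
```

With these placeholders, moves played by stronger players contribute more to the policy loss, and the value head gets a dense signal even in positions from drawn or short games.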

Other:

Extraneous bias removal

TODO:

Implement MCTS in C++
Variable regularization....
Try 5x5 convs in the first few layers

Goals:

Get a model that beats the materialistic MCTS agent

Current branch: https://github.com/Akababa/chess-alpha-zero/tree/nohistory