Supervised Learning
After reading this article, you should be able to take a collection of PGNs and train a net using lczero-training.

[Note: this article assumes you are using Linux. Performing the same on Windows is possible, but as I don't use Windows, documenting the details will have to be left to someone else.]
You will need the following software:

- `pgn-extract` -- the supervised learning pgn parser is very brittle. As a starting point, I would run your pgn file through `pgn-extract` with `pgn-extract -7 -C < input.pgn > output.pgn`. See here for details.
- The "supervise" branch of my fork of lczero. Yes, `lczero` is the old engine for the nets, but it also has the supervised training code in it, which I fixed and re-enabled. This should also be merged into the master branch of the original repo; I don't control that, so I can only be confident that my fork has the right code.
- The master branch of lczero-training. There's a fair bit of fiddling with setup here. You'll need CUDA 9.0 for tensorflow et al., which is different from the CUDA 9.2 et al. you got for lc0. I'll eventually add a section on configuration of this beast.
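The steps below cap each input pgn at 500k games. If you're not sure how many games a file contains, counting `[Event ` header tags is usually close enough. A minimal sketch (my own helper, not part of any of the tools above):

```python
import sys

def count_games(path):
    """Rough game count: every game in a standard PGN starts with an [Event tag."""
    count = 0
    with open(path, "r", encoding="latin-1") as f:  # PGNs frequently use Latin-1
        for line in f:
            if line.startswith("[Event "):
                count += 1
    return count

if __name__ == "__main__" and len(sys.argv) > 1:
    print(count_games(sys.argv[1]))
```

If a file is over the limit, `pgn-extract` can split it into fixed-size pieces (I believe the `-#` flag does this, e.g. `pgn-extract -#500000 big.pgn`, but check its documentation for the exact form).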
The high-level supervised learning process runs as follows:
1. Make sure the individual pgn files you will be converting to training data have fewer than 500k games in them. The training software expects files -- called "chunks" -- with one game per chunk, so training data directories will be created with a potentially large number of files, which can become unwieldy. You can use `pgn-extract` to break a pgn file into equal-sized files of N games each. See its documentation.
2. Clean up the pgn files with `pgn-extract -7 -C < input.pgn > output.pgn`. Change the filenames to reflect your naming scheme.
3. Run `lczero` to generate the training data. Note that lczero requires a weights file for this step. The weights file is loaded but ignored. This is an artifact of the all-in-one nature of `lczero`.

   ```
   ./lczero -w weights_useless.txt.gz --supervise my_pgn_file.pgn
   ```

4. Clean up the mess and edit your pgn file when it dumps core because of some minor pgn issue.
5. Finally you get a clean run. You should have a directory called `supervise-my_pgn_file` with files of the form `training.XXXXX.gz`, where the X's are digits (there could be 1 or a dozen digits, depending on how many games you had). There should be one file for each game in your pgn.
6. If you've converted several pgn's, put all the various "supervise" directories in a common subdirectory. This will make it easier to process them in the training step.
7. Determine how many "chunks" you have in the subdir by running `find subdir -type f | wc -l`. Let's assume we have 901265 chunks.
8. In your `lczero-training` directory, change to the `tf` subdir. There should be a "configs" subdirectory. Let's copy the example config file and make it work for us.
```yaml
%YAML 1.2
---
name: 'my-first-net-64x6'       # ideally no spaces
gpu: 0                          # gpu id to process on

dataset:
  num_chunks: 901265            # newest nof chunks to parse
  train_ratio: 0.90             # trainingset ratio
  # For separated test and train data.
  #input_train: '/path/to/chunks/*/draw/' # supports glob
  #input_test: '/path/to/chunks/*/draw/'  # supports glob
  # For a one-shot run with all data in one directory.
  input: '/subdir/supervise-*/'

training:
  batch_size: 2048              # training batch
  test_steps: 2000              # eval test set values after this many steps
  train_avg_report_steps: 200   # training reports its average values after this many steps
  total_steps: 140000           # terminate after these steps
  # checkpoint_steps: 10000     # optional frequency for checkpointing before finish
  shuffle_size: 524288          # size of the shuffle buffer
  lr_values:                    # list of learning rates
    - 0.02
    - 0.002
    - 0.0005
  lr_boundaries:                # list of boundaries
    - 100000
    - 130000
  policy_loss_weight: 1.0       # weight of policy loss
  value_loss_weight: 1.0        # weight of value loss
  path: '/path/to/store/networks' # network storage dir

model:
  filters: 64
  residual_blocks: 6
...
```
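The `lr_values`/`lr_boundaries` pair defines a stepped schedule: the first rate applies until the first boundary step, the next rate until the second boundary, and the last rate from the final boundary onward. A small sketch of that mapping (my own illustration of the semantics, not code from lczero-training):

```python
def learning_rate(step, lr_values, lr_boundaries):
    """Return the learning rate in effect at a given global step.

    lr_values has one more entry than lr_boundaries: lr_values[i]
    applies while step < lr_boundaries[i], and the final value
    applies from the last boundary onward.
    """
    for lr, boundary in zip(lr_values, lr_boundaries):
        if step < boundary:
            return lr
    return lr_values[-1]
```

With the config above, steps 0-99999 train at 0.02, steps 100000-129999 at 0.002, and the final 10000 steps at 0.0005.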
You may have to fiddle with some of the learning rates or batch sizes depending on your GPU, and the directory specifics are also up to you.

9. Run the training. This will take a long time. Maybe you can reduce the number of steps in the config file to check things out at first. From the `tf` dir, run the following.
   ```
   ./train.py --cfg configs/my-first-net.yaml --output my-first-net.txt
   ```
10. After churning through a number of steps, it should barf out a network weights file, which you can then use with lc0.
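Before committing to a long run, it can help to relate `total_steps` and `batch_size` back to your chunk count to see roughly how many passes over the data the run makes. A back-of-the-envelope sketch using the numbers from the config above (the 80-positions-per-game figure is my rough assumption for average game length, not something from the docs):

```python
num_chunks = 901265     # from `find subdir -type f | wc -l`
train_ratio = 0.90      # from the config
batch_size = 2048
total_steps = 140000

train_chunks = int(num_chunks * train_ratio)   # games available for training
positions_per_game = 80                        # rough assumption
train_positions = train_chunks * positions_per_game
sampled_positions = total_steps * batch_size   # positions drawn over the whole run

# Approximate number of passes over the training data.
epochs = sampled_positions / train_positions
print(f"train chunks:      {train_chunks}")
print(f"sampled positions: {sampled_positions}")
print(f"approx epochs:     {epochs:.2f}")
```

If the epoch count comes out far below 1, the net never sees most of your data; far above, say, 5 or so, and you may be re-sampling the same games heavily. Adjust `total_steps` (or your data volume) accordingly.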
Simple, no?
My new (old) blog is at lczero.libertymedia.io