Endgame Net
I've modified lc0 to take a file of EPDs as starting positions. I'm now feeding it the same data as the adversarial play against sf9tb and alternating training on batches of 20k games. A 64x6 net really breezes through 20k endgame positions.
The command line for self-play is:
./lc0.ender selfplay --training --games=20000 -w ender-latest.txt.gz --visits=800 --cpuct=5.0 --resign-percentage=0
It’s changed a bit. I am using temp=1 in self-play from 20k randomized 12-man EPDs. I use noise for adversarial play with sf9tb. The node counts there are 350k for sf and 1600 for lc0.
I feed 2 adversarial batches for every 1 self play batch.
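The 2:1 feeding ratio can be sketched as a repeating schedule. This is only an illustration: the batch labels and the order within each group of three are assumptions, not something taken from the actual training scripts.

```python
from itertools import cycle, islice

# Hypothetical labels; the order inside each group of three is an assumption.
PATTERN = ("adversarial", "adversarial", "selfplay")


def batch_schedule(n_batches):
    """Return the kinds of the first n_batches batches in a repeating 2:1 pattern."""
    return list(islice(cycle(PATTERN), n_batches))


if __name__ == "__main__":
    # Print the kinds of the first six batches fed to training.
    print(batch_schedule(6))
```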
I’m going to play a match from 100 12-man EPDs, with colors reversed, against 11258.
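For picking out N-man positions from an EPD file, a minimal sketch follows. It assumes one position per line with the board as the first whitespace-separated field (as in FEN and EPD); the helper names are hypothetical and this is not the code used in the modified lc0.

```python
def count_men(epd_line):
    """Count all pieces, kings included, in the board field of an EPD/FEN line."""
    board = epd_line.split()[0]
    # Letters are pieces; digits are runs of empty squares; '/' separates ranks.
    return sum(ch.isalpha() for ch in board)


def select_n_man(epd_lines, n):
    """Keep the non-empty lines whose position has exactly n men."""
    return [ln.strip() for ln in epd_lines if ln.strip() and count_men(ln) == n]


if __name__ == "__main__":
    sample = [
        "8/8/8/4k3/8/8/4P3/4K3 w - -",
        "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -",
    ]
    print(select_n_man(sample, 3))
```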
T11258 - 5 sec, 2 thr, no prune
Success rate: 53.69% (80/149)
Ender 38 - 5 sec, 2 thr, no prune
Success rate: 62.42% (93/149)
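The success rates are just solved positions over total positions. The sketch below reproduces the percentages and adds a rough binomial standard error, which assumes the 149 positions are independent trials (a simplification; the test suite tool itself is not shown here).

```python
import math


def success_rate(solved, total):
    """Percentage of test positions solved."""
    return 100.0 * solved / total


def standard_error(solved, total):
    """Binomial standard error of the success rate, in percentage points."""
    p = solved / total
    return 100.0 * math.sqrt(p * (1.0 - p) / total)


if __name__ == "__main__":
    for name, solved in (("T11258", 80), ("Ender 38", 93)):
        rate = success_rate(solved, 149)
        err = standard_error(solved, 149)
        print(f"{name}: {rate:.2f}% +/- {err:.2f}")
```

With 149 positions the standard error comes out to about 4 percentage points per run, so small gaps between nets on this suite should be read with some caution.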
Based on my most recent test suite run, I am hopeful.
The Ender net (64x6) was initially trained on ~400k semirandom 6, 5, 4, 3 man positions with perfect playouts by sf9tb. This led to mediocre play.
Currently the net is being trained on 20k batches of playouts (500k window). The playouts start from 12 and 6 man positions sampled from a CCRL database, from Kingbase games played out from resignation, and from 12, 6, 5, 4, and 3 man semirandom positions. Each position is played both ways between sf9tb and the latest net at 0.25s vs 3200 nodes per move.
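One way to read "20k batches (500k window)" is that training always draws from the most recent 500k playouts, so each new 20k batch pushes the oldest one out once 25 batches are held. A minimal sketch under that assumption (the real training pipeline and its units are not shown in this page):

```python
from collections import deque

WINDOW = 500_000  # window size from the text
BATCH = 20_000    # batch size from the text


def make_window(window_size=WINDOW):
    """Create an empty sliding window that keeps only the newest items."""
    return deque(maxlen=window_size)


def add_batch(window, batch):
    """Append a batch; the deque silently drops the oldest items when full."""
    window.extend(batch)
    return window


if __name__ == "__main__":
    print("batches held by a full window:", WINDOW // BATCH)
    w = make_window(5)  # tiny window just for demonstration
    add_batch(w, range(4))
    add_batch(w, range(4, 8))
    print(list(w))
```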
The training makes use of @borg's zero history patch, with the added wrinkle that it is only applied 10% of the time. As a result, the net does well with and without history. (See this GoNN page for thoughts on this approach.)
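The "10% of the time" idea can be illustrated as follows. This is not @borg's patch; it is a hedged sketch that assumes a training sample holds a list of per-position feature lists with the current position first, and the function name is hypothetical.

```python
import random

ZERO_HISTORY_PROB = 0.10  # "applied 10% of the time", from the text


def maybe_zero_history(planes, prob=ZERO_HISTORY_PROB, rng=random):
    """Return a copy of planes, sometimes with the history blanked out.

    planes[0] is assumed to be the current position and planes[1:] the
    earlier positions. With probability prob, every earlier position is
    replaced by zeros; otherwise the sample is copied unchanged.
    """
    if rng.random() >= prob:
        return [list(p) for p in planes]
    return [list(planes[0])] + [[0] * len(p) for p in planes[1:]]


if __name__ == "__main__":
    sample = [[1, 2], [3, 4], [5, 6]]
    print(maybe_zero_history(sample, prob=1.0))  # history always zeroed
    print(maybe_zero_history(sample, prob=0.0))  # history always kept
```

Mixing both kinds of sample is what lets a single net see positions with and without history during training.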
Test suites aren't the be-all and end-all of testing, but Ender 5 has finally surpassed the 20b networks:
Ender 5 - 5 seconds, 2 threads
Success rate: 52.35% (78/149)
T902 - 5 seconds, 2 threads
Success rate: 45.64% (68/149)
Right now none of the Leela nets can do this.
12 man CCRL Elo difference: 61.08 +/- 24.17
12 man Kingbase Elo difference: 76.98 +/- 23.64
12 man semi random Elo difference: 10.43 +/- 47.67
6 man CCRL Elo difference: 36.62 +/- 35.66
6 man Kingbase Elo difference: 47.19 +/- 37.99
6 man semi random Elo difference: 10.43 +/- 61.08
5 man semi random Elo difference: 10.43 +/- 61.08
4 man semi random Elo difference: 0.00 +/- 54.85
3 man semi random Elo difference: 0.00 +/- 60.66
12 man CCRL Elo difference: 99.05 +/- 7.85
12 man Kingbase Elo difference: 99.65 +/- 7.75
12 man semi random Elo difference: 15.99 +/- 14.86
6 man CCRL Elo difference: 38.02 +/- 10.72
6 man Kingbase Elo difference: 50.56 +/- 11.37
6 man semi random Elo difference: 16.69 +/- 19.51
5 man semi random Elo difference: 13.56 +/- 18.61
4 man semi random Elo difference: 15.99 +/- 16.95
3 man semi random Elo difference: 0.00 +/- 19.49
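For reading the Elo differences above, the usual logistic Elo model converts between a match score and an Elo gap. The page does not say which tool produced these numbers or how the +/- margins were derived, so the sketch below assumes the standard formula and covers only the point estimate.

```python
import math


def elo_difference(score):
    """Elo gap implied by a score fraction (wins + draws/2) / games."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def expected_score(elo_diff):
    """Inverse mapping: expected score fraction for a given Elo gap."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    for s in (0.50, 0.515, 0.60):
        print(f"score {s:.3f} -> {elo_difference(s):+.2f} Elo")
```

Under this model a 0.00 difference corresponds to an even 50% score, and a difference of 10.43 corresponds to a score of about 51.5%.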
My new (old) blog is at lczero.libertymedia.io