Add recursive search depth, remove FPU VL bug #466

jkiliani · 2018-04-29T14:33:47Z

Adds recursive tracking for maximum search depth and replaces the current search depth output based on a log function of visits. Fixes the virtual loss bug on first play urgency. Activates USE_TUNER by default, and adds FPU reduction to parameters tuneable from command line.

jkiliani · 2018-04-29T14:40:58Z

As far as I can determine, the bench impact of recursive depth tracking is negligible. Fixing the virtual loss bug in FPU also seems doable without negative effects, judging from tests with the previous PR #438. Last tuning tournament with the code from #438, with Id 211:

Rank Name                           Elo     +/-   Games   Score   Draws  
   1 lc_next_puct06                  51      56      83   57.2%   44.6%
   2 lc_fixedVL_puct06               37      57      84   55.4%   41.7%
   3 lc_fixedVL_puct085             -29      55      83   45.8%   45.8%
   4 lc_next_puct085                -60      57      82   41.5%   43.9%

At least for self-play, the reduced PUCT appears significantly stronger for multiple nets. The difference between before fixing the VL bug and after appears statistically insignificant. I'll now do some tests with the code from this PR, again comparing /next and the fixed VL version, at different PUCT values.
I'll also add 0.1 FPU reduction to the tested parameters.

Some more relevant results with older networks (all 800 visits):
Id 194:

Rank Name                          Elo     +/-   Games   Score   Draws
   1 lc_next_puct06                 43      65      57   56.1%   49.1%
   2 lc_fixedVL_puct06              31      58      57   54.4%   59.6%
   3 lc_fixedVL_puct085             12      71      57   51.8%   40.4%
   4 lc_next_puct085                 6      68      58   50.9%   43.1%

Id 202:

Rank Name                          Elo     +/-   Games   Score   Draws
   1 lc_fixedVL_puct06              73      58      63   60.3%   54.0%
   2 lc_next_puct06                 33      62      63   54.8%   49.2%
   3 lc_next_puct085                28      65      63   54.0%   44.4%
   7 lc_fixedVL_puct085            -44      68      63   43.7%   39.7%

The latter two tournaments were round-robins that also included some now-discarded options for FPU reduction, so the Elo values don't sum to 0. However, common to all test results is that puct = 0.6 is somewhat stronger than 0.85, and the strength difference between the next branch and the VL bug fix is statistically insignificant.
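
For context on what the puct parameter controls: the thread does not show the selection formula, so the following is only a minimal sketch assuming the standard AlphaZero-style PUCT rule. The function name and its arguments are illustrative and not taken from the lczero source.

```cpp
#include <cassert>
#include <cmath>

// Sketch of an AlphaZero-style PUCT score (assumed form, not the PR's code).
//   q             : mean value of the child, from the parent's point of view
//   prior         : policy prior of the child
//   parent_visits : visit count of the parent node
//   child_visits  : visit count of the child node
//   c_puct        : exploration constant (what --puct sets in the tests above)
inline double puct_score(double q, double prior, int parent_visits,
                         int child_visits, double c_puct) {
    assert(parent_visits >= 0 && child_visits >= 0);
    const double exploration =
        c_puct * prior * std::sqrt(static_cast<double>(parent_visits)) /
        (1.0 + child_visits);
    return q + exploration;
}
```

Under this assumed form, lowering c_puct from 0.85 to 0.6 scales the exploration bonus of every child by the same factor, so the search relies more on measured values and less on the policy prior.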

Current cutechess-cli script

./cutechess-cli -rounds 50 -tournament round-robin -concurrency 2 -pgnout results_prog.pgn \
 -engine name=lc_puct6 cmd=lczero_dyneval arg="--threads=1" arg="--weights=$WDR/weights_217.txt" arg="--puct=0.6" arg="--tempdecay=10" nodes=800 tc=inf \
 -engine name=lc cmd=lczero_dyneval arg="--threads=1" arg="--weights=$WDR/weights_217.txt" arg="--tempdecay=10" nodes=800 tc=inf \
 -engine name=lc_fpu01_puct6 cmd=lczero_dyneval arg="--threads=1" arg="--weights=$WDR/weights_217.txt" arg="--fpu_reduction=0.1" arg="--puct=0.6" arg="--tempdecay=10" nodes=800 tc=inf \
 -engine name=lc_fpu01 cmd=lczero_dyneval arg="--threads=1" arg="--weights=$WDR/weights_217.txt" arg="--fpu_reduction=0.1" arg="--tempdecay=10" nodes=800 tc=inf \
 -engine name=lc_next_puct6 cmd=lczero_backup arg="--threads=1" arg="--weights=$WDR/weights_217.txt" arg="--puct=0.6" arg="--tempdecay=10" nodes=800 tc=inf \
 -engine name=lc_next cmd=lczero_backup arg="--threads=1" arg="--weights=$WDR/weights_217.txt" arg="--tempdecay=10" nodes=800 tc=inf \
 -each proto=uci

(Insert your working directory for the weights, and export the paths for the lczero executables first)

Tilps · 2018-04-30T13:53:43Z

src/UCTSearch.cpp

    const auto& cur = bh.cur();
    const auto color = cur.side_to_move();

    auto result = SearchResult{};

    node->virtual_loss();
+    if (ndepth > m_maxdepth) {
+        m_maxdepth = ndepth;
+    }


If you actually want to make this truly thread-safe, it would need to be something like:
    int cur_depth;
    do { cur_depth = m_maxdepth; } while (ndepth > cur_depth && !m_maxdepth.compare_exchange_strong(cur_depth, ndepth));

But maybe we don't care that much?
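
A self-contained sketch of the same lock-free "atomic max" pattern, assuming m_maxdepth is (or would become) a std::atomic<int>; the member's actual type is not shown in this thread. The helper name update_max_depth is hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Raise `maxdepth` to `ndepth` if `ndepth` is larger.
// Safe when several search threads call it concurrently.
inline void update_max_depth(std::atomic<int>& maxdepth, int ndepth) {
    assert(ndepth >= 0);
    int cur = maxdepth.load();
    // On failure, compare_exchange_weak writes the value it observed into
    // `cur`, so the loop ends as soon as the stored depth is >= ndepth
    // or our own store succeeds.
    while (ndepth > cur && !maxdepth.compare_exchange_weak(cur, ndepth)) {
    }
}
```

The weak variant is used here because the call already sits in a retry loop, where a spurious failure only costs one more iteration. The plain `if (ndepth > m_maxdepth) m_maxdepth = ndepth;` from the diff can lose an update when two threads interleave between the comparison and the store.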

jkiliani · 2018-04-30

You're right, it probably isn't really thread safe at the moment. However, @killerducky indicates it won't be pulled as is anyway; he prefers the solution defining search depth on PV length, since that is more compatible with the TensorFlow implementation.

I'm just keeping this open now until I post the last tuning tests with it, and then close it. Making additional tuning options available without UCI can be its own PR, and fixing the VL bug will probably have to wait a bit...

jkiliani · 2018-04-30T23:00:04Z

Tuning test result with Id 217:

Rank Name                          Elo     +/-   Games   Score   Draws
   1 lc_fpu01_puct06                70      52      95   60.0%   46.3%
   2 lc_next_puct06                 62      52      97   58.8%   45.4%
   3 lc_fpu00_puct06                18      54      97   52.6%   39.2%
   4 lc_next_puct085               -36      55      96   44.8%   39.6%
   5 lc_fpu00_puct085              -55      59      96   42.2%   30.2%
   6 lc_fpu01_puct085              -59      54      95   41.6%   41.1%

FPU reduction of 0.1, at least for this net, gives a sufficient benefit to compensate for fixing the VL bug. I have now started a tuning run with Id 226 (i.e. the final 10-block net).

killerducky · 2018-05-01T22:32:40Z

Results:
lczero-id228-av539-puct0p6 vs lczero-id228-fpu-v1-fix-fpu-0p1:
70-40-193
Elo diff: 34.51 +/- 23.48

av539 refers to the AppVeyor build number.
The other one is this PR, with --fpu_reduction=0.1

It's 34.5 Elo worse on id228, but @mooskagh points out a bigger issue in #317 (comment): FPU reduction is being applied by mistake at the root node even when noise is on (which implies training). This will make it more difficult for the network to learn about low-policy moves.

@jkiliani I think we should change the default fpu_reduction to 0.1 (best we have for now), and merge this.
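
To make the root-node concern concrete, here is a hedged sketch of how a first-play-urgency value with reduction might be computed. The scaling by the square root of the visited policy mass is an assumption modelled on Leela Zero-style search, and the root/noise guard is the behaviour the comment above asks for; neither is copied from this PR. The function name and parameters are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Value assigned to a child that has not been visited yet.
//   parent_eval    : the parent's own evaluation
//   visited_policy : summed policy prior of the children already visited
//   fpu_reduction  : tunable constant (--fpu_reduction in the tests above)
//   is_root        : true when selecting among the root's children
//   noise_on       : true when root noise is enabled (implies training)
inline float first_play_urgency(float parent_eval, float visited_policy,
                                float fpu_reduction, bool is_root,
                                bool noise_on) {
    assert(visited_policy >= 0.0f);
    // Assumed guard: no reduction at a noisy root, so that low-policy
    // root moves still get explored during training games.
    if (is_root && noise_on) {
        return parent_eval;
    }
    return parent_eval - fpu_reduction * std::sqrt(visited_policy);
}
```

With fpu_reduction = 0.1 and a quarter of the policy mass already visited, an unvisited child starts 0.05 below the parent's evaluation, except at a noisy root where it starts at the parent's evaluation.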

jkiliani · 2018-05-01T22:48:46Z

One more test result, for Id 226:

Rank Name                          Elo     +/-   Games   Score   Draws
   1 lc_fpu00_puct6                 77      58      78   60.9%   44.9%
   2 lc_fpu01_puct6                 72      57      78   60.3%   46.2%
   3 lc_next_puct6                  27      54      78   53.8%   51.3%
   4 lc_fpu00_puct085              -22      55      78   46.8%   50.0%
   5 lc_fpu01_puct085              -36      57      78   44.9%   46.2%
   6 lc_next_puct085              -120      59      78   33.3%   43.6%

Here the bug also doesn't help playing strength, for puct=0.6. FPU reduction of 0.1 doesn't hurt compared to none, but does not help a lot either.

killerducky · 2018-05-01T23:06:24Z

I spoke with @jkiliani about this, we will do one more final test of FPU 0 vs 0.1 head to head on another net and then decide tomorrow.

Edit: Going to test FPU 0.05 also.

jkiliani · 2018-05-02T12:58:31Z

Accidentally had my test script stop after just 50 games, on Id 231:

Score of lc_fpu00 vs lc_fpu01: 6 - 16 - 28  [0.400] 50
Elo difference: -70.44 +/- 64.16

I'm going to continue the match with this net until @killerducky wants to decide, but for the moment this is another (although weak) indication supporting cfg_fpu_reduction = 0.1.

Current standing of the continued test is

Score of lc_fpu00 vs lc_fpu01: 7 - 6 - 15  [0.518] 28

which means that FPU reduction = 0.1 is still leading a lot in the aggregate score. But likely, any of 0.0, 0.05 and 0.1 would work fine.
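
For readers checking these figures, a small sketch of the usual logistic conversion from a win/loss/draw record to an Elo difference. The helper names are illustrative. The formula reproduces the -70.44 reported for the 6 - 16 - 28 result above.

```cpp
#include <cassert>
#include <cmath>

// Score fraction from the first engine's point of view:
// a win counts 1, a draw 0.5, a loss 0.
inline double score_fraction(int wins, int losses, int draws) {
    const int games = wins + losses + draws;
    assert(games > 0);
    return (wins + 0.5 * draws) / games;
}

// Logistic Elo difference implied by a score fraction strictly
// between 0 and 1 (undefined for 0% or 100% scores).
inline double elo_from_score(double score) {
    assert(score > 0.0 && score < 1.0);
    return -400.0 * std::log10(1.0 / score - 1.0);
}
```

Pooling both runs (13 - 22 - 43 over 78 games) gives a score of about 0.442 for lc_fpu00, which the same formula converts to roughly -40 Elo. The error bars at these game counts remain wide.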

killerducky · 2018-05-02T23:28:32Z

Rank Name                                  Elo     +/-   Games   Score   Draws
   1 lczero-id232-fpu-vl-fix-fpu-0p1       22      30     174   53.2%   67.2%
   2 lczero-id232-fpu-vl-fix-fpu-0p05       0      30     174   50.0%   66.7%
   3 lczero-id232-fpu-vl-fix-fpu-0        -22      30     172   46.8%   65.7%
261 of 6000 games finished.

Let's go with 0.1.

killerducky · 2018-05-03T00:00:47Z

@jkiliani ok I am preparing a release and I noticed the depth tracking is wrong:

info depth 4 nodes 800 nps 435 tbhits 0 score cp -419 time 22 pv f7d7 d3g6 h7g8 g6h5 d7d2 e6e2 d2e2 h5e2 g7e5 f1f2 g8g7 f2f3 e5f6 f3e4 g7g6
move played f7f4

The PV is longer than the depth. Probably something to do with tree reuse? I'm going to revert the depth change so I can get v0.8 out quickly.


jkiliani added 4 commits April 20, 2018 10:26

Merge pull request #30 from glinscott/next

16a833e

Next

Merge pull request #31 from glinscott/next

4de6ede

Update to next

Merge pull request #32 from glinscott/next

cd8b1c2

Next

killerducky mentioned this pull request Apr 29, 2018

Tune PUCT parameter for chess. #435

Merged

Tilps reviewed Apr 30, 2018

Merge branch 'next' into jkiliani-patch-2

7bd2d7c

jkiliani mentioned this pull request May 2, 2018

Fix UCI protocol to work when given a terminal position #487

Merged

killerducky added 2 commits May 2, 2018 18:34

Merge remote-tracking branch 'upstream/next' into HEAD

b17adba

Make default cfg_fpu_reduction=0.1f

6b01d6f

killerducky merged commit 30bb68c into glinscott:next May 2, 2018

killerducky added a commit to killerducky/leela-chess that referenced this pull request May 3, 2018

Revert depth calculation.

7f1d59b

See PR glinscott#466.

killerducky mentioned this pull request May 3, 2018

Revert depth #505

Merged

killerducky added a commit that referenced this pull request May 3, 2018

Revert depth (#505)

2333efc

* Revert depth calculation. See PR #466.

jkiliani deleted the jkiliani-patch-2 branch May 3, 2018 04:31
