Implement root move temperature #267
Conversation
Update to glinscott/next
Introduces the option to choose root moves by exponentiated visit count, to maintain randomness in games at much less of a strength cost than move selection proportional to visit count. Also implements a command line parameter --tempdecay (-d), which takes a decay constant as a parameter and dynamically reduces the temperature throughout the game (logarithmic decay).
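As a rough illustration of the selection rule described above (a minimal sketch under assumed names, not the PR's actual code): a root move is picked with probability proportional to visits^(1/t). With t = 1 this reproduces selection proportional to visit count, while t approaching 0 approaches always picking the most-visited move.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sketch (names are illustrative, not lc0's actual code):
// pick a root move index with probability proportional to
// visits^(1/temperature). temperature = 1 gives selection proportional
// to visit count; temperature -> 0 approaches picking the max-visit move.
std::size_t pick_root_move(const std::vector<int>& visits,
                           double temperature,
                           std::mt19937& rng) {
    std::vector<double> weights;
    weights.reserve(visits.size());
    for (int v : visits) {
        weights.push_back(std::pow(static_cast<double>(v), 1.0 / temperature));
    }
    // discrete_distribution normalizes the weights into probabilities.
    std::discrete_distribution<std::size_t> dist(weights.begin(), weights.end());
    return dist(rng);
}
```

Lowering the temperature sharpens the distribution: at t = 0.5 the weights are the squared visit counts, so low-visit moves are strongly suppressed relative to the top candidates.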
Implement root move temperature
This is still work in progress and has not yet been properly strength-tested. I will start to do that tomorrow, but in case other people would like to get a head start, I'm opening the pull request now. So far I have only tested, with (currently enhanced) debug output, that the feature works as intended for a few ply in the console with "go" commands. For the decay constant, something like 25 seems a reasonable value to me, but I'm happy if others weigh in there as well. I'd like to ask @killerducky and/or @Uriopass for a code review if possible.
A reintroduced bug in UCTNode.cpp breaks the continuous integration at the moment.
Fix revert in UCTNode.cpp
Strangely, there is still a problem with continuous integration, even after I fixed the accidentally reintroduced auto move = pos.get_move(); in UCTNode.cpp. Anyone have ideas what could be the problem?
I assume you mean this error:
Looks like you forgot to edit the header file here:
Why not just start the games from the matches with two random moves (4 ply)? If games are replayed with reversed colors this is fair, and it seems sufficiently "zero", given that lczero needs to learn to play good chess in any position, not just the starting position. Note: to get accurate error bars one needs to use the pentanomial model for the outcome of game pairs. See the section on "statistical analysis" here
Changes the declaration of randomize_first_proportionally in UCTNode.h to pass a temperature parameter.
Fix missing header file
Thanks @cn4750! Yes, I forgot to upload the change of the declaration of randomize_first_proportionally. Looks like the build is now passing...
Fixes tab indentations, adds several comments, increases the denominator in the root temperature calculation to allow slower decay schedules, and adds a floor value of 0.1 for root temperature.
Various fixes
I just uploaded several fixes to the PR, including indentation, comments, a floor value for the root temperature, and an increase in the denominator in the root temperature calculation. This changes the initial value I proposed for the match-game decay schedule from 5 to 25. Much smaller decay values may now actually be usable for self-play, but that should be carefully considered and tested before changing the current t=1 for all moves.
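The exact decay formula isn't reproduced in this thread, so the sketch below is an assumption about its shape only: a logarithmic decay in the ply count, scaled by the --tempdecay constant d (larger d = faster decay, matching the later test results), with the 0.1 floor mentioned above. The 50.0 denominator is a placeholder, not the PR's actual constant.

```cpp
#include <algorithm>
#include <cmath>

// Illustrative sketch only: the PR's exact formula and constants are not
// shown in this thread. Logarithmic decay in the ply count, scaled by the
// --tempdecay constant d, with a floor of 0.1 for the root temperature.
double get_root_temperature(int ply, double decay_constant) {
    if (decay_constant <= 0.0) {
        return 1.0;  // d = 0: no decay, plain t = 1 for the whole game
    }
    // The 50.0 denominator is an assumed placeholder; a larger denominator
    // gives a slower decay schedule for the same decay constant.
    double t = 1.0 / (1.0 + decay_constant *
                            std::log1p(static_cast<double>(ply)) / 50.0);
    return std::max(t, 0.1);  // floor so the temperature never reaches 0
}
```

With a shape like this, a larger decay constant pushes the temperature toward the 0.1 floor sooner, while d = 0 recovers plain t = 1 play.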
Can you show a few positions and the % each move will be played in them? So we can get an idea what the spread looks like.
One example for t=0.48 from the (now removed) detailed debug output in the initial PR. This position is after 1. e4 Nc6 2. d4 e6:
In this particular example, there are two candidate moves with almost equal visit counts. Those two still get nearly equal selection chances at this root temperature, while the other candidate moves are already strongly suppressed (only ~1.5 % chance for all of them together, while with t=1 it would be 22.6 %). I didn't create detailed debug output specifically for the selection chances, but you can see them by comparing exponentiated visit counts.
I don't know the position, but it looks like this is moving the Bishop and letting it get captured for free. I'd like a solution that sets the probability of this being selected to zero. I'd like to collect several positions, analyze them, and set a floor. Again I don't know the position here, but setting the floor to include maybe Be2 but not c3 seems reasonable. Edit: A floor can be implemented by picking only the top X% after exponentiating the visits, and/or excluding moves whose V% is more than X% below the top move.
I included the move history now. You're right, Bg5 lets the bishop be captured by the black queen, and the value head is fully aware of that fact. However, I think a 3 in a million chance for this to actually happen is acceptable. Eventually, with stronger nets Bg5 would not get a visit anymore in this position since the policy prior will continue to drop. The tournament I'm currently running shows that with -d 25, this branch is actually stronger than current Lc0 with Dirichlet noise:
Later in the game (not that much later, actually), 1-visit moves are completely suppressed by single-precision floating-point accuracy anyway. I think an additional hard lower cap is unnecessary at this point.
With noise on, it will pretty much always visit every root move at least once. What are the arguments against adding a floor? It should be very easy and I don't see any downside. 1 in a million is extreme, but there will be 1-in-a-thousand moves too that are pretty bad. We have over a thousand people using this; it's going to play those moves a lot, and we're going to get support questions about them. I think the goal should be a solution that's worthy of putting into a real tournament like TCEC, so it should pick only moves deemed acceptable alternatives by the search. A mathematical definition of "acceptable" is what this PR should be looking for.
There's no point in using noise in evaluation games anymore with fractional temperature. I'd kind of like a solution where decay constant 0 recovers the current t=1 behaviour, which a floor doesn't help with. I suppose a floor where the exponentiated visit count is set to zero for any move where it's less than 1/1000 of the PV's should work somewhat like you describe: this would already set the selection probability for 1-visit blunders to zero at t~0.5 (which is reached by ply 4 when using decay constant 25). For t=1, it would remove 1-visit nodes from selection as long as the total visit count is larger than 1000. Would this impact self-play games negatively in any way? I doubt it personally, but we should discuss... Edit: I'll put in the floor of 1/1000 if a few other people weigh in on this. @glinscott, @Error323, @Uriopass, any opinions?
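The proposed 1/1000 floor can be sketched as follows (illustrative only, with invented names; the thread does not show the merged code): after exponentiating the visit counts, zero out any weight below 1/1000 of the best move's weight, so rare low-visit blunders can never be selected.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Sketch of the proposed floor (hypothetical helper, not the PR's code):
// exponentiate visits by 1/temperature, then zero out any move whose
// weight falls below 1/1000 of the PV (best) move's weight.
std::vector<double> exponentiated_visits_with_floor(
        const std::vector<int>& visits, double temperature) {
    std::vector<double> weights;
    weights.reserve(visits.size());
    double best = 0.0;
    for (int v : visits) {
        double w = std::pow(static_cast<double>(v), 1.0 / temperature);
        weights.push_back(w);
        best = std::max(best, w);
    }
    for (double& w : weights) {
        if (w < best / 1000.0) w = 0.0;  // suppress near-zero candidates
    }
    return weights;
}
```

Note how the cutoff interacts with temperature: at t = 1 a 1-visit move survives until the PV has over 1000 visits, while at t = 0.5 an 800-visit PV already suppresses it, which matches the behaviour described in the comment above.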
I think for now we should not impact self-play at all. No noise, I guess that can work. It seems this information is important enough to integrate into the main dump_stats output. We should be able to look at logs of games to determine which moves it is picking and why. How about after printing the unaltered V count, print something like:
Then users can see what % each move will be picked without doing math. |
OK I'm going to implement this, good idea. Currently, root output looks like this:
So should I put in an additional column, e.g.
where S shows the root move selection probabilities? Or directly after visit count, like
In this case, I might have to either include 5 decimal places so the fluke moves aren't showing as 0.00%, or actually implement some sort of floor after all...
@killerducky I thought about how to implement your suggestion to include the move probabilities in the main dump_stats. Unfortunately, I think we need some refactoring of both the root_temperature calculation in UCTSearch::get_best_move() and the exponentiated visit count calculation in UCTNode::randomize_first_proportionally, since both calculations are currently performed locally. Do you have suggestions for how exactly to get those stats into dump_stats without duplicating both calculations? Edit: I guess what would work is to define new functions UCTNode::get_expvisits(tau) and UCTSearch::get_root_temp(). I'll put this refactoring in later (unless you'd like to do it first 😀 )
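One way the factored-out computation could look (a hypothetical sketch with invented names, not the actual refactoring): compute the normalized selection probabilities once and reuse them both for move selection and for the extra dump_stats column, printed with enough decimal places that fluke moves don't show as 0.00%.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper (illustrative names): normalize exponentiated visit
// counts into root move selection probabilities.
std::vector<double> root_move_probabilities(const std::vector<int>& visits,
                                            double temperature) {
    std::vector<double> probs(visits.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < visits.size(); ++i) {
        probs[i] = std::pow(static_cast<double>(visits[i]), 1.0 / temperature);
        sum += probs[i];
    }
    for (double& p : probs) p /= sum;
    return probs;
}

// Sketch of the extra dump_stats column: unaltered visit count N, then the
// selection probability S with 5 decimal places.
void dump_probability_column(const std::vector<int>& visits, double temperature) {
    auto probs = root_move_probabilities(visits, temperature);
    for (std::size_t i = 0; i < probs.size(); ++i) {
        std::printf("N: %6d  S: %9.5f%%\n", visits[i], probs[i] * 100.0);
    }
}
```

Factoring the probability calculation into a single helper avoids duplicating the exponentiation logic between randomize_first_proportionally and dump_stats.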
@killerducky It's important to keep in mind that the moves played in self-play are not the ones the policy trains toward, so it's important to keep them in the training data, because they lead to a variety of positions that the MCTS encounters in a normal search. Since MCTS is a local policy improvement operator, we need to include all positions in some neighborhood of the target population, or else the network might forget how to evaluate bad positions. Anyway, in this case it will make a negligible difference (3 in a million), but IMO setting a "threshold" cannot be beneficial and may even hurt.
@Akababa this mode is only intended for matches, not self-play. For the reason you just mentioned. |
@jkiliani are you talking about the threshold mode? Wouldn't matches be played with tau=0 (pick max)? |
@Akababa At the moment, matches are played with tau=0, but that necessitates using Dirichlet noise to avoid deterministic play. The whole point behind this pull request is to replace Dirichlet noise in match games and probably play against humans with fractional temperature, to get more variability in positions ideally without a loss in strength. So far it looks like this is succeeding with a temperature decay constant of 25: The performance of the engine with decaying temperature is almost equal to that with Dirichlet noise, both against each other and against Stockfish Level 10. |
src/Parameters.cpp
Outdated
@@ -60,6 +61,7 @@ bool cfg_tune_only;
float cfg_puct;
float cfg_softmax_temp;
float cfg_fpu_reduction;
float cfg_root_temp;
It looks like we don't change this value anywhere? So could just inline the 1.0 in the one place it's used.
We could, I suppose; it's a question of whether we want constants all collected in the Parameters file or just defined where we use them.
I was actually considering pulling the Dirichlet noise constants alpha and epsilon into Parameters.cpp, but is your general preference towards having only multiple-use constants in the Parameters file?
I thought about it, changing cfg_root_temp would actually be rather impractical. Might be better to just get rid of it.
Just finished the first strength testing of this pull request. I matched Id 103 with current /next branch and Dirichlet noise against this branch without noise but a temperature decay schedule with -d 25, and Stockfish Level 10 as an outside source of comparison. Lc0 was given 800 visits each.
I'd call this a very successful first test, since at least with this temperature decay schedule no strength loss compared to using Dirichlet noise for randomness is measurable, while the opening variety is much better. As an aside, Id 103 seems decisively better than Stockfish Lv 10. I'm now running a second strength test with Id 107, this time matching multiple decay schedules against each other. About the code: I removed cfg_root_temp and will try to implement @killerducky's suggestion, which requires a refactoring of the root temperature calculation. When the addition to dump_stats is in and the second test is finished, I would consider this ready.
Removes cfg_root_temp from Parameters.cpp as configurable parameter. Factors out get_root_temperature as a function. Includes the probability to play each root move in UCTSearch::dump_stats, if temperature is used.
Remove cfg_root_temp, enhance dump_stats
I have now put in the root move probabilities for when temperature is used. I tried to generalise it to also handle the case with no temperature set (i.e. show 100 % for the first root move and 0 % for all others) but didn't manage that part. If someone else (@killerducky?) would like to optimise this, please feel free.
Awesome, thanks so much! |
Current standing in my tournament with Id 107:
The tournament is still running and will probably finish sometime this evening, but I think I can already draw some conclusions from the data so far: temperature decay works robustly for multiple decay schedules, and the strength loss from using a slower decay schedule than my original test with -d 25 is surprisingly small. Even -d 5 still appears to be a realistic choice. --tempdecay offers a tuneable trade-off between playing strength and variety of play, which should prove popular both for matches and for people running bots against other engines and against humans.
The tournament for Id 107 is finished now:
In the end, there was a small strength cost to using temperature decay, even with d=25. The performance of d=5 is strange and probably simply due to still rather weak statistics, but it seems safe to say that d=5 is still a valid option. I'm going to do one more match of this format, replacing d=10 with d=50 to look at the performance of faster temperature decay. |
And finally, the tournament for Id 112:
I think this proves that with a quick enough temperature decay schedule, a loss in strength can be entirely avoided, since the cost of temperature is offset by no longer using Dirichlet noise. I would still recommend a somewhat slower decay schedule than -d 50 for match games to get more opening variety; how about -d 10? With these results, I would consider the temperature decay implementation sufficiently tested now to be used. Outside of "official" uses for matches, the choice of the decay constant is up to the user. I did not test what the current strength cost of Dirichlet noise is compared to no randomness at all, but picking a really large decay constant will likely closely approximate that already.
I think we should start small, see how big a problem duplicate games are, and gradually increase until it's no longer a problem. It should be pretty easy for Gary to try, say, d=25 at the start, then go to d=15 a day later, and if still necessary to d=10, but I think we don't really need much variety with only 400-500 games. SF uses a 4-ply book just fine for even 50,000 games, without extra noise or even the usual multi-threaded variability.
Implements root move selection by exponentiated visit count, to maintain randomness in games with much less strength loss than is caused by root move temperature = 1 for the whole game. Also introduces a logarithmic decay schedule, initialised by the new command line parameter --tempdecay (-d), which reduces the root move temperature throughout the game. The decay constant is customisable.