-
Notifications
You must be signed in to change notification settings - Fork 298
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add recursive search depth, remove FPU VL bug #466
Conversation
Update to next
Adds recursive tracking for maximum search depth and replaces the current search depth output based on a log function of visits. Fixes the virtual loss bug on first play urgency. Activates USE_TUNER by default, and adds FPU reduction to parameters tuneable from command line.
As far as I can determine, bench impact of recursive depth tracking is negligible. Fixing the virtual loss bug in FPU seems doable without negative effects as far as I can determine from tests with the previous PR #438. Last tuning tournament with the code from #438, with Id 211:
At least for self-play, the reduced PUCT appears significantly stronger for multiple nets. The difference between before fixing the VL bug and after appears statistically insignificant. I'll now do some tests with the code from this PR, again comparing /next and the fixed VL version, at different PUCT values. Some more relevant results with older networks (all 800 visits):
Id 202:
The latter two tournaments were round-robins that also included some now discarded options for FPU reduction so the Elo don't sum up to 0. However, common among all test results is that puct = 0.6 is somewhat stronger than 0.85, and the strength difference between next branch and fixing the VL bug is statistically insignificant. Current cutechess-cli script
(Insert your working directory for the weights, and export the paths for the lczero executables first) |
const auto& cur = bh.cur(); | ||
const auto color = cur.side_to_move(); | ||
|
||
auto result = SearchResult{}; | ||
|
||
node->virtual_loss(); | ||
if (ndepth > m_maxdepth) { | ||
m_maxdepth = ndepth; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you actually want to make this truly threadsafe, it would need to be something like....
int cur_depth;
do { cur_depth = m_maxdepth; } while(ndepth > cur_depth && !m_maxdepth.compare_exchange_strong(cur_depth, ndepth);
But maybe we don't care that much?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, it probably isn't really thread safe at the moment. However, @killerducky indicates it won't be pulled as is anyway, he prefers the solution defining search depth on PV length since that is more compatible to the Tensorflow implementation.
I'm just keeping this open now until I post the last tuning tests with it, and then close it. Making additional tuning options available without UCI can be its own PR, and fixing the VL bug will probably have to wait a bit...
Tuning test result with Id 217:
FPU reduction of 0.1 at least for this net gives a sufficient benefit to compensate fixing the VL bug. I started a tuning run with Id 226 now (i.e. the final 10 block net) |
Results: av539 means that appveyor build number. It's 34.5 Elo worse on id228, but @mooskagh points out a bigger issue here: #317 (comment) FPU reduction is being done by mistake on the root node even when noise is on (implies training). This will make it more difficult for the Network to learn about low-policy moves. @jkiliani I think we should change the default fpu_reduction to 0.1 (best we have for now), and merge this. |
One more test result, for Id 226:
Here the bug also doesn't help playing strength, for puct=0.6. FPU reduction of 0.1 doesn't hurt compared to none, but does not help a lot either. |
I spoke with @jkiliani about this, we will do one more final test of FPU 0 vs 0.1 head to head on another net and then decide tomorrow. Edit: Going to test FPU 0.05 also. |
Accidentally had my test script stop after just 50 games, on Id 231:
I'm going to continue the match with this net until @killerducky wants to decide, but for the moment this is another (although weak) indication supporting cfg_fpu_reduction = 0.1. Current standing of the continued test is
which means that FPU reduction = 0.1 is still leading a lot in the aggregate score. But likely, any of 0.0, 0.05 and 0.1 would work fine. |
Let's go with 0.1. |
@jkiliani ok I am preparing a release and I noticed the depth tracking is wrong:
The PV is longer than the depth. Probably something to do with tree reuse? I'm going to revert the depth change so I can get v0.8 out quickly. |
* Revert depth calculation. See PR #466.
Adds recursive tracking for maximum search depth and replaces the current search depth output based on a log function of visits. Fixes the virtual loss bug on first play urgency. Activates USE_TUNER by default, and adds FPU reduction to parameters tuneable from command line.