cfg_fpu_dynamic_eval mode includes virtual loss #317
Comments
@jkiliani note this bug affects even
Since it increasingly looks like this bug actually gains strength rather than losing it, I'm rather dubious about fixing it, to be honest...
I'm only OK with it if it gains strength at every thread count. We know it works with -t1, but does it fall apart when TCEC uses -t43? Or even -t8, which is easy to test.
Very doubtful; I posted some debug output in the dev channel today. The virtual losses massively reduce FPU for nodes far up the search tree, while leaving the FPU for root nodes unchanged. This is the case for all thread counts I looked at, though to be thorough, strength should also be tested with multithreading.
While it may improve strength of play, it can hinder training. Network id230 evaluates the probability to move Without this "virtual loss bug", it does the first visit on this subtree during playout 520, and at playout 810 that move becomes the best. (I expect 2-4 network generations for those moves to become most probable.) With the "virtual loss bug", however, the first visit to that subtree happens at playout 1501, so with 800-playout training, that move is trained with probability 0.00. The same would happen with FPU reduction, I guess, but hopefully we won't do FPU reduction in training games.
FPU of 0.1 sometimes helps and sometimes hurts Elo in my tests, but using 0.05 seems very conservative. and gaining Using noise and -t=1 should be enough for training to find tactics, so we should probably tune for Elo on a variety of nets, instead of on tactics puzzles.
@mooskagh FPU reduction is not done when noise is on (implies training) for the root node:
This, combined with the noise, should help the network discover these gaps in its knowledge.
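The rule described in this comment can be sketched roughly as follows. This is only an illustration, not the project's code: the function name `fpu_reduction_for` and its parameters `is_root`, `noise_enabled`, and `base_reduction` are hypothetical, and the reduction is assumed to be a plain constant.

```cpp
// Hypothetical helper (not the project's real function): returns the FPU
// reduction to subtract when scoring unvisited children of a node.
// Assumption: the reduction is a plain constant, `base_reduction`.
double fpu_reduction_for(bool is_root, bool noise_enabled, double base_reduction) {
    // With noise on (which implies training), the root gets no reduction,
    // so unvisited root moves are not penalized before their first visit.
    if (is_root && noise_enabled) {
        return 0.0;
    }
    // Every other node, and the root when noise is off, uses the full value.
    return base_reduction;
}
```

Under these assumptions, only the root in a noisy (training) search is exempt; nodes below the root still get the reduction.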
Sorry, I was focused on the fpu_reduction term: But you're pointing out the problem in the other term, get_eval. Yes, this seems like a problem.
auto fpu_eval = (cfg_fpu_dynamic_eval ? get_eval(color) : net_eval) - fpu_reduction;
get_eval(color) includes virtual losses. There should probably be a get_pure_eval(color), or an argument to get_eval(color, no_virtual_loss), that excludes them. As it stands, it is unclear exactly what is going on, because the result depends on how many threads are searching the same move, etc.
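A minimal sketch of the proposed split, assuming a simplified node that stores its eval sum from the mover's point of view. `NodeStats`, `eval_with_virtual_loss`, `pure_eval`, and the `exclude_virtual_loss` flag are hypothetical names for illustration; the real node class and get_eval signature may differ.

```cpp
#include <cassert>

// Hypothetical stand-in for a search-tree node; not the project's real class.
struct NodeStats {
    int visits;          // completed visits
    int virtual_losses;  // in-flight visits from other threads
    double eval_sum;     // sum of evals in [0, 1], from the mover's point of view
};

// Eval as the current code is described: every pending visit is counted as
// one extra visit that scored 0 (a loss), which pulls the average down.
double eval_with_virtual_loss(const NodeStats& n) {
    const int total = n.visits + n.virtual_losses;
    assert(total > 0);
    return n.eval_sum / total;
}

// "Pure" eval: completed visits only, independent of thread activity.
double pure_eval(const NodeStats& n) {
    assert(n.visits > 0);
    return n.eval_sum / n.visits;
}

// FPU eval mirroring the expression in the issue, with a switch to choose
// whether the dynamic eval sees virtual losses.
double fpu_eval(const NodeStats& parent, double net_eval, double fpu_reduction,
                bool dynamic_eval, bool exclude_virtual_loss) {
    double base = net_eval;
    if (dynamic_eval) {
        base = exclude_virtual_loss ? pure_eval(parent)
                                    : eval_with_virtual_loss(parent);
    }
    return base - fpu_reduction;
}
```

For example, a node with 10 visits, an eval sum of 6.0, and 2 virtual losses has a pure eval of 0.6 but a virtual-loss eval of 0.5. The fewer completed visits a node has, the more each pending visit distorts the value.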
Not super urgent but we should probably clean this up before tuning other FPU related stuff.
Thanks to crem for finding this.