Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cfg_fpu_dynamic_eval mode includes virtual loss #317

Closed
killerducky opened this issue Apr 14, 2018 · 8 comments
Closed

cfg_fpu_dynamic_eval mode includes virtual loss #317

killerducky opened this issue Apr 14, 2018 · 8 comments

Comments

@killerducky
Copy link
Collaborator

auto fpu_eval = (cfg_fpu_dynamic_eval ? get_eval(color) : net_eval) - fpu_reduction;

This code includes virtual losses in it. There should probably be a get_pure_eval(color) or an argument to get_eval(color, no_virtual_loss) that excludes them. This is making it unclear exactly what is going on because things depend on how many threads are going into the same move etc.

Not super urgent but we should probably clean this up before tuning other FPU related stuff.

Thanks to crem for finding this.

@killerducky
Copy link
Collaborator Author

@jkiliani note this bug effects even -t1 because after you select a node, a virtual loss is applied. Then when selecting the next node, get_eval is called on the current node, which now includes the virtual loss.

@jkiliani
Copy link
Contributor

jkiliani commented Apr 17, 2018

Since it increasingly looks like this bug actually gains instead of losing strength, I'm rather dubious about fixing it to be honest...

@jjoshua2
Copy link
Contributor

jjoshua2 commented Apr 17, 2018

I'm only ok with it, if it gains strength with all numbers of threads. We know it works with -t1, but does it fall apart when tcec uses -t43? Or even -t8 which is easy to test.

@jkiliani
Copy link
Contributor

jkiliani commented Apr 17, 2018

Very doubtful, I posted some debug output in dev channel today. The virtual losses massively reduce FPU for nodes far up the search tree, while leaving the FPU for root nodes unchanged. This is the case for all thread counts I looked at, though to be thorough, strength should be tested with multithreading for this.

@mooskagh
Copy link
Contributor

mooskagh commented May 1, 2018

While it may improve strength of play, it can hinder training.

Look at this position:
image

Network id230 evaluates the probability to move Re2 as 0.25%

Without this "virtual loss bug", it does the first visit on this subtree during playout 520, and at playout 810 that move becomes the best. (I expect 2-4 network generations for those moves to become most probable).

With "virtual loss bug" however the first visit to that subtree happens at playout 1501, so with 800 playout training, that move is trained with probability 0.00.

The same would happen with FPU reduction I guess, but hopefully we won't do FPU reduction in training games.

@jjoshua2
Copy link
Contributor

jjoshua2 commented May 1, 2018

FPU of 0.1 sometimes helps and sometimes hurts elo in my tests, but using .05 seems very conservative. and gaining Using noise and -t=1 should be enough for training to find tactics, so we should probably tune for elo on a variety of nets, instead of tactics puzzles.
But I do see the point that not getting a single playout will mean t=1 won't ever try it. I think this would be an argument for training with 1600 playouts once we stall at 800, or once we implement resign.

@killerducky
Copy link
Collaborator Author

@mooskagh FPU reduction is not done when noise is on (implies training) for the root node:

    // Lower the expected eval for moves that are likely not the best.
    // Do not do this if we have introduced noise at this node exactly
    // to explore more.
    if (!is_root || !cfg_noise) {
        fpu_reduction = cfg_fpu_reduction * std::sqrt(total_visited_policy);
    }

This combined with the noise should help the network discover these gaps in it's knowledge.

@killerducky
Copy link
Collaborator Author

Sorry I was focused on the fpu_reduction term:
fpu_eval = (cfg_fpu_dynamic_eval ? get_eval(color) : net_eval) - fpu_reduction

But you're pointing out the problem in the other term, get_eval. Yes this seems like a problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants