-
Notifications
You must be signed in to change notification settings - Fork 567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stdev ownership #500
Stdev ownership #500
Conversation
Sorry, I will check the code again. I implemented it in the old version of KataGo, and ported it to the latest one then refactored. |
The bug seemed due to numerical errors between avg((x[i] - avg(x))^2) and avg(x[i]^2) - avg(x)^2. I will think about adding test code, but for it I need to understand KataGo test framework. |
For the numerical errors, it should be rare that they are large, can you revert your change and instead just make it ignore negative values? The only case where the value should be negative is when the variance is very very close 0 anyways. |
Your screenshot is very nice! |
I had did it(make it ignore negative values) at first but actually the error was quite large, possibly order of 0.01. |
Calling getAverageTreeOwnershipHelper with same arguments twice sequentially returns different results. |
If you're doing it during something like kata-analyze or you're doing the analysis engine's ongoing reporting feature, yes. Does this explain why you got large errors? Maybe it is not because there was a numerical problem, it is because the tree was changing at the same time. In that case, you need to compute both simultaneously to have it be self-consistent. |
I agree with you. It is probably because the inconsistency of trees between means and standard deviations. I found "Welford's online algorithm" to calculate mean and deviation in a single pass, but it may be hard for me to fit it to your algorithm. We may need only trends of stdev, not accurate value, since our target will be visualization, so I will restore the pull request to the first version with filtering negative values. |
I recommend computing x and x^2 in a single pass, and then still doing E(x^2) - E(x)^2 at the end. So this is NOT Welford's algorithm, but doing it like this in a single pass will prevent inconsistency from concurrent updating of the tree while ownership is being computed. Welford's algorithm is not important at all here. Basically, it prevents you from losing precision due to the final subtraction of E(x^2) - E(x)^2, but in our case since E(x) is around unit scale, this cancellation can limit our precision to be on the order of 10^-16 (the precision of a float around unit scale). This is still more than enough precision. When it actually matters is when you are trying to compute the variance of something where E(x) is significantly nonzero. For example, something with unit variance where the mean is 10^6. Then, in that case, you will be subtracting values that are around 10^12 from each other, which means you only have accuracy down to the order of 10^-4 left in your variance, which when square rooted gives you 10^-2, e.g. you only have 1 or 2 decimal digits of precision relative to your variance which is of order 1. But in KataGo, the mean is never that extreme, so fancy numerically-stable variance estimation is pretty much pointless. The only thing that matters is making sure the values are consistent and simultaneously computed to be robust to concurrent search tree update. |
So the algorithm would be: compute E(x) and E(x^2) in one pass. Then compute variance as max(0, E(x^2) - E(x)^2), then take square root to get stdev. |
You lead me to the clear way! I pushed current version. |
Changing getAverageTreeOwnershipHelper and getAverageAndStandardDeviationTreeOwnershipHelper to traverseTreeWtihOwnershipAndSelfWeight and passing averaging functions as lambda expressions may simplify the code. |
…shipHelper and getAverageAndStandardDeviationTreeOwnershipHelper
I pushed traverseTreeWtihOwnershipAndSelfWeight version. |
* input: `kata-analyze ... ownership true ownershipStdev true` * output: `info ... info ... ownership ... ownershipStdev ...`
Thanks for an interesting feature. I'm trying this locally. As for the output format, the following extension is natural for me. (current)
(extended)
https://github.com/kaorahi/KataGo/tree/stdev_ownership_my1 |
@kaorahi san, Thank you for your implementation for GTP command! |
I have tried this feature for a month and I still enjoy it. It looks like a heatmap of "KataGo's eye tracking". |
Finally merged, will be part of for upcoming release. I also made some minor typo fixes and added it to analysis engine and updated all the GTP docs. Thanks for the implementation! |
Hi.
Here is a pull request of standard deviation of ownership.
Could you check the code?
(I am not quite sure that the method name "getStandardDeviationTreeOwnership" is right...)
I hope that the template does not make performance down.
(I also changed the output format of kata-analyze on gtp mode in my private code, but I omitted it in this request since it needs larger decision.)