-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add R_c (convergence fraction) convergence analysis #239
Conversation
Codecov Report
@@ Coverage Diff @@
## master #239 +/- ##
==========================================
+ Coverage 98.65% 98.68% +0.02%
==========================================
Files 26 26
Lines 1711 1750 +39
Branches 352 359 +7
==========================================
+ Hits 1688 1727 +39
Misses 3 3
Partials 20 20
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
@orbeckst Do you mind review this PR, please?
This loop checks of the last |
I might be able to look at it near the end of the week. Sorry, a bit busy. @VOD555 , could you please review this PR? It's your method after all and you know it best. Although you cannot block or approve in alchemlyb, just state in your review what — if anything — needs to be done. I'll then use your review as a basis. Thank you! |
if all(conv[i:]): | ||
return i / length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commenting on #239 (comment) : How about adding
else: # pragma: no cover
# This loop checks of the last i points in conv are true. conv is true when the backward
# and the forward are all with a certain distance to the end point. Given that the last forward
# point must be the same as the last backward point as they both cover the whole trajectory,
# conv[-1] is always true and thus, the else part of this loop is never executed.
raise AssertionError("Reached forbidden code path in convergence.R_c")
to make clear that the else part is never executed. Probably shorten the comment a bit... I just copied and pasted.
conv = ((g_forward>g-diff) & (g_forward<g+diff)) & ((g_backward>g-diff)&(g_backward<g+diff)) | ||
for i in range(out_length): | ||
if all(conv[i:]): | ||
return i / length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to also return g_forward
and g_backward
so that one can draw the convergence plot with these two series.
@VOD555 Thanks for the review. I have returned the forward and backward as dataframe now. |
@xiki-tempula sorry, I'll look at this PR next week; I pretty much burned my "time to work on alchemlyb" on looking at all the other PRs. @VOD555 , could you please have another look and review? Thank you!! |
@xiki-tempula could you please resolve the conflicts with all the other recently merged PRs? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks very good. I have some questions and would also like @VOD555 's input.
For consistency I'd keep input in
|
||
# Final value | ||
g = g_forward[-1] | ||
conv = ((g_forward>g-tol) & (g_forward<g+tol)) & ((g_backward>g-tol)&(g_backward<g+tol)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this faster than abs?
The output will the float A_c. A_c is a number between 0 and 1 that can be | ||
interpreted as the ratio of the total equilibrated simulation time to the | ||
whole simulation time for a full set of simulations. A_c = 1 means that all | ||
simulation time frames in all windows can be considered equilibrated, while | ||
A_c = 0 indicates that nothing is equilibrated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use math mode for
:func:`~alchemlyb.preprocessing.subsampling.decorrelate_u_nk` or | ||
:func:`~alchemlyb.preprocessing.subsampling.decorrelate_dhdl`. | ||
|
||
The output will the float R_c. R_c = 0 indicates that the system is well |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
math mode
|
||
Note | ||
---- | ||
This function tries to compute R_c from equation 18 from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function computes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really mean R_c or should it be A_c.
I'd also put a SeeAlso to R_c and state that the R_c are calculated for the whole set using R_c()
.
precision : float | ||
The precision of the output R_c. | ||
diff : float | ||
Tolerance of the convergence check in kcal/mol. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in kT ?
@orbeckst Thanks very much for the review. Sorry for the delay. I have addressed all the issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minor doc things, otherwise looks good.
@VOD555 do you want to take a last look?
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
@orbeckst Thanks for the review. I have checked that the reference could be resolved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing all my nitpicks, much appreciated.
I made one change to the docs (wasn't in your edits) and I am now waiting for CI. |
Thank you @xiki-tempula , a very welcome contribution! |
I added @VOD555 as a "co-authored" by to the squash-merge because you said that you used his code as an initial basis and because he participated in the review. I put an extra note in the commit message that this version has your performance improvement included. |
Fix #104
I have reimplemented the R_c implemented by @VOD555.
I have also implemented the A_c based on my understanding of the paper.
I have done a speed check, for a 32 window simulation where each window has 10,000 data points.
Takes 0.01 second, so I think the performance is satisfying.
Includes commits from