-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for Interleaved Double-Wide Sampling files from NAMD #135
Conversation
Well, looks like I broke the tests... I will look into this and update the PR. |
Codecov Report
@@ Coverage Diff @@
## master #135 +/- ##
==========================================
+ Coverage 97.78% 97.84% +0.06%
==========================================
Files 20 20
Lines 1128 1206 +78
Branches 236 256 +20
==========================================
+ Hits 1103 1180 +77
Misses 5 5
- Partials 20 21 +1
Continue to review full report at Codecov.
|
I'm going to submit a new dataset to alchemtest for this. |
@jhenin please submit a test set PR for alchemtest and ping me so that I can quickly review. |
(sorry, didn't mean to be pushy, just wanted to say that once you get to making a PR I am happy to quickly review it — just ping me to get my attention, please) |
@orbeckst No worries. This update should do the trick. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jhenin . I have an initial set of comments/questions.
Please also add an entry to CHANGES under 0.5.0 (your GitHub name and support for NAMD IDWS under Enhancements).
Thank you!
src/alchemlyb/parsing/namd.py
Outdated
# Mimic classic DWS data | ||
# Arbitrarily match up fwd and bwd comparison energies on the same times | ||
# truncate extra samples from whichever array is longer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor style: indent comments with inner block
src/alchemlyb/parsing/namd.py
Outdated
|
||
if lambda_idws is not None: | ||
# Mimic classic DWS data | ||
# Arbitrarily match up fwd and bwd comparison energies on the same times |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document in a Note section that the parser may discard data.
We just switched the CI to GH actions so I needed to merge master and kick it... should report status in a few minutes (and I also fixed a bug in alchemtest, which I think I introduced with my suggestion on your PR. Sorry about that.) |
I don’t know why codecov fails during the upload stage. We just changed the CI in #139 so there might be teething problems — maybe @dotsdl @xiki-tempula have any ideas? |
I will have a look |
Oops, sorry, codecov works just fine on the PR. I got confused with a codecov failure on @jhenin ‘s fork that I got a notification for. https://github.com/jhenin/alchemlyb/actions/runs/946402101 — please ignore me… it’s another 117F day and my brain is already mushy. (At least that’s my excuse.) |
Sorry, I only just saw that the GH workflows were waiting for admin approval — that's a new feature on GitHub to crack down on abuse of CI for bitcoin mining etc. Please don't hesitate to ping me if you need my attention. |
I resolved the conflict and update CHANGES (including your and @ttjoseph 's contribution). This started the CI, too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update, including the better test files. My major issues are (see comments)
- Replace any
print()
with a logger call (and awarnings.warn()
for cases when logging is not enabled).print
in library code is really annoying because you cannot nicely control the output from code that integrates the library. - Instead of returning
None
and making calling code choke on the return value, raise an exception right in the code where it happens. I assume that this is intended behavior (instead of having other code that works around theNone
). In any case, fail early and clearly and don't guess are principles that make code simpler and more robust, which is what alchemlyb is going for. - The test coverage needs to be increased. Basically, we want tests for any of the exceptions and corner cases that you are looking for. This is important to reduce the maintenance burden and code-debt down in the future (and to just make sure that the code really does what it was intended to do). You can use
mock
or generate a bad input file on the fly in the tests.
src/alchemlyb/parsing/namd.py
Outdated
@@ -9,14 +9,71 @@ | |||
|
|||
k_b = R_kJmol * kJ2kcal | |||
|
|||
def get_lambdas(fep_files): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this show up in the docs?
Build them locally with
python setup.py build_sphinx
and check.
(For some reason the RTD PR integration does not work #156.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are users supposed to use get_lambdas()
by themselves or would you consider this a "private" function of the module. If the latter, prefix with underscore and call it _get_lambdas()
and then we also don't have to worry about documenting for users and we don't have to maintain the interface rigorously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_lambdas() shouldn't be exposed to users. Our next PR will have the underscore prefix.
src/alchemlyb/parsing/namd.py
Outdated
|
||
.. versionchanged:: 0.5.0 | ||
The :mod:`scipy.constants` is used for parsers instead of | ||
the constants used by the corresponding MD engine. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a reason to delete this?
src/alchemlyb/parsing/namd.py
Outdated
if lambda1_at_start is not None \ | ||
and (lambda1, lambda2, lambda_idws) != (lambda1_at_start, lambda2_at_start, lambda_idws_at_start): | ||
print("namd.py: extract_u_nk: Error: Lambdas changed unexpectedly while processing", fep_file) | ||
print(f"namd.py: extract_u_nk: Error: l1, l2, lidws: {lambda1_at_start}, {lambda2_at_start}, {lambda_idws_at_start} changed to {lambda1}, {lambda2}, {lambda_idws}") | ||
print(f"namd.py: extract_u_nk: Error: fep_file = {fep_file}; has_idws = {has_idws}") | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs to be tested, check the coverage report (maybe the link https://app.codecov.io/gh/alchemistry/alchemlyb/compare/135/diff#diff-c3JjL2FsY2hlbWx5Yi9wYXJzaW5nL25hbWQucHk= works)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orbeckst That link gives me an "GitHub API rate limit error" message. I'm unfamiliar with Codecov, but I can at least add a test for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, yes, I get that, too, when I am not logged in with my GitHub account.
Look at the running tests and then click on the codecov actions (EDIT: the link under Details) (e.g., https://github.com/alchemistry/alchemlyb/pull/135/checks?check_run_id=3314785981 ) and then go from there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This leads me to https://app.codecov.io/gh/alchemistry/alchemlyb/compare/135/diff — when I tried this link in an anonynmous browser window without being logged in anywhere I could see everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I am not familiar with Codecov so I'll need to figure out how to tell it what my still-to-be-written new tests cover.
src/alchemlyb/parsing/namd.py
Outdated
|
||
# Make sure the lambda2 values are consistent | ||
if lambda1 in lambda_fwd_map and lambda_fwd_map[lambda1] != lambda2: | ||
print(f'namd.py: get_lambdas: Error: fwd: lambda1 {lambda1} has lambda2 {lambda_fwd_map[lambda1]} but it should be {lambda2}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not use print()
in library code. Use a logger for alchemlyb.parsers.NAMD (see what the Amber parser does
alchemlyb/src/alchemlyb/parsing/amber.py
Line 21 in b068776
logger = logging.getLogger("alchemlyb.parsers.Amber") |
warnings.warn
if necessary.
If this is really an error then shouldn't you raise an exception such as ValueError
?
src/alchemlyb/parsing/namd.py
Outdated
# Make sure the lambda_idws values are consistent | ||
if lambda_idws is not None: | ||
if lambda1 in lambda_bwd_map and lambda_bwd_map[lambda1] != lambda_idws: | ||
print(f'namd.py: get_lambdas: Error: bwd: lambda1 {lambda1} has lambda_idws {lambda_bwd_map[lambda1]} but it should be {lambda_idws}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use a logger instead of print
and shouldn't this raise a ValueError
, too?
Fail early!
src/alchemlyb/parsing/namd.py
Outdated
if lambda_idws is not None: | ||
if lambda1 in lambda_bwd_map and lambda_bwd_map[lambda1] != lambda_idws: | ||
print(f'namd.py: get_lambdas: Error: bwd: lambda1 {lambda1} has lambda_idws {lambda_bwd_map[lambda1]} but it should be {lambda_idws}') | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is that returning None
will silently discard data. It's cleaner to fail and have the user fix things.
src/alchemlyb/parsing/namd.py
Outdated
print("namd.py: extract_u_nk: Error: Lambdas changed unexpectedly while processing", fep_file) | ||
print(f"namd.py: extract_u_nk: Error: l1, l2, lidws: {lambda1_at_start}, {lambda2_at_start}, {lambda_idws_at_start} changed to {lambda1}, {lambda2}, {lambda_idws}") | ||
print(f"namd.py: extract_u_nk: Error: fep_file = {fep_file}; has_idws = {has_idws}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not use print()
in library code. Use a logger for alchemlyb.parsers.NAMD (see what the Amber parser does
alchemlyb/src/alchemlyb/parsing/amber.py
Line 21 in b068776
logger = logging.getLogger("alchemlyb.parsers.Amber") |
Instead of returning None
, raise the exception here; ValueError
could be appropriate unless you have another idea.
src/alchemlyb/parsing/namd.py
Outdated
parsing = True | ||
|
||
if len(win_de) != 0 or len(win_de_back) != 0: | ||
print('Warning: trailing data without footer line (\"#Free energy...\"). Interrupted run?') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not use print()
in library code. Use a logger for alchemlyb.parsers.NAMD (see what the Amber parser does
alchemlyb/src/alchemlyb/parsing/amber.py
Line 21 in b068776
logger = logging.getLogger("alchemlyb.parsers.Amber") |
warnings.warn
.
Please also address the comments raised in review above (e.g., on using the logger instead of |
@orbeckst My code now uses a logger and throws exceptions instead of returning None. I suppose the next step is for @jhenin to merge my PR to his repo? I will also submit a separate PR to the alchemistry/alchemtest repo containing a new test set so my new test will actually work. |
that is, Interleaved Double-Wide Sampling, by Brannigan and Hénin. I had to arbitrarily match up fwd and backward data as if it came from the same configurations, otherwise BAR found zero overlap. However I don't think that's a fundamental requirement (fwd and bwd data could come from altogether different simulations). This seems like an unnecessary limitation of alchemlyb, or more likely, the underlying pymbar.
Thank you for addressing my last set of comments. This looking excellent now. I am just waiting for CI to be all green. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions on code logic.
src/alchemlyb/parsing/namd.py
Outdated
if l1_idx > 0 and l1_idx < l2_idx: # Ascending lambdas | ||
lambda_idws_at_start = all_lambdas[l1_idx - 1] | ||
elif l2_idx < (len(all_lambdas) - 1) and l2_idx < l1_idx: # Descending lambdas | ||
lambda_idws_at_start = all_lambdas[l2_idx + 1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When should this conditional be triggered? Is this a common or a rare thing?
At the moment this case is not covered by tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would occur when we parse a FEP calculation result where lambda progresses 1 -> 0 over the set of windows, rather than (to me) the usual case of 0 -> 1. This bit of code infers what the lambda_idws likely is, in the event no lambda_idws is present in this .fepout file - which can happen when a NAMD run is interrupted.
Hm, I guess we'd want to test a 1 -> 0 FEP calculation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you as the expert say that this could happen in the wild then we want a test — yes, please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. In this case the easiest thing for me is to add another test set in alchemtest. The format of NAMD .fepout files that use IDWS makes it difficult to modify the existing test set on the fly as is done to trigger the other failure cases: lambda values are in both header and footer, but lambda_idws is not in footer. So swapping them around across fragments of .fepout files that would arise from an interrupted and restarted simulation would make the test code more complex than the code it's meant to test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you add it to alchemtest and ping me I'll review it as quickly as possible and try to get it in ASAP.
src/alchemlyb/parsing/namd.py
Outdated
# because NAMD only emits the '#NEW' line on timestep 0 for some reason | ||
if has_idws and lambda_idws_at_start is None: | ||
l1_idx, l2_idx = all_lambdas.index(lambda1), all_lambdas.index(lambda2) | ||
if l1_idx > 0 and l1_idx < l2_idx: # Ascending lambdas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This conditional has no else
clause. Is that ok or should it fail if neither if
nor elif
is entered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This handles the special case where there are IDWS energies but no lambda_idws value in the current .fepout file, which can happen when NAMD is interrupted. So the else
case is handled by the rest of this block, by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you then just add your explanation as a comment to the code, please?
PR alchemistry/alchemtest#57 was merged so the new dataset should now be available in tests. |
Hm, not sure what's going on here with this failed test (it works on my machine), but I'm investigating. |
This seems to be another bug in the test related to filesystem file ordering. On my machine they happen to be read in lexicographic order by filename but not on GitHub CI. Another commit pushed...thanks in advance, @jhenin |
Does the new dataset contain two different problems instead of only one (lambda reversal)? |
Please ping me when you need me to review again, @ttjoseph . |
It also includes interruptions and restarts, as does the "restarted" dataset. |
I had a quick look and I think you are there. Before I do a proper review later today when I have more time could you please add two |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The coverage report says that the line looking at reversed lambdas is not executed even though this was the intention of the new dataset (if I understood it correctly). Can you please look into it?
(For the continue block, just add my suggestion for a #pragma: no cover
and that will be ok.)
src/alchemlyb/parsing/namd.py
Outdated
if l1_idx > 0 and l1_idx < l2_idx: # Ascending lambdas | ||
lambda_idws_at_start = all_lambdas[l1_idx - 1] | ||
elif l2_idx < (len(all_lambdas) - 1) and l2_idx < l1_idx: # Descending lambdas | ||
lambda_idws_at_start = all_lambdas[l2_idx + 1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't the latest test supposed to hit this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noticed this problem as well. Yes, but it doesn't because I already sorted the lambdas in reverse order above, so only the first clause in the if
above is necessary, and inferring lambda_idws as the lambda "before" lambda1 works in forward and reverse lambda orders.
I will also add another test case a bit later today: if l1_idx == 0 this means the first window both had IDWS data and was incomplete and so there is no way to infer what the user hoped lambda_idws would be. This is a flagrantly pathological case but still it should be tested for.
I've added one (last?) pragma no cover as a PR to @jhenin's tree |
No description provided.