Support for Interleaved Double-Wide Sampling files from NAMD #135

jhenin · 2021-06-04T17:49:23Z

No description provided.

jhenin · 2021-06-04T22:55:47Z

Well, looks like I broke the tests... I will look into this and update the PR.

codecov · 2021-06-04T23:32:48Z

Codecov Report

Merging #135 (26f3eae) into master (e280ec4) will increase coverage by 0.06%.
The diff coverage is 98.98%.

@@            Coverage Diff             @@
##           master     #135      +/-   ##
==========================================
+ Coverage   97.78%   97.84%   +0.06%     
==========================================
  Files          20       20              
  Lines        1128     1206      +78     
  Branches      236      256      +20     
==========================================
+ Hits         1103     1180      +77     
  Misses          5        5              
- Partials       20       21       +1

Impacted Files	Coverage Δ
src/alchemlyb/parsing/namd.py	`99.11% <98.98%> (-0.89%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e280ec4...26f3eae. Read the comment docs.

jhenin · 2021-06-05T07:54:15Z

I'm going to submit a new dataset to alchemtest for this.

orbeckst · 2021-06-10T04:27:56Z

@jhenin please submit a test set PR for alchemtest and ping me so that I can quickly review.

orbeckst · 2021-06-10T04:28:56Z

(sorry, didn't mean to be pushy, just wanted to say that once you get to making a PR I am happy to quickly review it — just ping me to get my attention, please)

jhenin · 2021-06-11T09:30:00Z

@orbeckst No worries. This update should do the trick.

orbeckst

Thanks @jhenin . I have an initial set of comments/questions.

Please also add an entry to CHANGES under 0.5.0 (your GitHub name and support for NAMD IDWS under Enhancements).

Thank you!

orbeckst · 2021-06-15T18:54:49Z

src/alchemlyb/parsing/namd.py

+                    # Mimic classic DWS data
+                    # Arbitrarily match up fwd and bwd comparison energies on the same times
+                    # truncate extra samples from whichever array is longer


minor style: indent comments with inner block

orbeckst · 2021-06-15T19:00:12Z

src/alchemlyb/parsing/namd.py

+
+                    if lambda_idws is not None:
+                    # Mimic classic DWS data
+                    # Arbitrarily match up fwd and bwd comparison energies on the same times


Document in a Note section that the parser may discard data.
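The truncation described in the code comments above can be sketched as follows. This is a minimal illustration, not the code from the PR: `pair_idws_samples` is a hypothetical helper, and plain lists stand in for whatever arrays the parser really uses.

```python
def pair_idws_samples(fwd, bwd):
    """Pair forward and backward energy differences sample by sample.

    Hypothetical helper: the longer sequence is truncated so that both
    have the same length, which means trailing samples are discarded.
    """
    n = min(len(fwd), len(bwd))
    return fwd[:n], bwd[:n]


fwd = [0.1, 0.2, 0.3, 0.4]
bwd = [-0.1, -0.2, -0.3]
fwd_paired, bwd_paired = pair_idws_samples(fwd, bwd)
print(len(fwd_paired), len(bwd_paired))  # 3 3
```

The last forward sample is silently dropped here, which is the behavior a Note section would need to mention.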

orbeckst · 2021-06-16T19:00:37Z

We just switched the CI to GH actions so I needed to merge master and kick it... should report status in a few minutes (and I also fixed a bug in alchemtest, which I think I introduced with my suggestion on your PR. Sorry about that.)

orbeckst · 2021-06-17T15:26:11Z

I don’t know why codecov fails during the upload stage. We just changed the CI in #139 so there might be teething problems — maybe @dotsdl @xiki-tempula have any ideas?

xiki-tempula · 2021-06-17T15:29:16Z

I will have a look

orbeckst · 2021-06-17T15:29:25Z

Oops, sorry, codecov works just fine on the PR. I got confused with a codecov failure on @jhenin's fork that I got a notification for. https://github.com/jhenin/alchemlyb/actions/runs/946402101 Please ignore me… it’s another 117F day and my brain is already mushy. (At least that’s my excuse.)

orbeckst · 2021-07-13T17:49:31Z

Sorry, I only just saw that the GH workflows were waiting for admin approval — that's a new feature on GitHub to crack down on abuse of CI for bitcoin mining etc. Please don't hesitate to ping me if you need my attention.

jhenin · 2021-08-12T17:00:13Z

@orbeckst I merged the PR from @ttjoseph - can you please unleash the CI tests?

orbeckst · 2021-08-12T18:10:23Z

I resolved the conflict and updated CHANGES (including your and @ttjoseph's contribution). This started the CI, too.

orbeckst · 2021-08-12T18:11:55Z

@jhenin and @ttjoseph please add yourselves to AUTHORS.

orbeckst

Thanks for the update, including the better test files. My major issues are (see comments):

Replace any print() with a logger call (and a warnings.warn() for cases when logging is not enabled). print() in library code is really annoying because you cannot nicely control the output from code that integrates the library.
Instead of returning None and making calling code choke on the return value, raise an exception right in the code where it happens. I assume that this is intended behavior (instead of having other code that works around the None). In any case, "fail early and clearly" and "don't guess" are principles that make code simpler and more robust, which is what alchemlyb is going for.
The test coverage needs to be increased. Basically, we want tests for any of the exceptions and corner cases that you are looking for. This is important to reduce the maintenance burden and code debt in the future (and to just make sure that the code really does what it was intended to do). You can use mock or generate a bad input file on the fly in the tests.
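The second and third points can be illustrated together with a small sketch. It does not use the NAMD format or the alchemlyb parser: `read_lambda_pair` and `write_temp` are hypothetical names, and the input is a toy format with one "lambda1 lambda2" pair per line. The sketch raises as soon as the lambdas change and then exercises that path with a bad file generated on the fly.

```python
import os
import tempfile


def read_lambda_pair(path):
    """Toy parser (not the alchemlyb one): each line holds 'lambda1 lambda2'.

    Raises ValueError as soon as the pair changes within the file,
    instead of returning None and leaving the caller to guess.
    """
    first = None
    with open(path) as handle:
        for lineno, line in enumerate(handle, start=1):
            if not line.strip():
                continue
            lambda1, lambda2 = (float(x) for x in line.split())
            if first is None:
                first = (lambda1, lambda2)
            elif (lambda1, lambda2) != first:
                raise ValueError(
                    f"{path}:{lineno}: lambdas changed from {first} "
                    f"to {(lambda1, lambda2)}"
                )
    if first is None:
        raise ValueError(f"{path}: no lambda values found")
    return first


def write_temp(text):
    """Write text to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    return path


good = write_temp("0.0 0.1\n0.0 0.1\n")
bad = write_temp("0.0 0.1\n0.1 0.2\n")  # corrupted on purpose
try:
    print(read_lambda_pair(good))
    try:
        read_lambda_pair(bad)
    except ValueError as err:
        print("rejected:", err)
finally:
    os.remove(good)
    os.remove(bad)
```

In a pytest suite the inner try/except would normally be written as a `with pytest.raises(ValueError):` block.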

orbeckst · 2021-08-12T18:21:58Z

src/alchemlyb/parsing/namd.py

@@ -9,14 +9,71 @@

 k_b = R_kJmol * kJ2kcal

+def get_lambdas(fep_files):


Does this show up in the docs?

Build them locally with

python setup.py build_sphinx

and check.

(For some reason the RTD PR integration does not work #156.)

Are users supposed to use get_lambdas() by themselves, or would you consider this a "private" function of the module? If the latter, prefix it with an underscore and call it _get_lambdas(); then we don't have to worry about documenting it for users and we don't have to maintain the interface rigorously.

get_lambdas() shouldn't be exposed to users. Our next PR will have the underscore prefix.

orbeckst · 2021-08-12T18:25:44Z

src/alchemlyb/parsing/namd.py

-
-    .. versionchanged:: 0.5.0
-        The :mod:`scipy.constants` is used for parsers instead of
-        the constants used by the corresponding MD engine.
-


Was there a reason to delete this?

orbeckst · 2021-08-12T18:38:24Z

src/alchemlyb/parsing/namd.py

+                    if lambda1_at_start is not None \
+                        and (lambda1, lambda2, lambda_idws) != (lambda1_at_start, lambda2_at_start, lambda_idws_at_start):
+                        print("namd.py: extract_u_nk: Error: Lambdas changed unexpectedly while processing", fep_file)
+                        print(f"namd.py: extract_u_nk: Error: l1, l2, lidws: {lambda1_at_start}, {lambda2_at_start}, {lambda_idws_at_start} changed to {lambda1}, {lambda2}, {lambda_idws}")
+                        print(f"namd.py: extract_u_nk: Error: fep_file = {fep_file}; has_idws = {has_idws}")
+                        return None


Needs to be tested, check the coverage report (maybe the link https://app.codecov.io/gh/alchemistry/alchemlyb/compare/135/diff#diff-c3JjL2FsY2hlbWx5Yi9wYXJzaW5nL25hbWQucHk= works)

@orbeckst That link gives me a "GitHub API rate limit error" message. I'm unfamiliar with Codecov, but I can at least add a test for this.

Hm, yes, I get that, too, when I am not logged in with my GitHub account.

Look at the running tests and then click on the codecov actions (EDIT: the link under Details) (e.g., https://github.com/alchemistry/alchemlyb/pull/135/checks?check_run_id=3314785981 ) and then go from there?

This leads me to https://app.codecov.io/gh/alchemistry/alchemlyb/compare/135/diff. When I tried this link in an anonymous browser window without being logged in anywhere, I could see everything.

Thanks. I am not familiar with Codecov so I'll need to figure out how to tell it what my still-to-be-written new tests cover.

orbeckst · 2021-08-12T18:47:38Z

src/alchemlyb/parsing/namd.py

+
+                # Make sure the lambda2 values are consistent
+                if lambda1 in lambda_fwd_map and lambda_fwd_map[lambda1] != lambda2:
+                    print(f'namd.py: get_lambdas: Error: fwd: lambda1 {lambda1} has lambda2 {lambda_fwd_map[lambda1]} but it should be {lambda2}')


Do not use print() in library code. Use a logger for alchemlyb.parsers.NAMD and issue a warnings.warn if necessary. See what the Amber parser does in alchemlyb/src/alchemlyb/parsing/amber.py, line 21 in b068776:

logger = logging.getLogger("alchemlyb.parsers.Amber")

If this is really an error then shouldn't you raise an exception such as ValueError?
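A sketch of what this suggestion could look like for the consistency check. The logger name is the one proposed in the comment; `record_forward_lambda` is a hypothetical helper, not the function in the PR.

```python
import logging

logger = logging.getLogger("alchemlyb.parsers.NAMD")


def record_forward_lambda(lambda_fwd_map, lambda1, lambda2):
    """Hypothetical helper: remember lambda1 -> lambda2, failing on a conflict."""
    if lambda1 in lambda_fwd_map and lambda_fwd_map[lambda1] != lambda2:
        message = (
            f"fwd: lambda1 {lambda1} has lambda2 {lambda_fwd_map[lambda1]} "
            f"but it should be {lambda2}"
        )
        logger.error(message)      # applications can route or silence this
        raise ValueError(message)  # fail early instead of returning None
    lambda_fwd_map[lambda1] = lambda2


fwd_map = {}
record_forward_lambda(fwd_map, 0.0, 0.1)
record_forward_lambda(fwd_map, 0.0, 0.1)  # same pair again: accepted
try:
    record_forward_lambda(fwd_map, 0.0, 0.2)
except ValueError as err:
    print("ValueError:", err)
```

Unlike print(), the logger output can be configured by the calling application, and the exception stops processing at the point where the inconsistency is found.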

orbeckst · 2021-08-12T18:48:44Z

src/alchemlyb/parsing/namd.py

+                # Make sure the lambda_idws values are consistent
+                if lambda_idws is not None:
+                    if lambda1 in lambda_bwd_map and lambda_bwd_map[lambda1] != lambda_idws:
+                        print(f'namd.py: get_lambdas: Error: bwd: lambda1 {lambda1} has lambda_idws {lambda_bwd_map[lambda1]} but it should be {lambda_idws}')


Use a logger instead of print and shouldn't this raise a ValueError, too?

Fail early!

orbeckst · 2021-08-12T18:49:33Z

src/alchemlyb/parsing/namd.py

+                if lambda_idws is not None:
+                    if lambda1 in lambda_bwd_map and lambda_bwd_map[lambda1] != lambda_idws:
+                        print(f'namd.py: get_lambdas: Error: bwd: lambda1 {lambda1} has lambda_idws {lambda_bwd_map[lambda1]} but it should be {lambda_idws}')
+                        return None


My concern is that returning None will silently discard data. It's cleaner to fail and have the user fix things.

orbeckst · 2021-08-12T18:50:46Z

src/alchemlyb/parsing/namd.py

+                        print("namd.py: extract_u_nk: Error: Lambdas changed unexpectedly while processing", fep_file)
+                        print(f"namd.py: extract_u_nk: Error: l1, l2, lidws: {lambda1_at_start}, {lambda2_at_start}, {lambda_idws_at_start} changed to {lambda1}, {lambda2}, {lambda_idws}")
+                        print(f"namd.py: extract_u_nk: Error: fep_file = {fep_file}; has_idws = {has_idws}")


Do not use print() in library code. Use a logger for alchemlyb.parsers.NAMD. See what the Amber parser does in alchemlyb/src/alchemlyb/parsing/amber.py, line 21 in b068776:

logger = logging.getLogger("alchemlyb.parsers.Amber")

Instead of returning None, raise the exception here; ValueError could be appropriate unless you have another idea.

orbeckst · 2021-08-12T18:51:35Z

src/alchemlyb/parsing/namd.py

+                    parsing = True
+
+    if len(win_de) != 0 or len(win_de_back) != 0:
+        print('Warning: trailing data without footer line (\"#Free energy...\"). Interrupted run?')


Do not use print() in library code. Use a logger for alchemlyb.parsers.NAMD and issue a warnings.warn. See what the Amber parser does in alchemlyb/src/alchemlyb/parsing/amber.py, line 21 in b068776:

logger = logging.getLogger("alchemlyb.parsers.Amber")
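For a non-fatal condition such as trailing data, the combination of logger and warnings.warn might look like this sketch. `warn_trailing_data` is a hypothetical helper; only the variable names `win_de` and `win_de_back` and the message come from the diff above.

```python
import logging
import warnings

logger = logging.getLogger("alchemlyb.parsers.NAMD")


def warn_trailing_data(win_de, win_de_back):
    """Hypothetical helper: report leftover samples that lack a footer line.

    Returns True if a warning was issued.
    """
    if len(win_de) != 0 or len(win_de_back) != 0:
        message = ('Trailing data without footer line ("#Free energy..."). '
                   "Interrupted run?")
        logger.warning(message)  # visible when logging is configured
        warnings.warn(message)   # visible even when it is not
        return True
    return False


# Callers (and tests) can capture the warning instead of scraping stdout.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_trailing_data([0.3, 0.4], [])
print(len(caught))  # 1
```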

orbeckst · 2021-08-27T20:17:57Z

@ttjoseph, once you have jhenin#3 merged, I'll see how this all looks here, in particular the test coverage, which cannot be reported cleanly on the forked repo.

orbeckst · 2021-08-27T20:19:10Z

Please also address the comments raised in review above (e.g., on using the logger instead of print()). If you have questions please ask, we're happy to help.

ttjoseph · 2021-08-27T20:26:15Z

> Please also address the comments raised in review above (e.g., on using the logger instead of print()). If you have questions please ask, we're happy to help.

@orbeckst My code now uses a logger and throws exceptions instead of returning None. I suppose the next step is for @jhenin to merge my PR to his repo?

I will also submit a separate PR to the alchemistry/alchemtest repo containing a new test set so my new test will actually work.

orbeckst · 2021-08-27T20:32:02Z

Yes & yes: Once your PR jhenin#3 is merged, the CI here should run. Given that it needs new test files, you'll need to get the PR into alchemtest before this PR can be merged. When you open the alchemtest PR, please reference this PR #135 and ping me so that we can move it along briskly.

that is, Interleaved Double-Wide Sampling, by Brannigan and Hénin. I had to arbitrarily match up fwd and backward data as if it came from the same configurations, otherwise BAR found zero overlap. However I don't think that's a fundamental requirement (fwd and bwd data could come from altogether different simulations). This seems like an unnecessary limitation of alchemlyb, or more likely, the underlying pymbar.

orbeckst · 2021-10-04T21:03:38Z

Thank you for addressing my last set of comments. This is looking excellent now. I am just waiting for CI to be all green.

orbeckst · 2021-10-04T21:06:12Z

~~@jhenin @ttjoseph can you please also add your names to AUTHORS?~~ — IGNORE ME, that's in the PR already!

I'll have a look at docs and coverage.

orbeckst

Some questions on code logic.

orbeckst · 2021-10-04T21:09:56Z

src/alchemlyb/parsing/namd.py

+                        if l1_idx > 0 and l1_idx < l2_idx: # Ascending lambdas
+                            lambda_idws_at_start = all_lambdas[l1_idx - 1]
+                        elif l2_idx < (len(all_lambdas) - 1) and l2_idx < l1_idx: # Descending lambdas
+                            lambda_idws_at_start = all_lambdas[l2_idx + 1]


When should this conditional be triggered? Is this a common or a rare thing?

At the moment this case is not covered by tests.

This would occur when we parse a FEP calculation result where lambda progresses 1 -> 0 over the set of windows, rather than (to me) the usual case of 0 -> 1. This bit of code infers what the lambda_idws likely is, in the event no lambda_idws is present in this .fepout file - which can happen when a NAMD run is interrupted.

Hm, I guess we'd want to test a 1 -> 0 FEP calculation?

If you as the expert say that this could happen in the wild then we want a test — yes, please.

OK. In this case the easiest thing for me is to add another test set in alchemtest. The format of NAMD .fepout files that use IDWS makes it difficult to modify the existing test set on the fly as is done to trigger the other failure cases: lambda values are in both header and footer, but lambda_idws is not in footer. So swapping them around across fragments of .fepout files that would arise from an interrupted and restarted simulation would make the test code more complex than the code it's meant to test.

If you add it to alchemtest and ping me I'll review it as quickly as possible and try to get it in ASAP.

orbeckst · 2021-10-04T21:11:24Z

src/alchemlyb/parsing/namd.py

+                    # because NAMD only emits the '#NEW' line on timestep 0 for some reason
+                    if has_idws and lambda_idws_at_start is None:
+                        l1_idx, l2_idx = all_lambdas.index(lambda1), all_lambdas.index(lambda2)
+                        if l1_idx > 0 and l1_idx < l2_idx: # Ascending lambdas


This conditional has no else clause. Is that ok or should it fail if neither if nor elif is entered?

This handles the special case where there are IDWS energies but no lambda_idws value in the current .fepout file, which can happen when NAMD is interrupted. So the else case is handled by the rest of this block, by default.

Can you then just add your explanation as a comment to the code, please?

ttjoseph · 2021-10-06T20:26:29Z

@orbeckst OK, I've got more changes for @jhenin to pull that should address these comments, in conjunction with my PR for alchemtest.

orbeckst · 2021-10-06T23:58:14Z

PR alchemistry/alchemtest#57 was merged so the new dataset should now be available in tests.

ttjoseph · 2021-10-07T14:25:08Z

Hm, not sure what's going on here with this failed test (it works on my machine), but I'm investigating.

ttjoseph · 2021-10-07T15:01:59Z

This seems to be another bug in the test related to filesystem file ordering. On my machine they happen to be read in lexicographic order by filename but not on GitHub CI. Another commit pushed...thanks in advance, @jhenin
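One way to avoid depending on filesystem ordering is to sort the file names explicitly before processing them. The sketch below is an assumption about the general approach, not the fix that was committed; `list_fepout_files` and the file names are made up for illustration.

```python
import os
import tempfile


def list_fepout_files(directory):
    """Return .fepout paths in deterministic (lexicographic) order.

    os.listdir() returns entries in arbitrary order, so sort explicitly.
    """
    names = [n for n in os.listdir(directory) if n.endswith(".fepout")]
    return [os.path.join(directory, n) for n in sorted(names)]


with tempfile.TemporaryDirectory() as tmp:
    for name in ("win_b.fepout", "win_a.fepout", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    print([os.path.basename(p) for p in list_fepout_files(tmp)])
    # ['win_a.fepout', 'win_b.fepout']
```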

orbeckst · 2021-10-07T15:04:26Z

Does the new dataset contain two different problems instead of only one (lambda reversal)?

orbeckst · 2021-10-07T15:05:59Z

Please ping me when you need me to review again, @ttjoseph .

ttjoseph · 2021-10-07T15:07:55Z

> Does the new dataset contain two different problems instead of only one (lambda reversal)?

It also includes interruptions and restarts, as does the "restarted" dataset.

orbeckst · 2021-10-07T15:29:26Z

I had a quick look and I think you are there. Before I do a proper review later today when I have more time could you please add two # pragma: no cover to the two blocks (else: continue and elif: lambda…) that are not tested? Together with your comments this will make clear what’s happening here. Thank you!
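For readers unfamiliar with the marker: coverage.py excludes a line carrying `# pragma: no cover` from its report, and if that line introduces a block, the whole block is excluded. A toy illustration follows; `classify_line` is made up and is not the alchemlyb parser.

```python
def classify_line(line):
    """Toy example of marking a defensive branch as excluded from coverage."""
    if line.startswith("#"):
        return "comment"
    elif not line.strip():  # pragma: no cover
        # Defensive branch that the test data is not expected to reach.
        return "blank"
    return "data"


print(classify_line("#NEW"), classify_line("1.0 2.0"))  # comment data
```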

orbeckst

The coverage report says that the line looking at reversed lambdas is not executed even though this was the intention of the new dataset (if I understood it correctly). Can you please look into it?

(For the continue block, just add my suggestion for a #pragma: no cover and that will be ok.)

orbeckst · 2021-10-07T15:48:29Z

src/alchemlyb/parsing/namd.py

+                        if l1_idx > 0 and l1_idx < l2_idx: # Ascending lambdas
+                            lambda_idws_at_start = all_lambdas[l1_idx - 1]
+                        elif l2_idx < (len(all_lambdas) - 1) and l2_idx < l1_idx: # Descending lambdas
+                            lambda_idws_at_start = all_lambdas[l2_idx + 1]


Wasn't the latest test supposed to hit this line?

Noticed this problem as well. Yes, but it doesn't because I already sorted the lambdas in reverse order above, so only the first clause in the if above is necessary, and inferring lambda_idws as the lambda "before" lambda1 works in forward and reverse lambda orders.

I will also add another test case a bit later today: if l1_idx == 0 this means the first window both had IDWS data and was incomplete and so there is no way to infer what the user hoped lambda_idws would be. This is a flagrantly pathological case but still it should be tested for.
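The reasoning in the two paragraphs above can be sketched as follows. This is an illustration under the stated assumption that the lambdas are already sorted in the direction of the run; `infer_lambda_idws` is a hypothetical name, and the real parser code may differ.

```python
def infer_lambda_idws(all_lambdas, lambda1):
    """Hypothetical sketch: guess lambda_idws when the file does not state it.

    all_lambdas is assumed to be sorted in the direction of the run
    (ascending for 0 -> 1, descending for 1 -> 0), so the backward
    window is always the entry just before lambda1.
    """
    l1_idx = all_lambdas.index(lambda1)
    if l1_idx == 0:
        # First window, incomplete, with IDWS data: nothing to infer from.
        raise ValueError(
            f"cannot infer lambda_idws for first window (lambda1={lambda1})"
        )
    return all_lambdas[l1_idx - 1]


ascending = [0.0, 0.25, 0.5, 0.75, 1.0]
descending = sorted(ascending, reverse=True)
print(infer_lambda_idws(ascending, 0.5))   # 0.25
print(infer_lambda_idws(descending, 0.5))  # 0.75
```

Because the list is sorted to follow the run, a single branch covers both directions, which is why the separate "descending" clause was never reached by the tests.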

ttjoseph · 2021-10-07T20:45:50Z

@orbeckst Test added and useless lines of code that weren't covered were removed. Hopefully when @jhenin merges, the code will be covered.

ttjoseph · 2021-10-08T15:44:47Z

I've added one (last?) pragma no cover as a PR to @jhenin's tree

orbeckst · 2021-10-08T16:41:23Z

@ttjoseph and @jhenin , congratulations 👏 👏 👏 , your IDWS NAMD parser is in alchemlyb together with all the robustness improvements.

Thank you for not giving up and seeing it through — through >100 comments and numerous "almost there" reviews. Much appreciated!!!

orbeckst linked an issue Jun 8, 2021 that may be closed by this pull request

Parsing forward/backward comparison energies from different configurations #128

Closed

orbeckst added enhancement parsers labels Jun 8, 2021

jhenin mentioned this pull request Jun 11, 2021

Add NAMD IDWS dataset alchemistry/alchemtest#43

Merged

jhenin force-pushed the master branch from 0d27051 to 10a1f1d Compare June 11, 2021 09:28

orbeckst requested changes Jun 15, 2021

jhenin force-pushed the master branch from 3527ee6 to 6a2027d Compare July 11, 2021 06:49

orbeckst mentioned this pull request Jul 29, 2021

Possible improvements for the NAMD parser #145

Closed

4 tasks

orbeckst requested changes Aug 12, 2021

orbeckst mentioned this pull request Aug 27, 2021

Second round of changes jhenin/alchemlyb#3

Closed

ttjoseph mentioned this pull request Aug 27, 2021

Add NAMD test set that contains restarts alchemistry/alchemtest#55

Merged

ttjoseph and others added 2 commits October 4, 2021 22:18

Don't reload restarted_dataset manually

384cba9

Merge branch 'master' into master

cb9a9bc

orbeckst approved these changes Oct 4, 2021

orbeckst requested changes Oct 4, 2021

ttjoseph added 5 commits October 6, 2021 13:41

Add clarifying comments, and a continue statement to miss unnecessary…

09349a1

… processing

Get rid of unused variables

9143b2f

Use restarted_reversed test dataset where lambda goes 1 -> 0

19bc821

Specific tests for NAMD IDWS restarted_reversed dataset

bab0376

Merge jhenin's branch

b830171

Don't depend on filesystem file ordering for testing

64b5ae3

orbeckst requested changes Oct 7, 2021

ttjoseph added 2 commits October 8, 2021 09:50

Corrupt only one fepout in 'inconsistent' test, clean up cruft

6e42f65

Add pathological test case, remove lines uncovered by tests that were…

e313d00

…n't needed anyway

Add pragma no cover for a continue line

26f3eae

orbeckst approved these changes Oct 8, 2021

orbeckst merged commit cc4b42e into alchemistry:master Oct 8, 2021

orbeckst mentioned this pull request Oct 29, 2021

release 0.6.0 #174

Closed

5 tasks
