Add Checkpointing of Trailblaze Algorithm #1159

Merged: 4 commits, Jul 21, 2019

@SimonBoothroyd SimonBoothroyd commented May 15, 2019

Description

This PR aims to implement a basic checkpointing system for the trailblazing algorithm. Currently, if the algorithm is interrupted while running (e.g. because a job submission script sets an insufficient wall-clock limit), the partially computed optimized lambda states are lost.
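
To make the idea concrete, here is a minimal sketch of checkpoint-and-resume for an iterative search. It is not the code in this PR: the function names (`load_checkpoint`, `save_checkpoint`, `find_states`), the JSON file format, and the `max_new_states` argument used to simulate an interruption are all assumptions made for illustration, and the "search" is a toy fixed-step walk rather than the trailblazing algorithm.

```python
import json
import os
import tempfile


def load_checkpoint(checkpoint_path):
    """Return the states saved so far, or an empty list if no file exists."""
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            return json.load(f)
    return []


def save_checkpoint(checkpoint_path, states):
    """Write to a temporary file, then rename it over the checkpoint.

    The rename means an interruption during the write leaves the
    previous checkpoint intact instead of a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(checkpoint_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    with os.fdopen(fd, 'w') as f:
        json.dump(states, f)
    os.replace(tmp_path, checkpoint_path)


def find_states(start, end, step, checkpoint_path, max_new_states=None):
    """Toy stand-in for an expensive search: walk lambda down from start to end.

    max_new_states (hypothetical) stops the loop early to mimic a job
    that hits its wall-clock limit.
    """
    # Resume from the checkpoint if there is one, otherwise start fresh.
    states = load_checkpoint(checkpoint_path) or [start]
    n_new = 0
    while states[-1] != end:
        if max_new_states is not None and n_new >= max_new_states:
            break
        states.append(max(end, states[-1] - step))
        n_new += 1
        # Save after every new state so at most one step is ever lost.
        save_checkpoint(checkpoint_path, states)
    return states
```

Running `find_states` a second time with the same `checkpoint_path` continues from the last saved state instead of recomputing the earlier ones.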

Status

  • Ready to review
  • Ready to go

Contributor

@andrrizzi andrrizzi left a comment

Looks great to me! Thank you!

Member

@jchodera jchodera left a comment

Looks good!

My only comments are:

  1. We could likely easily make checkpointing optional, not required, through the Python API (though Yank should always checkpoint it)
  2. We should migrate the trailblazing code to openmmtools soon.

@@ -1962,7 +1963,7 @@ def _get_number_box_waters(self, *args):
 # ==============================================================================

 def trailblaze_alchemical_protocol(thermodynamic_state, sampler_state, mcmc_move, state_parameters,
-                                   std_energy_threshold=0.5, threshold_tolerance=0.05,
+                                   checkpoint_path, std_energy_threshold=0.5, threshold_tolerance=0.05,
Member

Do we want to make checkpointing optional, or required via the Python API? This approach seems to make it required, but it seems like it would be simple to make it optional.
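
One way the optional variant might look is a keyword argument that defaults to `None` and disables checkpointing when omitted. The sketch below is only an illustration under stated assumptions: `trailblaze_sketch` is a hypothetical stand-in rather than the real function, its body is a toy, the JSON output format is invented, and `state_parameters` is assumed to be a list of `(name, [start, end])` tuples. Only the names `std_energy_threshold`, `threshold_tolerance`, and `checkpoint_path` and the two default values come from the diff.

```python
import json
import os
import tempfile


def trailblaze_sketch(state_parameters, std_energy_threshold=0.5,
                      threshold_tolerance=0.05, checkpoint_path=None):
    """Hypothetical stand-in for the real function (not YANK's code)."""
    # Toy "protocol": keep only the two end states of each parameter.
    protocol = {name: [values[0], values[-1]] for name, values in state_parameters}
    # Checkpoint only when the caller asked for it.
    if checkpoint_path is not None:
        with open(checkpoint_path, 'w') as f:
            json.dump(protocol, f)
    return protocol


# Python API users can omit the argument entirely ...
parameters = [('lambda_electrostatics', [1.0, 0.0])]
no_checkpoint = trailblaze_sketch(parameters)

# ... while a caller that must always checkpoint passes a path explicitly.
checkpoint_file = os.path.join(tempfile.mkdtemp(), 'protocol.json')
with_checkpoint = trailblaze_sketch(parameters, checkpoint_path=checkpoint_file)
```

Because the new parameter has a default and comes last, existing callers would keep working unchanged.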

Contributor

@andrrizzi andrrizzi May 19, 2019

I agree it would be useful to make it optional in the function (although exposing it in the YAML will be slightly more complicated). It's also possible that the tests are failing because you're defining a positional argument after a keyword one.
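
For reference, the general Python rule behind that last remark can be checked with a short sketch. The helper `raises_syntax_error` is a hypothetical name introduced here, and the snippets it compiles are generic examples, not code from this PR. In the hunk above, `checkpoint_path` precedes the defaulted parameters, so the sketch shows the rule in general and is not a diagnosis of the actual test failures.

```python
def raises_syntax_error(source):
    """Return True if Python refuses to compile the given source string."""
    try:
        compile(source, '<example>', 'exec')
    except SyntaxError:
        return True
    return False


# Call site: a positional argument may not follow a keyword argument.
bad_call = raises_syntax_error("f(a=1, 2)")

# Definition: a parameter without a default may not follow one with a default.
bad_definition = raises_syntax_error("def f(a=1, b): pass")

# Putting the required parameter first is accepted.
good_definition = raises_syntax_error("def f(b, a=1): pass")
```

Here `bad_call` and `bad_definition` are `True` and `good_definition` is `False`.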

@andrrizzi
Contributor

I'm going to merge these changes so that I can pick up from here and, in a separate PR, also store the configurations generated by trailblaze.

@andrrizzi
Contributor

Ah! I didn't notice the test failures. I'm going to make the changes directly here then.

@andrrizzi
Contributor

The failing tests have already been fixed in #1168, so I think we can merge safely. I'll correct any remaining issues in the next PR.

@andrrizzi andrrizzi merged commit ae7e40c into master Jul 21, 2019
@andrrizzi andrrizzi deleted the trailblaze_checkpoint branch July 21, 2019 15:33