Refactor rigidbody module to allow the preparation of cns_input to be done in parallel #933

Merged
merged 16 commits into from
Jul 15, 2024

@rvhonorato (Member) commented Jul 10, 2024

You are about to submit a new Pull Request. Before continuing make sure you read the contributing guidelines and that you comply with the following criteria:

  • You have stuck to Python. Please talk to us before adding other programming languages to HADDOCK3
  • Your PR is about CNS
  • Your code is well documented: proper docstrings and explanatory comments for those tricky parts
  • You structured the code into small functions as much as possible. You can use classes if there is a (state) purpose
  • Your code follows our coding style
  • You wrote tests for the new code
  • tox tests pass. Run tox command inside the repository folder
  • -test.cfg examples execute without errors. Inside examples/ run python run_tests.py -b
  • PR does not add any dependencies, unless permission granted by the HADDOCK team
  • PR does not break licensing
  • Your PR is about writing documentation for already existing code 🔥
  • Your PR is about writing tests for already existing code :godmode:

This PR refactors the rigidbody module so that the CNS input preparation (prepare_cns_input) runs in parallel.

To make it """ simpler """ I've added a GenericTask to libparallel that wraps a function call in a class that can be parallelized with the current Worker/Scheduler.
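As a rough illustration of the idea, here is a minimal sketch of a task class that stores a function and its arguments and exposes a single run() method. It is not the code from this PR: the constructor signature, the prepare_input helper, and the run_tasks function are assumptions made for the example, and a ThreadPoolExecutor stands in for the HADDOCK3 Worker/Scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


class GenericTask:
    """Wrap a function call so a scheduler only needs to call ``run()``."""

    def __init__(self, function, *args, **kwargs):
        # Assumed signature: store the callable and its arguments for later.
        self.function = function
        self.args = args
        self.kwargs = kwargs

    def run(self):
        # Execute the wrapped call with the stored arguments.
        return self.function(*self.args, **self.kwargs)


def prepare_input(model_name, seed):
    # Hypothetical stand-in for the per-model input preparation.
    return f"input for {model_name} (seed={seed})"


def run_tasks(tasks, max_workers=4):
    # Stand-in for the Worker/Scheduler pair: run each task, keep input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda task: task.run(), tasks))


tasks = [GenericTask(prepare_input, f"model_{i}", seed=i) for i in range(3)]
results = run_tasks(tasks)
```

Because every task exposes the same run() interface, the scheduler does not need to know which function it is executing.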

Since I also added some integration tests, I had to add some golden data, hence the huge number of changes; you can ignore those files. I also added new tests for the new class methods.
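For readers unfamiliar with golden data, a minimal sketch of the general pattern follows: a generated file is compared against a stored reference copy. The helper name matches_golden, the file name, and the choice to ignore trailing whitespace are assumptions for the example, not taken from the tests in this PR.

```python
import tempfile
from pathlib import Path


def matches_golden(generated: str, golden_file: Path) -> bool:
    """Compare generated text with a golden file, ignoring trailing whitespace."""
    expected = golden_file.read_text()
    generated_lines = [line.rstrip() for line in generated.splitlines()]
    expected_lines = [line.rstrip() for line in expected.splitlines()]
    return generated_lines == expected_lines


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical golden file; real golden data would live in the repository.
    golden = Path(tmp) / "golden_input.inp"
    golden.write_text("line one\nline two\n")
    same = matches_golden("line one  \nline two\n", golden)
    different = matches_golden("line one\nline 2\n", golden)
```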

@rvhonorato rvhonorato self-assigned this Jul 10, 2024
@rvhonorato rvhonorato linked an issue Jul 10, 2024 that may be closed by this pull request
@rvhonorato rvhonorato added the execution (Related to execution modes, such as GRID, HPC, local, etc.) and m|rigidbody (rigidbody sampling) labels Jul 11, 2024
@rvhonorato rvhonorato marked this pull request as ready for review July 11, 2024 10:18
@rvhonorato rvhonorato marked this pull request as draft July 11, 2024 10:42
@rvhonorato rvhonorato marked this pull request as ready for review July 11, 2024 11:08

self.output_models.append(model)
else:
cns_input = self.prepare_cns_input_parallel(
@VGPReys (Contributor), Jul 11, 2024

You know that the parallel approach is better when len(models_to_dock) * sampling_factor is large.
Maybe check whether len(models_to_dock) * sampling_factor < 3000 or self.params["mode"] == "batch" and, if so, use prepare_cns_input_sequential, otherwise prepare_cns_input_parallel?
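
A minimal sketch of the check suggested here, written as a standalone function so it can be tried in isolation. The function name choose_preparation and its parameters are hypothetical; the 3000 threshold is the value floated in this comment, not a measured one.

```python
def choose_preparation(n_models, sampling_factor, mode, threshold=3000):
    """Return which preparation strategy the suggested check would pick."""
    n_jobs = n_models * sampling_factor
    # Small workloads and batch mode use the sequential path.
    if n_jobs < threshold or mode == "batch":
        return "sequential"
    return "parallel"


small_local = choose_preparation(2, 1000, "local")
large_local = choose_preparation(5, 1000, "local")
large_batch = choose_preparation(5, 1000, "batch")
```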

@rvhonorato (Member, Author)

Yep, we can do that, but we would need to come up with a magic number. Since the speedup threshold depends on the number of cores and the processor architecture, I'd leave it as is.

@rvhonorato rvhonorato requested a review from VGPReys July 12, 2024 10:13
@rvhonorato rvhonorato merged commit 27cf05d into main Jul 15, 2024
4 checks passed
@rvhonorato rvhonorato deleted the 930-optimize-job-preparation-in-rigidbody branch July 15, 2024 08:50
Labels: execution, m|rigidbody
Linked issue: Optimize job preparation in rigidbody