Add new config option to set max wall-time for cluster #334

Closed
jaamarks opened this issue Sep 28, 2024 · 1 comment · Fixed by #335 or #342
@jaamarks
Collaborator

Currently, the pipeline includes two resource-intensive rules that require significant time to execute.

  1. In the subworkflow located at src/cgr_gwas_qc/workflow/sub_workflows/sample_qc.smk, the sample_concordance_plink rule utilizes a dynamic time allocation strategy defined as:

    BIG_TIME = {1: 10, 2: 48, 3: 96}
    
  2. In the subworkflow at src/cgr_gwas_qc/workflow/sub_workflows/subject_qc.smk, the same sample_concordance_plink rule has a different dynamic allocation strategy:

    BIG_TIME = {1: 8, 2: 24, 3: 48}
    

We should add an option to the configuration file that lets users set a maximum wall-time for these rules. The time requested for each rule would then be capped at that value, so users whose cluster enforces a wall-time limit lower than the largest BIG_TIME entry can avoid errors caused by requesting more time than the cluster allows.
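The capping behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline's actual implementation: the helper name `capped_time_hr` and the two table names are hypothetical, and it assumes the BIG_TIME keys are Snakemake retry attempt numbers and the values are hours.

```python
from typing import Dict, Optional

# Time tables quoted in this issue (assumed: retry attempt -> hours).
SAMPLE_QC_BIG_TIME: Dict[int, int] = {1: 10, 2: 48, 3: 96}
SUBJECT_QC_BIG_TIME: Dict[int, int] = {1: 8, 2: 24, 3: 48}


def capped_time_hr(
    attempt: int,
    big_time: Dict[int, int],
    max_time_hr: Optional[int] = None,
) -> int:
    """Return the hours to request for this attempt, capped at max_time_hr.

    Hypothetical helper: when max_time_hr is None (option not set in the
    config), the table value is returned unchanged.
    """
    # Assumption: attempts beyond the table reuse its largest entry.
    requested = big_time.get(attempt, max(big_time.values()))
    if max_time_hr is None:
        return requested
    return min(requested, max_time_hr)


if __name__ == "__main__":
    # With a 24-hour cluster limit, the third attempt no longer asks for 96 hours.
    for attempt in (1, 2, 3):
        print(attempt, capped_time_hr(attempt, SAMPLE_QC_BIG_TIME, max_time_hr=24))
```

In a Snakemake rule, a helper like this could be called from a resource callable such as `lambda wildcards, attempt: capped_time_hr(attempt, BIG_TIME, max_time_hr)`, with `max_time_hr` read from the user's config. Leaving the cap as `None` by default keeps the current behaviour for users who do not set the option.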

jaamarks added a commit that referenced this issue Sep 28, 2024
feat: add `max_time_hr` option for wall-time allocation (issue #334)
@jaamarks jaamarks self-assigned this Sep 28, 2024
@jaamarks jaamarks reopened this Oct 3, 2024
@jaamarks
Collaborator Author

jaamarks commented Oct 3, 2024

The max_time_hr option should also apply to the sample_level_ibd rule within the sample_qc.smk sub-workflow, because that rule is quite time-intensive too.

We also need to update the description of max_time_hr in workflow_params.py so that it lists every rule the option applies to. The sample_concordance_king and sample_concordance_summary rules in the sample_qc.smk file are also affected, but we forgot to mention them.

jaamarks added a commit that referenced this issue Oct 3, 2024
feat: Apply ``max_time_hr`` to sample_level_ibd rule (issue #334)

**What**
- Apply the ``max_time_hr`` feature to the ``sample_level_ibd`` rule
  within the ``sample_qc.smk`` sub-workflow.
- Update the ``workflow_params.py`` file to state all of the rules that
  ``max_time_hr`` affects.

**Why**
- The ``sample_level_ibd`` rule is another time-intensive step in the
  pipeline, and users have experienced timeout issues at this step
  when running it on large datasets with the standard time-allocation
  method. This update ensures that if a user specifies ``max_time_hr``
  in the ``config.yml``, it will also apply to this rule.

  This is an extension of PR #335
  Fixes #334