implement max_mem_mb option in the config file to resolve memory issues with sample_level_ibd rule #343

Closed
jaamarks opened this issue Oct 3, 2024 · 2 comments · Fixed by #344

jaamarks commented Oct 3, 2024

With eb116a5 we added the following code to the sample_level_ibd rule in the sample_qc.smk file so that we could apply the max_time_hr workflow parameter to this rule:

        mem_mb=lambda wildcards, attempt, input: max((attempt + 1) * input.size_mb, 1024),
        time_hr=lambda wildcards, attempt: BIG_TIME[attempt],

But now the mem_mb allocation is not aggressive enough: a run with a sample size of 110K failed quickly on multiple attempts because of memory issues. We need memory to scale up faster between attempts.
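To see how slowly this request grows, here is a minimal sketch that evaluates the same formula as a plain Python function. The function name `linear_mem_mb` and the 2000 MB input size are hypothetical, chosen only for illustration; they are not taken from the failing run.

```python
def linear_mem_mb(attempt, input_size_mb):
    """Same arithmetic as the mem_mb lambda above, written as a plain function."""
    return max((attempt + 1) * input_size_mb, 1024)


# Hypothetical input size in MB, for illustration only (not measured from the 110K run).
input_size_mb = 2000

# Each retry adds only one more multiple of the input size.
schedule = {attempt: linear_mem_mb(attempt, input_size_mb) for attempt in (1, 2, 3)}
for attempt, mem_mb in schedule.items():
    print(f"attempt {attempt}: {mem_mb} MB")
```

Under this assumption the third attempt asks for only twice what the first attempt did, so a job that needs far more memory than its input size will exhaust its retries before the request catches up.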

jaamarks changed the title from "Increase the rate at which memory is allocated for the sample_level_ibd rule" to "Increase the rate at which memory is allocated for the sample_level_ibd rule" Oct 3, 2024

jaamarks commented Oct 3, 2024

The previous allocation was set in the genome rule in the plink.smk file as:
PLINK_BIG_MEM = {1: 1024 * 4, 2: 1024 * 64, 3: 1024 * 250}

Log files from previous successful runs with a sample size of 110K show that the pipeline failed on each attempt until it allocated 256GB, at which point it succeeded. When memory was the problem, the pipeline failed very quickly (in under a minute).
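For reference, the old per-attempt schedule can be expanded as follows. This is only a sketch that prints the values of the dictionary quoted above; the helper `mem_gb` is a hypothetical name added for readability. The third entry, 1024 * 250 = 256000 MB, appears to be the "256GB" allocation seen in the logs.

```python
# Dictionary copied from the genome rule quoted above: attempt number -> memory in MB.
PLINK_BIG_MEM = {1: 1024 * 4, 2: 1024 * 64, 3: 1024 * 250}


def mem_gb(mem_mb):
    """Convert MB to GB using 1024 MB per GB (hypothetical helper for display)."""
    return mem_mb / 1024


for attempt, mem_mb in PLINK_BIG_MEM.items():
    print(f"attempt {attempt}: {mem_mb} MB (~{mem_gb(mem_mb):.0f} GB)")
```

The old schedule jumps by a factor of 16 between the first and second attempts, which is why it reached a workable allocation within three tries.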

jaamarks self-assigned this Oct 3, 2024

jaamarks commented Oct 3, 2024

We could implement a max_mem_mb parameter in the config, and then apply the following approach for any rule that requires a significant amount of memory (like the sample_level_ibd rule):

PLINK_BIG_MEM = {1: max_mem_mb / 3, 2: max_mem_mb / 2, 3: max_mem_mb}
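A minimal sketch of how this proposal might look in plain Python, assuming the value is read from a `max_mem_mb` key in the workflow config. The names `big_mem_schedule` and `mem_for_attempt`, the example value of 256000, and the fallback for attempts beyond the third are all assumptions made for illustration, not the change that was merged.

```python
def big_mem_schedule(max_mem_mb):
    """Map attempt number to a memory request in MB, ending at max_mem_mb.

    Integer division keeps the requests as whole megabytes (a choice made
    for this sketch).
    """
    return {1: max_mem_mb // 3, 2: max_mem_mb // 2, 3: max_mem_mb}


# Hypothetical stand-in for the Snakemake `config` dictionary.
config = {"max_mem_mb": 256000}

PLINK_BIG_MEM = big_mem_schedule(config["max_mem_mb"])


def mem_for_attempt(attempt):
    """Look up the request for an attempt; cap at max_mem_mb past attempt 3 (assumption)."""
    return PLINK_BIG_MEM.get(attempt, config["max_mem_mb"])


for attempt in (1, 2, 3):
    print(f"attempt {attempt}: {mem_for_attempt(attempt)} MB")
```

In a rule, the lookup could then be used in a resource callable such as `mem_mb=lambda wildcards, attempt: mem_for_attempt(attempt)`, mirroring the existing `time_hr` line. With this schedule the first attempt already requests a third of the configured ceiling, so large runs do not have to burn retries on small allocations.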

jaamarks changed the title from "Increase the rate at which memory is allocated for the sample_level_ibd rule" to "implement max_mem_mb option in the config file to resolve memory issues with sample_level_ibd rule" Oct 3, 2024