MPI-IE Slurm Profile

A simple Snakemake profile for the MPI-IE Slurm cluster without --cluster-config

Features
Limitations
Quick start
Customizations
Use speed with caution
License

This is a fork of the excellent smk-simple-slurm Snakemake profile, which itself is a simplified version of the more comprehensive official Slurm profile for Snakemake.

Install

Install cookiecutter. If you are on the MPI-IE cluster, cookiecutter is already installed -- run module load cookiecutter.

Run cookiecutter gh:maxplanck-ie/mpi-ie-slurm. Answer the prompts. Be sure to explicitly answer the user and email prompts.

Features

Support for stopping Snakemake with Ctrl-C, which then propagates scancel to children jobs using --cluster-cancel
Support for Snakemake understanding job statuses PENDING, RUNNING, COMPLETING, OUT_OF_MEMORY, TIMEOUT, and CANCELLED using --cluster-status. NB: as of 2022-08-10, sacct does not work on the MPI-IE Slurm cluster, so this uses scontrol instead.
Automatically saves the log files as logs/{date}/{rule}/{rule}-{wildcards}-{time}-job%j.out, where {rule} is the name of the rule, {wildcards} is any wildcards passed to the rule, {date} and {time} are determined dynamically and %j is the job number.
Automatically loads the Slurm module for the MPI-IE cluster in relevant places
Uses /data/extended as the default tmpdir and exports it as the TMPDIR variable
Automatically names jobs according to their rule
Fast! It can quickly submit jobs and check their status because it doesn't invoke a Python script for these steps, which adds up when you have thousands of jobs (however, please see the section Use speed with caution)
No reliance on the deprecated option --cluster-config to customize job resources
If you wish to add more features, see the original smk-simple-slurm profile or the official Slurm profile for inspiration.
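
The status handling listed above comes down to mapping a Slurm job state onto one of the three words Snakemake expects from a --cluster-status script: running, success, or failed. The Python sketch below illustrates that mapping only. It is not the profile's actual script (the profile deliberately avoids Python for this step), and the function name `classify_job_state` and the sample scontrol output are hypothetical.

```python
import re

# States that mean the job has not finished yet.
RUNNING_STATES = {"PENDING", "RUNNING", "COMPLETING"}


def classify_job_state(scontrol_output: str) -> str:
    """Map `scontrol show job` text to running / success / failed."""
    match = re.search(r"JobState=(\w+)", scontrol_output)
    if match is None:
        # Assumption: treat an unreadable answer as a failure.
        return "failed"
    state = match.group(1)
    if state in RUNNING_STATES:
        return "running"
    if state == "COMPLETED":
        return "success"
    # OUT_OF_MEMORY, TIMEOUT, CANCELLED and anything else count as failed.
    return "failed"


# Hypothetical excerpt of scontrol output for a job that hit its time limit.
sample = "JobId=123 JobName=align\n   JobState=TIMEOUT Reason=TimeLimit"
print(classify_job_state(sample))
```

Treating an unparseable answer as a failure is a design choice of this sketch; a real status script may prefer to retry instead.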

Limitations

Can't use group jobs, but they aren't easy to use in the first place
Wildcards can't contain / if you want to use them in the name of the Slurm log file. This is a Slurm requirement (which makes sense, since it has to create a file on the filesystem). You'll either have to change how you manage the wildcards or remove {wildcards} from the pattern passed to --output, e.g. --output=logs/{rule}/{rule}-%j.out. Note that you can still pass wildcards containing / to --job-name.
Requires Snakemake version 7.0.0 or later (for --cluster-cancel). You can test this directly in your Snakefile with min_version()
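
To make the / limitation concrete, the following Python sketch expands the log pattern from the Features section and rejects wildcard values that would break the file name. It is an illustration, not code from the profile: the helper name `slurm_log_path`, the date and time formats, and the comma-joined name=value rendering of wildcards are all assumptions.

```python
from datetime import datetime


def slurm_log_path(rule: str, wildcards: dict, now: datetime) -> str:
    """Build logs/{date}/{rule}/{rule}-{wildcards}-{time}-job%j.out."""
    # Assumption: wildcards are rendered as comma-joined name=value pairs.
    rendered = ",".join(f"{k}={v}" for k, v in wildcards.items())
    if "/" in rendered:
        # A slash in the file name would be treated as a directory separator.
        raise ValueError(f"wildcards contain '/': {rendered}")
    date = now.strftime("%Y-%m-%d")  # assumed date format
    time = now.strftime("%H-%M-%S")  # assumed time format
    # %j is left untouched; Slurm replaces it with the job number.
    return f"logs/{date}/{rule}/{rule}-{rendered}-{time}-job%j.out"


stamp = datetime(2022, 8, 10, 14, 30, 0)
print(slurm_log_path("align", {"sample": "A"}, stamp))
```

A wildcard value such as `a/b` raises `ValueError` here, which mirrors the choice described above: either change the wildcard values or drop {wildcards} from the log pattern.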

Quick start

Copy the directory mpi-ie-slurm to a directory of your choice.
Edit any variables in config.yaml if you wish.
You can override any of the defaults by adding a resources field to a rule, e.g.
```
rule much_memory:
    resources:
        mem_mb=64000
```
Invoke snakemake with the profile:
```
snakemake --profile mpi-ie-slurm/
```

Use speed with caution

A big benefit of the simplicity of this profile is the speed with which jobs can be submitted and their statuses checked. The official Slurm profile for Snakemake provides a lot of extra fine-grained control, but this is all defined in Python scripts, which then have to be invoked for each job submission and status check. I needed this speed for a pipeline that had an aggregation rule that needed to be run tens of thousands of times, and the run time for each job was under 10 seconds. In this situation, the job submission rate and status check rate were huge bottlenecks.

However, you should use this speed with caution! On a shared HPC cluster, many users are making requests to the Slurm scheduler. If too many requests are made at once, performance will suffer for all users. If the rules in your Snakemake pipeline take more than a few minutes to complete, then it's overkill to check the status of multiple jobs every second. In other words, only increase max-jobs-per-second and/or max-status-checks-per-second if the submission rate or the status checks that confirm job completion are clear bottlenecks.
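
A rough back-of-the-envelope check can show whether the submission rate is a bottleneck at all. The Python sketch below uses made-up numbers and a hypothetical helper, `submission_seconds`; it only divides the number of jobs by the submission rate and ignores queue time and status checks.

```python
def submission_seconds(n_jobs: int, jobs_per_second: float) -> float:
    """Time needed just to hand n_jobs to the scheduler."""
    return n_jobs / jobs_per_second


# Illustrative numbers only: 20,000 short jobs at two submission rates.
n_jobs = 20_000
for rate in (1, 10):
    minutes = submission_seconds(n_jobs, rate) / 60
    print(f"{rate} job(s)/s -> {minutes:.0f} min to submit {n_jobs} jobs")
```

When each job runs for only a few seconds, a low submission rate can dominate the total run time. When each job runs for minutes or longer, the same rate is rarely the limiting factor, and raising it mostly adds load on the scheduler.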

License

This is all boilerplate code. Please feel free to use it for whatever purpose you like. No need to attribute or cite this repo, but of course it comes with no warranties. To make it official, it's released under the CC0 license. See LICENSE for details.
