creating a custom config file for slurm cluster? #644

Closed
Thomieh73 opened this issue Jul 27, 2024 · 1 comment
Comments

@Thomieh73

Hi, I have started to try out the mag pipeline. Instead of starting with my own data, I first decided to run the test profile. Makes more sense :-) But I ran into some configuration trouble.

Here is what I did.
I can run the command:

 nextflow run nf-core/mag -r 3.0.2 -profile test,apptainer --outdir mag_test -work-dir $USERWORK/nf_mag -resume

That finished okay, but runs on the login node of our HPC cluster, which is not okay.

So I created a custom config file, based on the base.config file provided with the pipeline, and modified it slightly.
See attached here: saga_mag.config.txt

In short, I added these three lines to the process:

executor       = 'slurm'
clusterOptions = '--job-name=Saga_nxf --account=nn10070k'
queueSize      = 24

and to jobs with large memory I added:

clusterOptions  = '--job-name=Saga_nxf --account=nn10070k --partition=bigmem'
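For reference, a minimal Slurm setup along these lines is usually split across two scopes: `executor` and `clusterOptions` are process directives, while `queueSize` is an executor-scope setting. A sketch of that layout (the account and partition values are the ones used above; the `process_high_memory` label is the standard nf-core label for large-memory jobs, shown here as an assumption about how the bigmem override would be attached):

```groovy
// Sketch of a minimal custom Slurm config.
// Account/partition values are taken from the text above.
process {
    executor       = 'slurm'
    clusterOptions = '--job-name=Saga_nxf --account=nn10070k'

    // Route large-memory jobs to the bigmem partition
    withLabel: process_high_memory {
        clusterOptions = '--job-name=Saga_nxf --account=nn10070k --partition=bigmem'
    }
}

executor {
    // queueSize is an executor-scope setting, not a process directive
    queueSize = 24
}
```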

I then also found out that I needed to define the missing check_max function in the config file. I added the entire function by copying this into my config file:

// Function to ensure that resource requirements don't go beyond
// a maximum limit
def check_max(obj, type) {
    if (type == 'memory') {
        try {
            if (obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1)
                return params.max_memory as nextflow.util.MemoryUnit
            else
                return obj
        } catch (all) {
            println "   ### ERROR ###   Max memory '${params.max_memory}' is not valid! Using default value: $obj"
            return obj
        }
    } else if (type == 'time') {
        try {
            if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1)
                return params.max_time as nextflow.util.Duration
            else
                return obj
        } catch (all) {
            println "   ### ERROR ###   Max time '${params.max_time}' is not valid! Using default value: $obj"
            return obj
        }
    } else if (type == 'cpus') {
        try {
            return Math.min( obj, params.max_cpus as int )
        } catch (all) {
            println "   ### ERROR ###   Max cpus '${params.max_cpus}' is not valid! Using default value: $obj"
            return obj
        }
    }
}
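For context, in the nf-core base.config this function is used inside the process resource directives, so that each task's request is capped at the configured maximums. A typical pattern looks like the following (the exact values vary per pipeline; these are illustrative defaults):

```groovy
// How check_max is typically invoked in an nf-core base.config.
// Values shown are illustrative; each pipeline sets its own defaults.
process {
    withLabel: process_medium {
        cpus   = { check_max( 6                   , 'cpus'   ) }
        memory = { check_max( 36.GB * task.attempt, 'memory' ) }
        time   = { check_max( 8.h   * task.attempt, 'time'   ) }
    }
}
```

Without check_max defined, these closures fail to evaluate, which is why the function has to be available wherever such directives are used.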

Next I ran the pipeline with this command:

nextflow run nf-core/mag -r 3.0.2 -profile test,apptainer --outdir mag_test -work-dir $USERWORK/nf_mag -resume -c saga_mag.config

However, it fails because the Slurm jobs do not get any memory allocated. I can see this when I check the .command.run files in the work directories of different tasks.
I see something like this in the top of the file:

#!/bin/bash
#SBATCH -J nf-NFCORE_MAG_MAG_KRONA_KRONADB
#SBATCH -o /cluster/work/users/thhaverk/nf_mag/0c/a7ed03141b73e94a5d8286720673ea/.command.log
#SBATCH --no-requeue
#SBATCH --signal B:USR2@30
#SBATCH --job-name=Saga_nxf --account=nn10070k
NXF_CHDIR=/cluster/work/users/thhaverk/nf_mag/0c/a7ed03141b73e94a5d8286720673ea
### ---
### name: 'NFCORE_MAG:MAG:KRONA_KRONADB'
### container: '/cluster/work/users/thhaverk/apptainer_img/quay.io-biocontainers-krona-2.7.1--pl526_5.img'
### outputs:
### - 'taxonomy/taxonomy.tab'
### - 'versions.yml'
### ...
set -e
set -u
NXF_DEBUG=${NXF_DEBUG:=0}; [[ $NXF_DEBUG > 1 ]] && set -x
NXF_ENTRY=${1:-nxf_main}

So it seems that a Slurm job does get requested, but the .command.run file does not get memory, time, and CPUs assigned.

I do not understand why it is failing. I checked the Nextflow Slack and found this thread, where a similar set-up is described: https://nextflow.slack.com/archives/CEQBS091V/p1614245714076900

Any ideas what I am missing here?

@jfy133
Member

jfy133 commented Jul 27, 2024

Hmm, I'm not sure. Unfortunately it's hard to look from my phone...

However, if you've copied from base.config, that is maybe overkill and makes it harder to debug. You don't need everything in there just to get the pipeline working with your cluster.

I would suggest starting from scratch and making the config by following this tutorial:

https://nf-co.re/docs/tutorials/use_nf-core_pipelines/config_institutional_profile

And see if it's still not working then.
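The tutorial's approach boils down to a small standalone config that sets only the executor, cluster options, and the cluster's maximum resources. A sketch of that shape (the resource limits below are placeholders to adapt to the actual cluster; the account value is the one from the text above):

```groovy
// Minimal institutional-style config sketch, per the tutorial's approach.
// max_* limits are placeholders to be adapted to the real cluster.
params {
    config_profile_description = 'Saga cluster profile (sketch)'
    max_memory = 180.GB
    max_cpus   = 40
    max_time   = 168.h
}

process {
    executor       = 'slurm'
    clusterOptions = '--account=nn10070k'
}

executor {
    queueSize = 24
}
```

Starting from something this small makes it much easier to see which setting, once added, breaks the generated job scripts.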

I'm going to close this issue as this is a nextflow configuration issue rather than a pipeline issue, but feel free to keep commenting here and I'll try to keep checking it.

Or even better, if you're still having issues, ask in the #configs channel on the nf-core Slack (https://nf-co.re/join for instructions if you're not already there).

@jfy133 jfy133 closed this as completed Jul 27, 2024