kallisto subworkflow runs out of memory (reiteration of #38) #116

Closed
Khajidu opened this issue Jun 21, 2022 · 11 comments

@Khajidu
Contributor

Khajidu commented Jun 21, 2022

Description of the bug

Every time I run kallisto as a subworkflow, it crashes because it runs out of memory. Unlike in #38, however, it crashes at the indexing stage rather than the bustools stage. The cause appears to be the same, since I get the same kind of error messages (I also tried hard-coding the memory requirements in version 1.1.0, and the pipeline then worked). As far as I can tell, the problem is that I request 32GB of memory (I cannot request 32G, because I get `string [32.G] does not match pattern ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$ (32.G)`), whereas kallisto wants to use 32G (and this cannot be changed to GB either).
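To make the validation error above concrete, here is a minimal Python sketch that checks a few memory strings against the pattern quoted in the error message. The pattern is copied from the message; the helper name `is_valid_memory` and the candidate list are illustrative assumptions, not part of the pipeline.

```python
import re

# Pattern copied verbatim from the validation error quoted in the description.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_memory(value: str) -> bool:
    """Return True if `value` satisfies the quoted parameter pattern.

    Hypothetical helper for illustration only.
    """
    return MEMORY_PATTERN.match(value) is not None


if __name__ == "__main__":
    # The trailing "B" is mandatory in the pattern, so bare "G" suffixes fail.
    for candidate in ["32.GB", "32 GB", "32GB", "32.G", "32G"]:
        print(f"{candidate!r}: {is_valid_memory(candidate)}")
```

Under this pattern, `32.GB`, `32 GB` and `32GB` are accepted, while `32.G` and `32G` are rejected because the trailing `B` is required.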

Command used and terminal output

$ nextflow run nf-core/scrnaseq -r dev --max_cpus 32 --max_memory '32.GB' --outdir /shared/ifbstor1/projects/bsbii/sc_single_cell_brain/fulltest/ --protocol '10XV3' --aligner kallisto --transcript_fasta /shared/projects/bsbii/sc_single_cell_brain/sc_gene_models_ncbi_utrs.fasta --input 'test_samples_v2.csv' --genome_fasta /shared/projects/bsbii/sc_single_cell_brain/sc_ncbi_genome.fasta --gtf /shared/projects/bsbii/sc_single_cell_brain/sc_gene_models_ncbi_no_genes_no_contigs_notrnas.gtf --kb_workflow 'nucleus' -profile ifb_core

Error executing process > 'NFCORE_SCRNASEQ:SCRNASEQ:KALLISTO_BUSTOOLS:KALLISTOBUSTOOLS_REF (sc_ncbi_genome.fasta)'

Caused by:
  Process `NFCORE_SCRNASEQ:SCRNASEQ:KALLISTO_BUSTOOLS:KALLISTOBUSTOOLS_REF (sc_ncbi_genome.fasta)` terminated for an unknown reason -- Likely it has been terminated by the external system

Command executed:

  kb \
      ref \
      -i kb_ref_out.idx \
      -g t2g.txt \
      -f1 cdna.fa \
      -f2 intron.fa \
      -c1 cdna_t2c.txt \
      -c2 intron_t2c.txt \
      --workflow nucleus \
      sc_ncbi_genome.fasta \
      sc_gene_models_ncbi_no_genes_no_contigs_notrnas.gtf
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_SCRNASEQ:SCRNASEQ:KALLISTO_BUSTOOLS:KALLISTOBUSTOOLS_REF":
      kallistobustools: $(echo $(kb --version 2>&1) | sed 's/^.*kb_python //;s/positional arguments.*$//')
  END_VERSIONS

Command exit status:
  -

Command output:
  (empty)

Command error:
  [2022-06-21 09:31:39,900]    INFO [ref_lamanno] Preparing sc_ncbi_genome.fasta, sc_gene_models_ncbi_no_genes_no_contigs_notrnas.gtf
  [2022-06-21 09:32:13,709]    INFO [ref_lamanno] Splitting genome sc_ncbi_genome.fasta into cDNA at tmp/tmp74iebqht
  [2022-06-21 09:32:53,465]    INFO [ref_lamanno] Creating cDNA transcripts-to-capture at tmp/tmp4_dnckwg
  [2022-06-21 09:32:53,805]    INFO [ref_lamanno] Splitting genome into introns at tmp/tmpb18oat1d
  [2022-06-21 09:38:41,370]    INFO [ref_lamanno] Creating intron transcripts-to-capture at tmp/tmpmbj8zrjy
  [2022-06-21 09:38:51,358]    INFO [ref_lamanno] Concatenating 1 cDNA FASTAs to cdna.fa
  [2022-06-21 09:38:51,770]    INFO [ref_lamanno] Concatenating 1 cDNA transcripts-to-captures to cdna_t2c.txt
  [2022-06-21 09:38:51,792]    INFO [ref_lamanno] Concatenating 1 intron FASTAs to intron.fa
  [2022-06-21 09:39:06,987]    INFO [ref_lamanno] Concatenating 1 intron transcripts-to-captures to intron_t2c.txt
  [2022-06-21 09:39:07,161]    INFO [ref_lamanno] Concatenating cDNA and intron FASTAs to tmp/tmpvz8czi6v
  [2022-06-21 09:39:22,955]    INFO [ref_lamanno] Creating transcript-to-gene mapping at t2g.txt
  [2022-06-21 09:39:37,502]    INFO [ref_lamanno] Indexing tmp/tmpvz8czi6v to kb_ref_out.idx

Work dir:
  /shared/ifbstor1/projects/bsbii/sc_single_cell_brain/work/72/f969fb35db25b832205956436bb6e5

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

Relevant files

No response

System information

Nextflow version : 22.04.0
Hardware : HPC
Executor : Slurm
Container : Singularity
OS : CentOS
Version of nf-core/scrnaseq : 2.0.0 or dev

@Khajidu Khajidu added the bug Something isn't working label Jun 21, 2022
@Khajidu Khajidu changed the title kallisto subworflow runs out of memory (reiteration of #38) kallisto subworkflow runs out of memory (reiteration of #38) Jun 21, 2022
@apeltzer apeltzer self-assigned this Jun 21, 2022
@apeltzer apeltzer added this to the 2.1.0 milestone Jun 21, 2022
@apeltzer
Member

You could try supplying a separate config that overrides what kallisto uses for all steps (see https://nf-co.re/usage/configuration#custom-configuration-files) and setting the memory there:

process {
    withName: 'KALLISTOBUSTOOLS_REF' {
        memory = 32.GB
    }
}

@Khajidu
Contributor Author

Khajidu commented Jun 22, 2022

It didn't work, same error.

Here are the logs:

[2022-06-22 09:29:53,353] INFO [ref_lamanno] Preparing sc_ncbi_genome.fasta, sc_gene_models_ncbi_no_genes_no_contigs_notrnas.gtf
[2022-06-22 09:30:27,852] INFO [ref_lamanno] Splitting genome sc_ncbi_genome.fasta into cDNA at /shared/ifbstor1/projects/bsbii/sc_single_cell_brain/work/ca/8de08f1ecfab6f41d8cc1f94e45c52/tmp/tmp_aaaucz5
[2022-06-22 09:31:08,887] INFO [ref_lamanno] Creating cDNA transcripts-to-capture at /shared/ifbstor1/projects/bsbii/sc_single_cell_brain/work/ca/8de08f1ecfab6f41d8cc1f94e45c52/tmp/tmpxx9d3eue
[2022-06-22 09:31:09,220] INFO [ref_lamanno] Splitting genome into introns at /shared/ifbstor1/projects/bsbii/sc_single_cell_brain/work/ca/8de08f1ecfab6f41d8cc1f94e45c52/tmp/tmp_4zx6l_e
[2022-06-22 09:37:03,403] INFO [ref_lamanno] Creating intron transcripts-to-capture at /shared/ifbstor1/projects/bsbii/sc_single_cell_brain/work/ca/8de08f1ecfab6f41d8cc1f94e45c52/tmp/tmpw8sjit6k
[2022-06-22 09:37:11,916] INFO [ref_lamanno] Concatenating 1 cDNA FASTAs to cdna.fa
[2022-06-22 09:37:12,346] INFO [ref_lamanno] Concatenating 1 cDNA transcripts-to-captures to cdna_t2c.txt
[2022-06-22 09:37:12,369] INFO [ref_lamanno] Concatenating 1 intron FASTAs to intron.fa
[2022-06-22 09:37:28,411] INFO [ref_lamanno] Concatenating 1 intron transcripts-to-captures to intron_t2c.txt
[2022-06-22 09:37:28,593] INFO [ref_lamanno] Concatenating cDNA and intron FASTAs to /shared/ifbstor1/projects/bsbii/sc_single_cell_brain/work/ca/8de08f1ecfab6f41d8cc1f94e45c52/tmp/tmp4jp7quuj
[2022-06-22 09:37:46,283] INFO [ref_lamanno] Creating transcript-to-gene mapping at t2g.txt
[2022-06-22 09:38:01,695] INFO [ref_lamanno] Indexing /shared/ifbstor1/projects/bsbii/sc_single_cell_brain/work/ca/8de08f1ecfab6f41d8cc1f94e45c52/tmp/tmp4jp7quuj to kb_ref_out.idx
[2022-06-22 09:47:41,663] ERROR [ref_lamanno]
[build] loading fasta file /shared/ifbstor1/projects/bsbii/sc_single_cell_brain/work/ca/8de08f1ecfab6f41d8cc1f94e45c52/tmp/tmp4jp7quuj
[build] k-mer length: 31
[build] warning: clipped off poly-A tail (longer than 10)
        from 242 target sequences
[build] warning: replaced 24801554 non-ACGUT characters in the input sequence
        with pseudorandom nucleotides
[build] counting k-mers ...
[2022-06-22 09:47:41,679] ERROR [main] An exception occurred
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/kb_python/main.py", line 856, in main
    COMMAND_TO_FUNCTION[args.command](parser, args, temp_dir=temp_dir)
  File "/usr/local/lib/python3.9/site-packages/kb_python/main.py", line 131, in parse_ref
    ref_lamanno(
  File "/usr/local/lib/python3.9/site-packages/ngs_tools/logging.py", line 62, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/kb_python/ref.py", line 661, in ref_lamanno
    index_result = kallisto_index(combined_path, index_path, k=k or 31)
  File "/usr/local/lib/python3.9/site-packages/kb_python/ref.py", line 212, in kallisto_index
    run_executable(command)
  File "/usr/local/lib/python3.9/site-packages/kb_python/dry/__init__.py", line 24, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/kb_python/utils.py", line 195, in run_executable
    raise sp.CalledProcessError(p.returncode, ' '.join(command))
subprocess.CalledProcessError: Command '/usr/local/bin/kallisto index -i kb_ref_out.idx -k 31 /shared/ifbstor1/projects/bsbii/sc_single_cell_brain/work/ca/8de08f1ecfab6f41d8cc1f94e45c52/tmp/tmp4jp7quuj' died with <Signals.SIGKILL: 9>.
slurmstepd: error: Detected 1 oom-kill event(s) in StepId=23442481.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
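
The end of this log is what identifies the failure as an out-of-memory kill: the `kallisto index` command died with SIGKILL, and slurmstepd reports an oom-kill event. The first report only said the process was "terminated for an unknown reason". As a rough sketch, a saved log could be scanned for these markers as follows. The marker list is taken from the log above and is an assumption about what to look for; other schedulers or tool versions may word these messages differently, and `find_oom_lines` is a hypothetical helper.

```python
import re

# Markers taken from the log in this thread; treat them as examples, not a complete list.
OOM_MARKERS = (
    re.compile(r"oom-kill event", re.IGNORECASE),
    re.compile(r"out-of-memory handler", re.IGNORECASE),
    re.compile(r"Signals\.SIGKILL: 9"),
)


def find_oom_lines(log_text: str) -> list[str]:
    """Return the log lines that contain at least one out-of-memory marker."""
    return [
        line
        for line in log_text.splitlines()
        if any(marker.search(line) for marker in OOM_MARKERS)
    ]


if __name__ == "__main__":
    # Shortened sample based on the log above.
    sample = (
        "[build] counting k-mers ...\n"
        "subprocess.CalledProcessError: Command 'kallisto index' died with <Signals.SIGKILL: 9>.\n"
        "slurmstepd: error: Detected 1 oom-kill event(s) in StepId=23442481.batch cgroup.\n"
    )
    for line in find_oom_lines(sample):
        print(line)
```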

@Khajidu
Contributor Author

Khajidu commented Jun 22, 2022

Link to profile configuration if needed: https://github.com/nf-core/configs/blob/master/conf/ifb_core.config

@Khajidu
Contributor Author

Khajidu commented Jun 23, 2022

Solved by giving more memory to the process in the config file.

@apeltzer
Member

OK, then let's add another profile for you that does that automatically for your cluster.

@Khajidu
Contributor Author

Khajidu commented Jun 24, 2022

Good!

@apeltzer
Member

Can you share how you modified the memory in the process config file? Then we can simply copy that.

@Khajidu
Contributor Author

Khajidu commented Jun 24, 2022

I set the memory as follows:

process {
    withLabel:process_high {
        memory = 500.GB
    }
    withName:'KALLISTOBUSTOOLS_REF' {
        memory = 250.GB
    }
    withName:'KALLISTOBUSTOOLS_COUNT' {
        memory = 250.GB
    }
}

@grst
Member

grst commented Jul 5, 2022

Any chance this is now also fixed for lower memory settings, since we added the `-m <MEMORY>` flag to the kallisto module?
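
For illustration, the mismatch described at the top of the thread (the pipeline parameter requires a form like `32.GB`, while the tool expects a form like `32G`) could be bridged by a small conversion like the Python sketch below. This is not the nf-core module's actual code: `to_gigabyte_flag`, the `headroom_gb` parameter, and the use of binary units (1 GB = 1024^3 bytes) are all assumptions made for this example.

```python
import re

# Bytes per unit, assuming binary units (an assumption for this sketch).
UNIT_BYTES = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

# Same shape as the pattern in the validation error, with the number captured.
MEMORY_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)\.?\s*(K|M|G|T)?B$")


def to_gigabyte_flag(memory: str, headroom_gb: int = 0) -> str:
    """Convert a string such as '32.GB' into a whole-gigabyte value such as '32G'.

    Hypothetical helper: `headroom_gb` optionally reserves memory for
    everything else running inside the job.
    """
    match = MEMORY_PATTERN.match(memory)
    if match is None:
        raise ValueError(f"unrecognised memory string: {memory!r}")
    number, unit = match.groups()
    total_bytes = float(number) * UNIT_BYTES[unit or ""]
    whole_gb = int(total_bytes // UNIT_BYTES["G"]) - headroom_gb
    if whole_gb < 1:
        raise ValueError(f"less than 1 GB left after headroom: {memory!r}")
    return f"{whole_gb}G"


if __name__ == "__main__":
    print(to_gigabyte_flag("32.GB"))
    print(to_gigabyte_flag("250.GB", headroom_gb=1))
```

Leaving some headroom is a design choice rather than a requirement: the scheduler kills the whole job once the cgroup limit is exceeded, so passing the full allocation to a single tool leaves no margin for anything else in the job.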

@ogibson

ogibson commented Oct 11, 2022

This issue seems to be resolved. @apeltzer, is there anything else that can be done here?

@apeltzer
Member

No, if it works we can just close here 👍🏻
