Singularity Run issue #95

Closed
yassineS opened this issue Dec 4, 2018 · 36 comments

@yassineS

yassineS commented Dec 4, 2018

Description
When running the workflow on Phoenix using the phoenix profile, it starts fine and automatically downloads the prebuilt Singularity image to work/singularity/nf-core-eager.img. However, some jobs expect the image under a different name: work/singularity/nfcore-eager-latest.img.

I worked around that by copying the image and having it available with both names.
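The workaround above keeps two full copies of the image. A symbolic link might give the same effect without duplicating the file. This is only a minimal sketch: the helper name `link_image_alias` is hypothetical, the file names are the two reported in this issue, and it assumes Singularity on the cluster follows symlinks inside the cache directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expose an existing image under a second file name
# through a relative symlink instead of a full copy.
link_image_alias() {
    local cache_dir=$1 existing=$2 alias_name=$3
    if [ ! -e "$cache_dir/$existing" ]; then
        echo "missing image: $cache_dir/$existing" >&2
        return 1
    fi
    # -s: symbolic link, -f: replace an existing alias,
    # -n: do not descend into an existing link
    ln -sfn "$existing" "$cache_dir/$alias_name"
}

# Example, using the names from the report above:
# link_image_alias work/singularity nf-core-eager.img nfcore-eager-latest.img
```

This does not address why the image disappears later in the run; it only avoids keeping two copies while the cause is investigated.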

Error:

...
[6a/e48ec3] Submitted process > output_documentation
ERROR ~ Error executing process > 'adapter_removal (A18206_S1_L003)'

Caused by:
  Process `adapter_removal (A18206_S1_L003)` terminated with an error exit status (255)

Command executed:

  AdapterRemoval --file1 A18206_S1_L003_R1_001.fastq.gz --file2 A18206_S1_L003_R2_001.fastq.gz --basename A18206_S1_L003_R1_001 --gzip --threads 1 --trimns --trimqualities --adapter1 AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC --adapter2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA --minlength 30 --minquality 20 --minadapteroverlap 1 --collapse
  #Combine files
  zcat *.collapsed.gz *.collapsed.truncated.gz *.singleton.truncated.gz *.pair1.truncated.gz *.pair2.truncated.gz | gzip > A18206_S1_L003_R1_001.combined.fq.gz
  AdapterRemovalFixPrefix A18206_S1_L003_R1_001.combined.fq.gz A18206_S1_L003_R1_001.combined.prefixed.fq.gz
  rm A18206_S1_L003_R1_001.combined.fq.gz

Command exit status:
  255

Command output:
  (empty)

Command error:
  ERROR  : Image path /fast/users/a1222423/eager/work/singularity/nfcore-eager-latest.img doesn't exist: No such file or directory
  ABORT  : Retval = 255


Work dir:
  /fast/users/a1222423/eager/work/50/0be0767609294bad218271492bab73

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details
Execution cancelled -- Finishing pending tasks before exit
^C
[nf-core/eager] Pipeline Complete
WARN: Killing pending tasks (27)
@yassineS
Author

yassineS commented Dec 6, 2018

The issue still persists. I tried pulling the Singularity image manually, and I also tried the automated way. At first it works fine and downloads the image to work/singularity/; however, at some point during execution the image just disappears and the jobs start crashing.

ERROR ~ Error executing process > 'adapter_removal (A19948_S1_L008)'

Caused by:
  Process `adapter_removal (A19948_S1_L008)` terminated with an error exit status (255)

Command executed:

  AdapterRemoval --file1 A19948_S1_L008_R1_001.fastq.gz --file2 A19948_S1_L008_R2_001.fastq.gz --basename A19948_S1_L008_R1_001 --gzip --threads 2 --trimns --trimqualities --adapter1 AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC --adapter2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA --minlength 30 --minquality 20 --minadapteroverlap 1 --collapse
  #Combine files
  zcat *.collapsed.gz *.collapsed.truncated.gz *.singleton.truncated.gz *.pair1.truncated.gz *.pair2.truncated.gz | gzip > A19948_S1_L008_R1_001.combined.fq.gz
  AdapterRemovalFixPrefix A19948_S1_L008_R1_001.combined.fq.gz A19948_S1_L008_R1_001.combined.prefixed.fq.gz
  rm A19948_S1_L008_R1_001.combined.fq.gz

Command exit status:
  255

Command output:
  (empty)

Command error:
  ERROR  : Image path /fast/users/a1222423/eager/work/singularity/nfcore-eager-latest.img doesn't exist: No such file or directory
  ABORT  : Retval = 255


Work dir:
  /fast/users/a1222423/eager/work/2c/0ffb5934dcf9745340d82376c7761f

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details
Execution cancelled -- Finishing pending tasks before exit

@apeltzer
Member

apeltzer commented Dec 6, 2018

This is something I've never seen before, to be honest. The image is set once for the entire pipeline and should contain all the required dependencies in a single image file, see here:

container = params.container

Can you provide the command you used to run it?

@yassineS
Author

yassineS commented Dec 6, 2018

I don't understand it either, it's very strange. Here's the command:

nextflow run nf-core/eager --pairedEnd --reads "/data/acad/HC/01-datafiles/*_R{1,2}_001.fastq.gz" --trim_bam 3 --snpcapture false --udg true —udg_type "Half" --bwamem --genome GRCh37 --saveReference true  -profile phoenix -r dev -name HC

@apeltzer
Member

apeltzer commented Dec 9, 2018

I fear this has something to do with the PHOENIX cluster's way of loading modules.

The container is defined only in base.config and additionally imported in nextflow.config. This is kept in sync automatically, and all processes use the same container.

Tests run locally and on multiple HPC systems suggest this is more an issue with the way PHOENIX requires setting SINGULARITY_BINDPATH...

@pditommaso Maybe you have an idea what's going on here?

@pditommaso

@apeltzer Could you summarise the problem?

@apeltzer
Member

Yes! Give me a few minutes!

@apeltzer
Member

@yassineS is trying to run the pipeline on the Phoenix cluster. Apparently this cluster requires setting envWhitelist='SINGULARITY_BINDPATH' in the config, as indicated here; otherwise the work directories won't be found.

envWhitelist='SINGULARITY_BINDPATH'

However, when doing this, apparently the pipeline doesn't find the pulled image properly anymore:


Command error:
  ERROR  : Image path /fast/users/a1222423/eager/work/singularity/nfcore-eager-latest.img doesn't exist: No such file or directory
  ABORT  : Retval = 255


Work dir:
  /fast/users/a1222423/eager/work/2c/0ffb5934dcf9745340d82376c7761f

#95 (comment)

I don't really understand what's going on there. I'm just assuming it has something to do with this BINDPATH not being the same inside the container, so that the image pulled by the main Nextflow process can no longer be found when a job tries to use it?

@pditommaso

What's the value of the SINGULARITY_BINDPATH variable? Is it defined in the host environment?

@apeltzer
Member

Yes, apparently it's defined:

#68 (comment)

@apeltzer
Member

Hm, reading this through, I have the feeling that we can get rid of this envWhitelist=... and simply set:

autoMounts = true

@pditommaso

I think so. Alternatively you can try to specify that path using process.containerOptions = '--bind <path>' (not sure if it's --bind or -B).
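On the `--bind` versus `-B` question: Singularity accepts both, `-B` being the short form of `--bind`. Before changing the pipeline config, the bind could be checked by hand outside Nextflow. The following is a rough sketch only: `check_bind` is a hypothetical helper, and the example paths are simply the ones mentioned in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check whether a host directory is visible inside
# the container when it is bound explicitly.
check_bind() {
    local image=$1 host_dir=$2
    if ! command -v singularity >/dev/null 2>&1; then
        echo "singularity not found on PATH" >&2
        return 127
    fi
    # --bind (short form -B) with a single path mounts host_dir at the
    # same path inside the container
    singularity exec --bind "$host_dir" "$image" ls -d "$host_dir"
}

# Example, using paths mentioned in this thread:
# check_bind /fast/users/a1222423/eager/work/singularity/nfcore-eager-latest.img /fast
```

If the directory is listed, the bind itself works and the problem is more likely in how the image path is resolved or in the image being removed.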

@apeltzer
Member

Pushed an update @yassineS - can you please check this again? According to @uoabowen's comment here, we could simply rely on the automount feature: #68 (comment)

I removed the envWhitelist - can you please test again using the current -r dev branch and your -profile phoenix?

@yassineS
Author

yassineS commented Dec 11, 2018

Unfortunately, the newly pushed changes did not fix the issue. I still get:

ERROR ~ Error executing process > 'adapter_removal (A18181_S1_L005)'

Caused by:
  Process `adapter_removal (A18181_S1_L005)` terminated with an error exit status (255)

Command executed:

  AdapterRemoval --file1 A18181_S1_L005_R1_001.fastq.gz --file2 A18181_S1_L005_R2_001.fastq.gz --basename A18181_S1_L005_R1_001 --gzip --threads 1 --trimns --trimqualities --adapter1 AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC --adapter2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA --minlength 30 --minquality 20 --minadapteroverlap 1 --collapse
  #Combine files
  zcat *.collapsed.gz *.collapsed.truncated.gz *.singleton.truncated.gz *.pair1.truncated.gz *.pair2.truncated.gz | gzip > A18181_S1_L005_R1_001.combined.fq.gz
  AdapterRemovalFixPrefix A18181_S1_L005_R1_001.combined.fq.gz A18181_S1_L005_R1_001.combined.prefixed.fq.gz
  rm A18181_S1_L005_R1_001.combined.fq.gz

Command exit status:
  255

Command output:
  (empty)

Command error:
  ERROR  : Image path /fast/users/a1222423/eager/work/singularity/nfcore-eager-latest.img doesn't exist: No such file or directory
  ABORT  : Retval = 255


Work dir:
  /fast/users/a1222423/eager/work/11/40dcd6c0fb5afbb26747e5ebe6b4d5

Tip: when you have fixed the problem you can continue the execution appending to the nextflow command line the option `-resume`

 -- Check '.nextflow.log' file for details
Execution cancelled -- Finishing pending tasks before exit

@uoabowen, any clues what we might be doing wrong with singularity?

@uoabowen

@yassineS I just ran a Singularity image for OpenFOAM, a CFD application, under /fast/users/a1XXXXXX/eager/work/singularity/. It worked well.

I tried to reproduce your issue, but there is no subdirectory named HC under /data/acad/.

@apeltzer I have changed our singularity.conf to auto bind /fast and /data since #68

The test with OpenFOAM showed the auto bind should still work.

When I get any test case for the pipeline, I will test and update this.

@yassineS
Author

Hi @uoabowen I masked the true path as these are protected human samples. You can use the screen session under my account in l01 to test if you want to. The issue still persists.

@uoabowen

@yassineS I am testing it with your screen session. The test case has been running for around 1 hour and I haven't hit any errors yet. I will update if any appear.

@uoabowen

uoabowen commented Dec 11, 2018

I hit the error after the test case had been running for around 2 hours.

[a1XXXXXX@l01 eager]$ nextflow run nf-core/eager --pairedEnd --reads "/data/XXXX/HC/01-datafiles/*_R{1,2}_001.fastq.gz" ?trim_bam 3 --snpcapture false --udg true ?udg_type Half --bwamem --genome GRCh37 ?saveReference true ?-profile phoenix -r dev -name AHP_HC_1341                                                                                       
N E X T F L O W  ~  version 18.10.1                                                                                                                                                      
Launching `nf-core/eager` [AHP_HC_1341] - revision: b07635a749 [dev]
=========================================                                                                                                                                                
 nf-core/eager v2.0.3dev
=========================================                                                                                                                                                
Pipeline Name  : nf-core/eager
Pipeline Version: 2.0.3dev                                                                                                                                                               
Run Name       : AHP_HC_1341
Reads          : /data/XXXX/HC/01-datafiles/*_R{1,2}_001.fastq.gz                                                                                                     
Fasta Ref      : false
Data Type      : Paired-End                                                                                                                                                              
Max Memory     : 128 GB
Max CPUs       : 16
Max CPUs       : 16
Max Time       : 10d
Output dir     : ./results                                                                                                                                                               
Working dir    : /fast/users/a1XXXXXX/eager/work
Container Engine: singularity                                                                                                                                                            
Container      : nfcore/eager:latest
Current home   : /home/a1XXXXXX                                                                                                                                                          
Current user   : a1XXXXXX
Current path   : /home/a1XXXXXX/fastdir/eager                                                                                                                                            
Script dir     : /home/a1XXXXXX/.nextflow/assets/nf-core/eager
Config Profile : standard                                                                                                                                                                
=========================================
[warm up] executor > SLURM                                                                                                                                                               
WARN: Singularity cache directory has not been defined -- Remote image will be stored in the path: /fast/users/a1XXXXXX/eager/work/singularity
Pulling Singularity image docker://nfcore/eager:latest [cache /fast/users/a1XXXXXX/eager/work/singularity/nfcore-eager-latest.img]                                                       
[e1/df7d52] Submitted process > fastqc (A19942_S1_L008)
[cc/5f3427] Submitted process > fastqc (A18170_S1_L006)                                                                                                                                  
[1b/5f9a1f] Submitted process > fastqc (A14936_S1_L006)
[f4/d6665f] Submitted process > fastqc (A19927_S1_L005)                                                                                                                                  
[af/872a28] Submitted process > fastqc (A19934_S1_L006)
[87/f23ac3] Submitted process > output_documentation                                                                                                                                     
[8f/c6e73c] Submitted process > fastqc (A14960_S1_L008)
[e7/d7f569] Submitted process > fastqc (A19938_S1_L007)                                                                                                                                  
[f8/bbc8b2] Submitted process > get_software_versions
[3e/36fac7] Submitted process > fastqc (A19925_S1_L004)                                                                                                                                  
[cb/355864] Submitted process > fastqc (A18184_S1_L006)
[80/21adf8] Submitted process > fastqc (A14937_S1_L007)                                                                                                                                  
[24/a17cee] Submitted process > fastqc (A18206_S1_L003)
[53/84936d] Submitted process > fastqc (A19948_S1_L008)                                                                                                                                  
[dc/b7a8ef] Submitted process > fastqc (A18190_S1_L002)
[8d/e4707c] Submitted process > fastqc (A18171_S1_L008)                                                                                                                                  
[7b/d2fc80] Submitted process > fastqc (A18181_S1_L005)
[85/2686b5] Submitted process > fastqc (A14971_S1_L007)                                                                                                                                  
[cc/ee4ca0] Submitted process > adapter_removal (A14936_S1_L006)
[0d/2f33f3] Submitted process > adapter_removal (A18190_S1_L002)                                                                                                                         
[11/0bb59b] Submitted process > adapter_removal (A19942_S1_L008)
[21/9941f9] Submitted process > adapter_removal (A19927_S1_L005)                                                                                                                         
[33/935904] Submitted process > adapter_removal (A19925_S1_L004)
[81/c10133] Submitted process > adapter_removal (A18206_S1_L003)                                                                                                                         
[b8/c25c89] Submitted process > adapter_removal (A19948_S1_L008)
[4c/41fb5e] Submitted process > adapter_removal (A14971_S1_L007)                                                                                                                         
[8e/4bdf4e] Submitted process > adapter_removal (A19938_S1_L007)
[65/f49585] Submitted process > adapter_removal (A14960_S1_L008)                                                                                                                         
[5c/e759c2] Submitted process > adapter_removal (A19934_S1_L006)
[c6/3733b1] Submitted process > adapter_removal (A18184_S1_L006)                                                                                                                         
[f2/976544] Submitted process > adapter_removal (A18171_S1_L008)
[a3/dfc962] Submitted process > adapter_removal (A18181_S1_L005)                                                                                                                         
[c4/d3e130] Submitted process > adapter_removal (A14937_S1_L007)
[0f/466552] Submitted process > adapter_removal (A18170_S1_L006)
[9e/b02886] Submitted process > fastqc_after_clipping (A19938_S1_L007_R1_001.combined.prefixed.fq)                                                                                                              
ERROR ~ Error executing process > 'fastqc_after_clipping (A19938_S1_L007_R1_001.combined.prefixed.fq)'                                                                                                          
                                                                                                                                                                                                                
Caused by:                                                                                                                                                                                                      
  Process `fastqc_after_clipping (A19938_S1_L007_R1_001.combined.prefixed.fq)` terminated with an error exit status (255)                                                                                       
                                                                                                                                                                                                                
Command executed:                                                                                                                                                                                               
                                                                                                                                                                                                                
  fastqc -q A19938_S1_L007_R1_001.combined.prefixed.fq.gz                                                                                                                                                       
                                                                                                                                                                                                                
Command exit status:                                                                                                                                                                                            
  255                                                                                                                                                                                                           
                                                                                                                                                                                                                
Command output:                                                                                                                                                                                                 
  (empty)                                                                                                                                                                                                       
                                                                                                                                                                                                                
Command error:                                                                                                                                                                                                  
  ERROR  : Image path /fast/users/a1222423/eager/work/singularity/nfcore-eager-latest.img doesn't exist: No such file or directory                                                                              
  ABORT  : Retval = 255                                                                                                                                                                                         
                                                                                                                                                                                                                
                                                                                                                                                                                                                
Work dir:                                                                                                                                                                                                       
  /fast/users/a1222423/eager/work/9e/b028867d9fb85ab5c03cc6baa3f061                                                                                                                                             
                                                                                                                                                                                                                
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`                                                                                               
                                                                                                                                                                                                                
 -- Check '.nextflow.log' file for details                                                                                                                                                                      
Execution cancelled -- Finishing pending tasks before exit                                                                                                                                                      
[cc/5f3427] NOTE: Process `fastqc (A18170_S1_L006)` terminated with an error exit status (143) -- Execution is retried (1)                                                                                      
[8f/c6e73c] NOTE: Process `fastqc (A14960_S1_L008)` terminated with an error exit status (143) -- Execution is retried (1)                                                                                      
[80/21adf8] NOTE: Process `fastqc (A14937_S1_L007)` terminated with an error exit status (143) -- Execution is retried (1)                                                                                      
[53/84936d] NOTE: Process `fastqc (A19948_S1_L008)` terminated with an error exit status (143) -- Execution is retried (1)                                                                                      
[dc/b7a8ef] NOTE: Process `fastqc (A18190_S1_L002)` terminated with an error exit status (143) -- Execution is retried (1)                                                                                      
[7b/d2fc80] NOTE: Process `fastqc (A18181_S1_L005)` terminated with an error exit status (143) -- Execution is retried (1)                                                                                      
[cc/ee4ca0] NOTE: Process `adapter_removal (A14936_S1_L006)` terminated with an error exit status (143) -- Execution is retried (1)                                                                             
[0d/2f33f3] NOTE: Process `adapter_removal (A18190_S1_L002)` terminated with an error exit status (143) -- Execution is retried (1)                                                                             
[11/0bb59b] NOTE: Process `adapter_removal (A19942_S1_L008)` terminated with an error exit status (143) -- Execution is retried (1)                                                                             
[21/9941f9] NOTE: Process `adapter_removal (A19927_S1_L005)` terminated with an error exit status (143) -- Execution is retried (1)                                                                             
[33/935904] NOTE: Process `adapter_removal (A19925_S1_L004)` terminated with an error exit status (143) -- Execution is retried (1)                                                                             
[81/c10133] NOTE: Process `adapter_removal (A18206_S1_L003)` terminated with an error exit status (143) -- Execution is retried (1)                                                                             
[b8/c25c89] NOTE: Process `adapter_removal (A19948_S1_L008)` terminated with an error exit status (143) -- Execution is retried (1)                                                                             
[4c/41fb5e] NOTE: Process `adapter_removal (A14971_S1_L007)` terminated with an error exit status (143) -- Execution is retried (1)                                                                             
[65/f49585] NOTE: Process `adapter_removal (A14960_S1_L008)` terminated with an error exit status (143) -- Execution is retried (1)                                                                             
[5c/e759c2] NOTE: Process `adapter_removal (A19934_S1_L006)` terminated with an error exit status (143) -- Execution is retried (1)                                                                             
[c6/3733b1] NOTE: Process `adapter_removal (A18184_S1_L006)` terminated with an error exit status (143) -- Execution is retried (1)                                                                             
[f2/976544] NOTE: Process `adapter_removal (A18171_S1_L008)` terminated with an error exit status (143) -- Execution is retried (1)                                                                             
[a3/dfc962] NOTE: Process `adapter_removal (A18181_S1_L005)` terminated with an error exit status (143) -- Execution is retried (1)                                                                             
[c4/d3e130] NOTE: Process `adapter_removal (A14937_S1_L007)` terminated with an error exit status (143) -- Execution is retried (1)                                                                             
[0f/466552] NOTE: Process `adapter_removal (A18170_S1_L006)` terminated with an error exit status (143) -- Execution is retried (1)

When I went to the work dir and ran the command as instructed, I got:

[a1222423@l01 eager]$ cd /fast/users/a1222423/eager/work/9e/b028867d9fb85ab5c03cc6baa3f061
[a1222423@l01 b028867d9fb85ab5c03cc6baa3f061]$ bash .command.run
ERROR  : Image path /fast/users/a1222423/eager/work/singularity/nfcore-eager-latest.img doesn't exist: No such file or directory
ABORT  : Retval = 255

And when I checked /home/a1222423/fastdir/eager/work/singularity, I got:

[a1222423@l01 singularity]$ pwd
/home/a1222423/fastdir/eager/work/singularity
[a1222423@l01 singularity]$ ls
pitzDaily.org

pitzDaily.org is the directory in which I ran the OpenFOAM Singularity image. The image nfcore-eager-latest.img should be there as well, but it is not.

@yassineS
Author

When the workflow starts the image initially does exist under workflow/singularity/nf-core-eager.img, but at some point, it just disappears.

@uoabowen

I tested it many times in the past few days. I found that nfcore-eager-latest.img is pulled at the beginning of the pipeline run, but after a couple of minutes the image is removed. When the pipeline reaches a point where it needs the .img again, it throws the error.

Besides using the default ./work/singularity as the image cachedir, I also tried setting NXF_SINGULARITY_CACHEDIR to store the image in a specific directory, but the image was still removed from that directory.

I even tried manually pulling the image and copying it to the cachedir during the run, but as long as the pipeline hasn't finished, it gets removed.

In addition, I tried the --with-singularity option to make the pipeline use an existing image. After several minutes, the existing image was still removed.
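Since the image is removed at some point mid-run, recording the time of removal might help match it against `.nextflow.log` or the scheduler logs. A minimal sketch, meant to run in a second terminal while the pipeline is running; `watch_image` is a hypothetical helper and the default 5-second polling interval is an arbitrary assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a file and print a timestamp once it is gone.
watch_image() {
    local image=$1 interval=${2:-5}
    if [ ! -e "$image" ]; then
        echo "$(date '+%F %T') not present at start: $image"
        return 1
    fi
    # Loop until the file no longer exists
    while [ -e "$image" ]; do
        sleep "$interval"
    done
    echo "$(date '+%F %T') image removed: $image"
}

# Example, using the path from the error messages in this thread:
# watch_image /fast/users/a1222423/eager/work/singularity/nfcore-eager-latest.img
```

The timestamp only narrows down when the removal happens; finding out which process deletes the file would still need the Nextflow and SLURM logs from around that time.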

@apeltzer
Member

apeltzer commented Mar 2, 2019

Hi @yassineS - should we maybe find another date to fix this together? I'd say this is doable, but probably easier in a joint Skype session where we can see what's going on... send me an e-mail 👍

@yassineS
Author

yassineS commented Mar 3, 2019

I agree @apeltzer, and maybe we should get @uoabowen in as well if he has the time.

@jfy133
Member

jfy133 commented Oct 14, 2019

@yassineS @uoabowen Did you guys resolve this in the end, or have you tried a later release?

If you're not interested in pursuing this, we will close the issue for now.

@yassineS
Author

We still are, we just didn't look into it for a while. @uoabowen shall we kick off another round of tests?

@uoabowen

uoabowen commented Oct 15, 2019 via email

@yassineS
Author

yassineS commented Oct 20, 2019 via email

@uoabowen

uoabowen commented Oct 24, 2019 via email

@jfy133
Member

jfy133 commented Oct 24, 2019

Do you have the same issue when running other nf-core/nextflow pipelines? If yes, maybe we can bump the issue to the wider nf-core community?

Also, does the same issue occur if you specify the singularity profile (alongside test) in addition to --with-singularity?

I also see that you specify a custom nextflow config. Is the test profile you specify derived from that config? Or is that still meant to be the EAGER default test profile?

@yassineS
Author

@jfy133 I tested nf-core/deepvariant and it runs just fine.

I tested EAGER with -with-singularity and ran into the same behaviour: EAGER runs a few jobs and then the Singularity image just disappears.

@jfy133
Member

jfy133 commented Oct 28, 2019

@yassineS could you supply the command you used for deepvariant?

I wonder if -with-singularity plus a custom config (-c phoenix241019.conf) might be conflicting with each other? Or one is overriding the other?

If you haven't already, maybe you could go to the nf-core Slack, join #eager, and describe your exact setup and the commands you've tried. I think we need singularity/infrastructure experts...

@apeltzer
Member

Normally -with-singularity can override whatever is specified in the config. Either specify everything, including the container to use, in a profile, or specify what you need on the CLI. Ideally, don't mix both...

Would be great to see you in the nf-core Slack, in the #eager channel, to discuss further there; it's much faster to debug and resolve that way. Pretty sure it's just variables overriding each other now and we'll get there quickly. The deepvariant pipeline you ran uses the centralized nf-core/configs, so if you just run eager this way:

nextflow pull nf-core/eager
nextflow run nf-core/eager -profile phoenix,test

That should then also work. Don't add any other options, just the commands above, and let's see what kind of error that produces :-)
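
To illustrate the "specify everything in one place" option, here is a minimal sketch that writes a custom config holding the executor, the container engine and the image together, so that no -with-singularity flag is needed on the command line. The helper name `write_config`, the file name `local.config` and the image path are hypothetical placeholders, not values confirmed for Phoenix.

```shell
#!/usr/bin/env bash
# Sketch only: keep executor, container engine and image in ONE config file.
# write_config is a hypothetical helper; the paths used below are placeholders.
write_config() {
    # $1 = config file to write, $2 = absolute path to the Singularity image
    cat > "$1" <<EOF
process {
  executor = 'slurm'
  container = '$2'
}
singularity.enabled = true
EOF
}

# Example usage (commented out because it needs Nextflow and a SLURM cluster):
# write_config local.config /path/to/nf-core-eager-2.0.7.simg
# nextflow run nf-core/eager -profile test --pairedEnd -c local.config
```

The point of the sketch is only that the container is named once; whether this resolves the disappearing image is not established in this thread.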

@jfy133
Member

jfy133 commented Oct 28, 2019

In addition:

Maybe start with a fresh EAGER install: rm -r ~/.nextflow/assets/nf-core/eager

And please give us your entire log file
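
The clean-start steps above can be sketched roughly as follows. `drop_cached_pipeline` is a hypothetical helper; it assumes pulled pipelines live under `$NXF_ASSETS`, falling back to `~/.nextflow/assets` (the location used in the rm command above). The log file name `eager_run.log` is also just a placeholder.

```shell
#!/usr/bin/env bash
# Sketch only: remove the locally cached copy of a pipeline so the next pull is fresh.
drop_cached_pipeline() {
    local name="$1"   # e.g. nf-core/eager
    local assets="${NXF_ASSETS:-$HOME/.nextflow/assets}"
    if [ -z "$name" ]; then
        echo "usage: drop_cached_pipeline <org/pipeline>" >&2
        return 2
    fi
    rm -rf -- "$assets/$name"
}

# Typical sequence (commented out because it needs Nextflow and cluster access):
# drop_cached_pipeline nf-core/eager
# nextflow pull nf-core/eager
# nextflow run nf-core/eager -profile phoenix,test
# cp .nextflow.log eager_run.log   # full log written by Nextflow in the launch directory
```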

@yassineS
Author

@jfy133 that's exactly what I did: I removed all old instances of eager and even freshly re-installed nextflow. I didn't use the phoenix config; I tried -with-singularity /path/to/image.simg and -profile singularity.

$ cat config
process {
  executor = 'SLURM'
}

singularity.enabled = true
process.container = "/home/a1222423/fastdir/nf/nf-core-eager-2.0.7.simg"

$ ll *.simg
-rwxr-xr-x 1 a1222423 a1222423 1.4G Oct 29 09:40 nf-core-eager-2.0.7.simg

$ ./nextflow run nf-core/eager -profile test --pairedEnd -with-singularity ./nf-core-eager-2.0.7.simg
N E X T F L O W  ~  version 19.10.0
Launching `nf-core/eager` [happy_cori] - revision: b8d3dec3f0 [master]
----------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/eager v2.0.7
----------------------------------------------------
Pipeline Name     : nf-core/eager
Pipeline Version  : 2.0.7
Run Name          : happy_cori
Reads             : data/*{1,2}.fastq.gz
Fasta Ref         : https://raw.githubusercontent.com/nf-core/test-datasets/eager2/reference/Mammoth_MT_Krause.fasta
BAM Index Type    : CSI
Data Type         : Paired-End
Skip Collapsing   : No
Skip Trimming     : No
Output stripped fastq: No
Max Memory        : 6 GB
Max CPUs          : 2
Max Time          : 2d
Output dir        : ./results
Working dir       : /fast/users/a1222423/nf/work
Container Engine  : singularity
Container         : ./nf-core-eager-2.0.7.simg
Current home      : /home/a1222423
Current user      : a1222423
Current path      : /home/a1222423/fastdir/nf
Script dir        : /home/a1222423/.nextflow/assets/nf-core/eager
Config Profile    : test
Config Description: Minimal test dataset to check pipeline function
----------------------------------------------------
executor >  SLURM (20)
executor >  SLURM (20)
executor >  SLURM (21)
executor >  SLURM (21)
[d6/33195a] process > makeBWAIndex (Mammoth_MT_Krause.fasta)                                                        [100%] 1 of 1 ✔
[77/d2c63e] process > makeFastaIndex (Mammoth_MT_Krause.fasta)                                                      [100%] 1 of 1 ✔
[28/d30b61] process > makeSeqDict (Mammoth_MT_Krause.fasta)                                                         [100%] 1 of 1 ✔
[-        ] process > convertBam                                                                                    -
[f5/16ac6e] process > fastqc (JK2782_TGGCCGATCAACGA_L008)                                                           [100%] 2 of 2 ✔
[-        ] process > fastp                                                                                         -
[7b/e78c62] process > adapter_removal (JK2785_TGGCCGATCAACGA_L008)                                                  [100%] 2 of 2 ✔
[98/eb4e83] process > fastqc_after_clipping (JK2785_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq)                [100%] 2 of 2 ✔
[81/53959a] process > bwa (JK2785_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq)                                  [100%] 2 of 2 ✔
[-        ] process > circulargenerator                                                                             -
[-        ] process > circularmapper                                                                                -
[-        ] process > bwamem                                                                                        -
[a9/274ad0] process > samtools_flagstat (JK2785_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.combined.fq.sorted) [100%] 2 of 2 ✔
[5b/c12308] process > samtools_filter (JK2785_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.combined.fq.sorted... [100%] 2 of 2 ✔
[-        ] process > strip_input_fastq                                                                             -
[53/e727e8] process > samtools_flagstat_after_filter (JK2785_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.com... [100%] 2 of 2 ✔
[f2/3dc5a3] process > dedup (JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.combined.fq.sorted.bam.filte... [100%] 1 of 1
[-        ] process > preseq                                                                                        -
[a7/02ddcf] process > damageprofiler (JK2785_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.combined.fq.sorted)    [100%] 2 of 2, failed: 2 ✘
[-        ] process > qualimap                                                                                      -
[-        ] process > markDup                                                                                       -
[-        ] process > pmdtools                                                                                      -
[-        ] process > bam_trim                                                                                      -
[e1/72706b] process > output_documentation                                                                          [100%] 1 of 1 ✔
[-        ] process > get_software_versions                                                                         -
[-        ] process > multiqc                                                                                       -

Error executing process > 'damageprofiler (JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.combined.fq.sorted)'

Caused by:
  Missing output file(s) `JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.combined.fq.sorted/dmgprof.json` expected by process `damageprofiler (JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.combined.fq.sorted)`

Command executed:

  damageprofiler -i JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.combined.fq.sorted.bam -r Mammoth_MT_Krause.fasta -l 100 -t 15 -o .

Command exit status:
  0

Command output:
  DamageProfiler v0.3.9

Work dir:
  /fast/users/a1222423/nf/work/21/b179b5823b6238985f8d8979d0826a

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

First, the test doesn't run fully.

Second, when I try to run through other test data:

$ ./nextflow run nf-core/eager --reads 'test_data_lazaridis17/I007*.fq.gz' --fasta /data/acad/Refs/Homo_sapiens/GATK/b37/human_g1k_v37_decoy.fasta --bwa_index /data/acad/Refs/Homo_sapiens/GATK/b37/human_g1k_v37_decoy.fasta --seq_dict /data/acad/Refs/Homo_sapiens/GATK/b37/human_g1k_v37_decoy.dict --fasta_index /data/acad/Refs/Homo_sapiens/GATK/b37/human_g1k_v37_decoy.fasta.fai --complexity_filter_poly_g --circularmapper --trim_bam 3 --bamutils_softclip --singleEnd --skip_adapterremoval -with-singularity nf-core-eager-2.0.7.simg

N E X T F L O W  ~  version 19.10.0
Launching `nf-core/eager` [big_bardeen] - revision: b8d3dec3f0 [master]
----------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/eager v2.0.7
----------------------------------------------------
Pipeline Name     : nf-core/eager
Pipeline Version  : 2.0.7
Run Name          : big_bardeen
Reads             : test_data_lazaridis17/I007*.fq.gz
Fasta Ref         : /data/acad/Refs/Homo_sapiens/GATK/b37/human_g1k_v37_decoy.fasta
BAM Index Type    : CSI
BWA Index         : /data/acad/Refs/Homo_sapiens/GATK/b37/human_g1k_v37_decoy.fasta
Data Type         : Single-End
Skip Collapsing   : No
Skip Trimming     : No
Output stripped fastq: No
Max Memory        : 128 GB
Max CPUs          : 16
Max Time          : 10d
Output dir        : ./results
Working dir       : /fast/users/a1222423/nf/work
Container Engine  : singularity
Container         : nf-core-eager-2.0.7.simg
Current home      : /home/a1222423
Current user      : a1222423
Current path      : /home/a1222423/fastdir/nf
Script dir        : /home/a1222423/.nextflow/assets/nf-core/eager
Config Profile    : standard
----------------------------------------------------
executor >  SLURM (12)
[-        ] process > makeFastaIndex                                    -
[-        ] process > makeSeqDict                                       -
[-        ] process > convertBam                                        -
[75/f8a353] process > fastqc (I0070)                                    [ 50%] 1 of 2
[7c/d21503] process > fastp (I0071)                                     [100%] 2 of 2 ✔
[-        ] process > fastqc_after_clipping                             -
[-        ] process > bwa                                               -
[13/8950e7] process > circulargenerator (human_g1k_v37_decoy_500.fasta) [100%] 1 of 1 ✔
[27/dbe728] process > circularmapper (I0070)                            [ 25%] 1 of 4
[-        ] process > bwamem                                            -
[0b/15a48f] process > samtools_flagstat (I0070.sorted)                  [  0%] 0 of 1
[40/b99abb] process > samtools_filter (I0070.sorted.bam)                [  0%] 0 of 1
[-        ] process > strip_input_fastq                                 -
[-        ] process > samtools_flagstat_after_filter                    -
[-        ] process > dedup                                             -
[-        ] process > preseq                                            -
[-        ] process > damageprofiler                                    -
[-        ] process > qualimap                                          -
[-        ] process > markDup                                           -
[-        ] process > pmdtools                                          -
[-        ] process > bam_trim                                          -
[e9/33a63b] process > output_documentation                              [100%] 1 of 1 ✔
[-        ] process > get_software_versions                             -
[-        ] process > multiqc                                           -
Pulling Singularity image docker://nf-core-eager-2.0.7.simg [cache /fast/users/a1222423/nf/work/singularity/nf-core-eager-2.0.7.simg.img]
[nf-core/eager] Pipeline completed with errors
WARN: Singularity cache directory has not been defined -- Remote image will be stored in the path: /fast/users/a1222423/nf/work/singularity
WARN: Killing pending tasks (5)
WARN: Killing pending tasks (4)
Error executing process > 'samtools_filter (I0070.sorted.bam)'

Caused by:
  Failed to pull singularity image
  command: singularity pull  --name nf-core-eager-2.0.7.simg.img docker://nf-core-eager-2.0.7.simg > /dev/null
  status : 1
  message:
    WARNING: pull for Docker Hub is not guaranteed to produce the
    WARNING: same image on repeated pull. Use Singularity Registry
    WARNING: (shub://) to pull exactly equivalent images.
    ERROR UNAUTHORIZED: authentication required
    ERROR Check existence, naming, and permissions
    ERROR: pulling container failed!
java.lang.IllegalStateException: Failed to pull singularity image
  command: singularity pull  --name nf-core-eager-2.0.7.simg.img docker://nf-core-eager-2.0.7.simg > /dev/null
  status : 1
  message:
    WARNING: pull for Docker Hub is not guaranteed to produce the
    WARNING: same image on repeated pull. Use Singularity Registry
    WARNING: (shub://) to pull exactly equivalent images.
    ERROR UNAUTHORIZED: authentication required
    ERROR Check existence, naming, and permissions
    ERROR: pulling container failed!

$ ll
total 32K
-rw-rw-r--  1 a1222423 a1222423  134 Oct 29 10:00 config
-rwx--x--x  1 a1222423 a1222423  16K Oct 29 09:51 nextflow
drwxrwxr-x 11 a1222423 a1222423 4.0K Oct 29 11:11 results
drwxrwxr-x  2 a1222423 a1222423 4.0K Oct 29 09:43 test_data_lazaridis17
drwxrwxr-x 53 a1222423 a1222423 4.0K Oct 29 11:15 work
  • I specified the fasta dict file and indices, but the pipeline disregards them and rebuilds them.
  • The image just disappears in the middle of the run.

I attached the log file.
nextflow_failed_singularity.log
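
The second log shows the bare name nf-core-eager-2.0.7.simg being pulled as docker://nf-core-eager-2.0.7.simg, together with a warning that no Singularity cache directory is defined. One thing that may be worth trying, offered as an untested sketch, is to pass the image as an absolute path and to set `NXF_SINGULARITY_CACHEDIR` explicitly. `resolve_image` is a hypothetical helper and the cache location is a placeholder; the thread does not confirm that this avoids the Docker Hub fallback.

```shell
#!/usr/bin/env bash
# Sketch only: print the absolute path of a local image file, or fail if it is missing.
resolve_image() {
    local img="$1"
    if [ ! -f "$img" ]; then
        echo "image not found: $img" >&2
        return 1
    fi
    # Resolve the containing directory so the result is absolute
    # no matter how the path was written on the command line.
    echo "$(cd "$(dirname "$img")" && pwd -P)/$(basename "$img")"
}

# Example usage (commented out because it launches a pipeline run):
# export NXF_SINGULARITY_CACHEDIR="$HOME/singularity-cache"   # placeholder location
# ./nextflow run nf-core/eager -profile test --pairedEnd \
#     -with-singularity "$(resolve_image ./nf-core-eager-2.0.7.simg)"
```

Failing early when the file is missing also makes it obvious if the image has disappeared before the run starts, rather than partway through it.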

@jfy133
Member

jfy133 commented Oct 29, 2019

@yassineS

  1. Thanks for the log, but could you please try Alex's suggestion first? i.e.
nextflow pull nf-core/eager
nextflow run nf-core/eager -profile phoenix,test

(Is there a reason why you are not using the phoenix config? It would help us if you were using it, as then Alex and I would know exactly what is going on with the configs you are using.)

  2. It looks like we are making progress though (hopefully). The failure in your first run is a known bug in the stable version of EAGER, caused by an out-of-sync version of DamageProfiler.

Could you try:

./nextflow run nf-core/eager -profile test --pairedEnd -with-singularity './nf-core-eager-2.0.7.simg' -r fixes-for-2.1.0

(this is our 'edge' version, which has a lot of bug fixes and changes).

  3. I also realise that none of the paths in the second example you just gave are in quotes. I wonder whether the singularity image path needs to be quoted as well. Could you re-run the second command with all paths in quotes?

I realise now that this is mis-documented, and I will go and fix that now.

@yassineS
Author

Note: I set up nf-core Slack; I'm Yassine Souilmi on there.

Alright, I tried what you suggested above:

1- This crashes right away:

$ nextflow run nf-core/eager -profile phoenix,test --pairedEnd
N E X T F L O W  ~  version 19.10.0
Launching `nf-core/eager` [silly_sax] - revision: b8d3dec3f0 [master]
----------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/eager v2.0.7
----------------------------------------------------
Pipeline Name     : nf-core/eager
Pipeline Version  : 2.0.7
Run Name          : silly_sax
Reads             : data/*{1,2}.fastq.gz
Fasta Ref         : https://raw.githubusercontent.com/nf-core/test-datasets/eager2/reference/Mammoth_MT_Krause.fasta
BAM Index Type    : CSI
Data Type         : Paired-End
Skip Collapsing   : No
Skip Trimming     : No
Output stripped fastq: No
Max Memory        : 6 GB
Max CPUs          : 2
Max Time          : 2d
Output dir        : ./results
Working dir       : /fast/users/a1222423/nf/work
Container Engine  : singularity
Container         : nfcore/eager:2.0.7
Current home      : /home/a1222423
Current user      : a1222423
Current path      : /home/a1222423/fastdir/nf
Script dir        : /home/a1222423/.nextflow/assets/nf-core/eager
Config Profile    : phoenix,test
Config Description: Minimal test dataset to check pipeline function
Config Contact    : Yassine Souilmi / Alexander Peltzer (@yassineS, @apeltzer)
Config URL        : https://www.adelaide.edu.au/phoenix/
----------------------------------------------------
[-        ] process > makeBWAIndex                   -
[-        ] process > makeFastaIndex                 -
[-        ] process > makeSeqDict                    -
[-        ] process > convertBam                     -
[-        ] process > fastqc                         -
[-        ] process > fastp                          -
[-        ] process > adapter_removal                -
[-        ] process > fastqc_after_clipping          -
[-        ] process > bwa                            -
[-        ] process > circulargenerator              -
[-        ] process > circularmapper                 -
[-        ] process > bwamem                         -
[-        ] process > samtools_flagstat              -
[-        ] process > samtools_filter                -
[-        ] process > strip_input_fastq              -
[-        ] process > samtools_flagstat_after_filter -
[-        ] process > dedup                          -
[-        ] process > preseq                         -
[-        ] process > damageprofiler                 -
[-        ] process > qualimap                       -
[-        ] process > markDup                        -
[-        ] process > pmdtools                       -
[-        ] process > bam_trim                       -
[-        ] process > output_documentation           -
[-        ] process > get_software_versions          -
[-        ] process > multiqc                        -
Pulling Singularity image docker://nfcore/eager:2.0.7 [cache /fast/users/a1222423/nf/work/singularity/nfcore-eager-2.0.7.img]
[nf-core/eager] Pipeline completed with errors
WARN: Singularity cache directory has not been defined -- Remote image will be stored in the path: /fast/users/a1222423/nf/work/singularity
Error executing process > 'output_documentation'

Caused by:
  Failed to pull singularity image
  command: singularity pull  --name nfcore-eager-2.0.7.img docker://nfcore/eager:2.0.7 > /dev/null
  status : 1
  message:
    WARNING: pull for Docker Hub is not guaranteed to produce the
    WARNING: same image on repeated pull. Use Singularity Registry
    WARNING: (shub://) to pull exactly equivalent images.
    ERROR: Image file exists, not overwriting.

2- I'm afraid that didn't work either:

nextflow run nf-core/eager -profile test --pairedEnd -with-singularity './nf-core-eager-2.0.7.simg' -r fixes-for-2.1.0
N E X T F L O W  ~  version 19.10.0
Launching `nf-core/eager` [nice_liskov] - revision: d8e8be8e40 [fixes-for-2.1.0]
----------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/eager v2.1.0dev
----------------------------------------------------
Pipeline Name     : nf-core/eager
Pipeline Version  : 2.1.0dev
Run Name          : nice_liskov
Reads             : data/*{1,2}.fastq.gz
Fasta Ref         : https://raw.githubusercontent.com/jfy133/test-datasets/eager/reference/Mammoth/Mammoth_MT_Krause.fasta
BAM Index Type    : CSI
Data Type         : Paired-End
Skipping FASTQC?  : No
Skipping AdapterRemoval?: No
Skip Read Merging : No
Skip Adapter Trimming: No
Running BAM filtering: No
Run Fastq Stripping: No
Skipping Mapping? : No
Skipping Preseq?  : No
Skipping Deduplication?: No
Skipping DamageProfiler?: No
Skipping Qualimap?: No
Run BAM Trimming? : No
Run PMDtools?     : No
Run Genotyping?   : No
Run MultiVCFAnalyzer: No
Max Memory        : 6 GB
Max CPUs          : 2
Max Time          : 2d
Output Dir        : ./results
Working Dir       : /fast/users/a1222423/nf/work
Container Engine  : singularity
Container         : ./nf-core-eager-2.0.7.simg
Current Home      : /home/a1222423
Current User      : a1222423
Current Path      : /home/a1222423/fastdir/nf
Script Dir        : /home/a1222423/.nextflow/assets/nf-core/eager
Config Profile    : test
Config Description: Minimal test dataset to check pipeline function
----------------------------------------------------
executor >  SLURM (10)
[1c/afcaf0] process > makeBWAIndex (Mammoth_MT_Krause.fasta)                                         [100%] 1 of 1 ✔
[7b/73b46c] process > makeFastaIndex (Mammoth_MT_Krause.fasta)                                       [100%] 1 of 1 ✔
[7b/57434e] process > makeSeqDict (Mammoth_MT_Krause.fasta)                                          [100%] 1 of 1 ✔
[-        ] process > convertBam                                                                     -
[-        ] process > indexinputbam                                                                  -
[ba/ec4f22] process > fastqc (JK2782_TGGCCGATCAACGA_L008)                                            [100%] 2 of 2 ✔
[-        ] process > fastp                                                                          -
[3e/e94180] process > adapter_removal (JK2802_AGAATAACCTACCA_L008)                                   [ 50%] 1 of 2 ✔
[1f/734ccc] process > fastqc_after_clipping (JK2802_AGAATAACCTACCA_L008_R1_001.fastq.gz.tengrand.fq) [  0%] 0 of 1 ✔
[aa/a080a5] process > bwa (JK2802_AGAATAACCTACCA_L008_R1_001.fastq.gz.tengrand.fq)                   [  0%] 0 of 1 ✔
[-        ] process > circulargenerator                                                              -
[-        ] process > circularmapper                                                                 -
[-        ] process > bwamem                                                                         -
[-        ] process > samtools_flagstat                                                              -
[-        ] process > samtools_filter                                                                -
[-        ] process > strip_input_fastq                                                              -
[-        ] process > samtools_flagstat_after_filter                                                 -
[-        ] process > dedup                                                                          -
[-        ] process > markDup                                                                        -
[-        ] process > preseq                                                                         -
[-        ] process > damageprofiler                                                                 -
[-        ] process > qualimap                                                                       -
[-        ] process > pmdtools                                                                       -
[-        ] process > bam_trim                                                                       -
[-        ] process > download_gatk_v3_5                                                             -
[-        ] process > genotyping_ug                                                                  -
[-        ] process > genotyping_hc                                                                  -
[-        ] process > genotyping_freebayes                                                           -
[-        ] process > multivcfanalyzer                                                               -
[-        ] process > sex_deterrmine                                                                 -
[-        ] process > nuclear_contamination                                                          -
[-        ] process > print_nuclear_contamination                                                    -
[98/b127c3] process > output_documentation                                                           [100%] 1 of 1 ✔
[-        ] process > get_software_versions                                                          -
[-        ] process > multiqc                                                                        -
Error executing process > 'adapter_removal (JK2782_TGGCCGATCAACGA_L008)'

Caused by:
  Process `adapter_removal (JK2782_TGGCCGATCAACGA_L008)` terminated with an error exit status (1)

Command executed:

  mkdir -p output
  AdapterRemoval --file1 JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.gz --file2 JK2782_TGGCCGATCAACGA_L008_R2_001.fastq.gz.tengrand.fq.gz --basename JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq --trimns --trimqualities --adapter1 AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC --adapter2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA --minlength 30 --minquality 20 --minadapteroverlap 1 --gzip --threads 1 --collapse

  #Combine files
  if [   = "--preserve5p" ] && [ N = "N" ]; then
    zcat *.collapsed.gz *.singleton.truncated.gz *.pair1.truncated.gz *.pair2.truncated.gz | gzip > output/JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.combined.fq.gz
  elif [   = "--preserve5p" ] && [ N = "Y" ] ; then
    zcat *.collapsed.gz | gzip > output/JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.combined.fq.gz
  elif [ N = "Y" ] ; then
    zcat *.collapsed.gz *.collapsed.truncated.gz | gzip > output/JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.combined.fq.gz
  else
    zcat *.collapsed.gz *.collapsed.truncated.gz *.singleton.truncated.gz *.pair1.truncated.gz *.pair2.truncated.gz | gzip > output/JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.combined.fq.gz
  fi

  mv *.settings output/

Command exit status:
  1

Command output:
  (empty)

Command error:
  Trimming paired end reads ...
  Opening FASTQ file 'JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.gz'
  ERROR: Unhandled exception in thread:
      line_reader::open: failed to open file ('No such file or directory')
  ERROR: AdapterRemoval did not run to completion;
         do NOT make use of resulting trimmed reads!

Work dir:
  /fast/users/a1222423/nf/work/c0/c2db32f1a351da9c708cc4a7a821d3

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

@apeltzer apeltzer changed the title workflow requires singularity images with 2 different names Singularity Run issue Oct 29, 2019
@apeltzer apeltzer added this to the Unclear Topics / Feature Requests milestone Feb 29, 2020
@apeltzer
Member

apeltzer commented Jun 9, 2020

@npavlovikj was kind enough to point us to another instance of this issue, and seems to have found the cause: a bug in Singularity itself, which will be fixed from Singularity 3.6 onwards.

See her bug report in nf-core/rnaseq, and her link to the original issue and fix in the Singularity project :-) 🎉

nf-core/rnaseq#427

apptainer/singularity#5151
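
Since the fix is reported for Singularity 3.6 onwards, a quick way to check an installation is to compare its version against that minimum. The following is a small sketch: `version_at_least` is a hypothetical helper that relies on GNU `sort -V`, and it assumes the version number is the last field printed by `singularity --version`.

```shell
#!/usr/bin/env bash
# Sketch only: succeed when a dotted version string is >= a minimum version.
version_at_least() {
    local have="$1" min="$2"
    # sort -V orders version strings; if the minimum sorts first, then have >= min.
    [ "$(printf '%s\n%s\n' "$min" "$have" | sort -V | head -n 1)" = "$min" ]
}

# Only query Singularity when it is actually installed on this machine.
if command -v singularity >/dev/null 2>&1; then
    installed="$(singularity --version | awk '{print $NF}')"
    if version_at_least "$installed" "3.6.0"; then
        echo "Singularity $installed is at or above 3.6.0"
    else
        echo "Singularity $installed is older than 3.6.0"
    fi
fi
```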

@jfy133
Member

jfy133 commented Jun 30, 2020

@yassineS @uoabowen I am doing a clean-up of issues: as no one else has reported this particular error with nf-core/eager, I am going to assume it is a Singularity configuration issue. Please see the message from Alex above about a possible solution. Feel free to open a new issue if you continue to have problems.
