Singularity Run issue #95
Comments
The issue still persists. I tried manually pulling the Singularity image, and I also tried the automated way. At first it works fine and downloads the image to work/singularity/. However, at some point during execution the image just disappears and the jobs start crashing.
|
This is something I've never seen before, to be honest. The image is set once for the entire pipeline and should contain all the required dependencies in one single image file. (see here: Line 14 in c44b881)
Can you provide your command to run? |
I don't understand it either, it's very strange. Here's the command:
|
I fear this is something with the PHOENIX cluster's way of loading modules. The containers are solely defined in the Local tests and on multiple HPC systems suggest this is more an issue of the way PHOENIX requires setting SINGULARITY_BINDPATHs... @pditommaso Maybe you have an idea what's going on here? |
@apeltzer Could you summarise the problem? |
Yes! Give me a few minutes! |
@yassineS is trying to use the pipeline to run on the Phoenix cluster. Apparently this cluster requires setting the eager/conf/acad-pheonix.config Line 10 in 4c8f7f3
However, when doing this, apparently the pipeline doesn't find the pulled image properly anymore:
I don't really understand what's going on there. I'm just assuming it has something to do with this BINDPATH not being the same inside the container, so that the image pulled by the main nextflow process is not found when it is used in the container? |
What's the value of the |
Yes, apparently it's defined: |
Hm when reading this through, I have the feeling that we can get rid of this eager/conf/acad-pheonix.config Line 11 in 4c8f7f3
|
I think so. Alternatively, you can try to specify that path using |
Pushed an update @yassineS - can you please check this again? According to @uoabowen's comment here, we could simply rely on the automount feature: #68 (comment) I removed the envWhitelist - can you please test again using the current |
Unfortunately, the newly pushed changes did not fix the issue. I still get:
@uoabowen, any clues what we might be doing wrong with singularity? |
@yassineS I just ran a Singularity image for OpenFOAM, a CFD application, under /fast/users/a1XXXXXX/eager/work/singularity/. It worked well. I tried to reproduce your issue, but there is no subdirectory named HC under /data/acad/. @apeltzer I have changed our singularity.conf to auto bind /fast and /data since #68. The test with OpenFOAM showed the auto bind should still work. When I get a test case for the pipeline, I will test and update this. |
Hi @uoabowen I masked the true path as these are protected human samples. You can use the screen session under my account in |
@yassineS I am testing it with your Screen session. The test case has been running for around 1 hour and I haven't met any error codes. I will update if any errors appear |
I hit the error after the test case had been running for around 2 hours.
When I went to the work dir and executed the commands as instructed, I got
And when I checked
|
When the workflow starts the image initially does exist under |
I tested it many times in the past few days. I found Besides using default I even tried to manually pull the image and copy it to the cachedir during the run, but as long as the pipeline hasn't finished, it will be removed. In addition, I tried the -with-singularity option to let the pipeline use an existing image. After several minutes, the existing image would still be removed. |
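To narrow down when the image is removed, one option is to poll the image path in the background while the pipeline runs and record a timestamp the moment it goes missing. The sketch below is only an illustration: `watch_image` is a hypothetical helper, not part of Nextflow or Singularity, and the polling interval is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Nextflow or Singularity):
# poll an image path and report when it disappears.
# Usage: watch_image PATH [INTERVAL_SECONDS] [MAX_CHECKS]
watch_image() {
    local image="$1"
    local interval="${2:-30}"
    local max_checks="${3:-0}"   # 0 means keep polling until the file is gone
    local checks=0
    while [ -e "$image" ]; do
        checks=$((checks + 1))
        if [ "$max_checks" -gt 0 ] && [ "$checks" -ge "$max_checks" ]; then
            echo "still present after $checks checks: $image"
            return 0
        fi
        sleep "$interval"
    done
    # Timestamp of the first poll at which the image was no longer there
    echo "$(date '+%Y-%m-%d %H:%M:%S') image missing: $image"
    return 1
}
```

Started as `watch_image /path/to/image.simg 10 &` just before `nextflow run`, the reported timestamp could then be compared against the entries in `.nextflow.log` to see which task or step was active when the file vanished.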
Hi @yassineS - should we maybe find another date to fix this together? I'd say this is doable but probably easier when we have a joint Skype session to see what's going on... send me an e-mail 👍 |
We still are, we just didn't look into it for a while. @uoabowen shall we kick off another round of tests? |
I’m fine to re-test it. Please remind me where’s the reproducer and related commands/scripts, as I cannot remember most of the details☹.
From: Yassine Souilmi <notifications@github.com>
Reply to: nf-core/eager <reply@reply.github.com>
Date: Tuesday, 15 October 2019 at 1:34 pm
To: nf-core/eager <eager@noreply.github.com>
Cc: Bowen Chen <bowen.chen@adelaide.edu.au>, Mention <mention@noreply.github.com>
Subject: Re: [nf-core/eager] workflow requires singularity images with 2 different names (#95)
|
We should start from a fresh config. Please feel free to use my fast directory space.
Yassine
--
Best regards,
—
Yassine Souilmi, PhD, MS,
ARC Postdoctoral Research Associate
Australian Centre for Ancient DNA <http://adelaide.edu.au/acad>
University of Adelaide
Darling Building, Room 205b
North Terrace Campus, Adelaide SA 5005
+61 83138242
|
The issue is still there. When it proceeded to some point, the singularity image was deleted.
[a1234567@l01 eager]$ nextflow run nf-core/eager -profile test --pairedEnd -c phoenix241019.conf -with-singularity /data/acad/singularity/cache/eager-2.0.7.simg
N E X T F L O W ~ version 19.10.0
Launching `nf-core/eager` [prickly_kalman] - revision: b8d3dec [master]
----------------------------------------------------
,--./,-.
___ __ __ __ ___ /,-._.--~'
|\ | |__ __ / ` / \ |__) |__ } {
| \| | \__, \__/ | \ |___ \`-._,-`-,
`._,._,'
nf-core/eager v2.0.7
----------------------------------------------------
Pipeline Name : nf-core/eager
Pipeline Version : 2.0.7
Run Name : prickly_kalman
Reads : data/*{1,2}.fastq.gz
Fasta Ref : https://raw.githubusercontent.com/nf-core/test-datasets/eager2/reference/Mammoth_MT_Krause.fasta
BAM Index Type : CSI
Data Type : Paired-End
Skip Collapsing : No
Skip Trimming : No
Output stripped fastq: No
Max Memory : 125 GB
Max CPUs : 32
Max Time : 2d
Output dir : ./results
Working dir : /fast/users/a1234567/nf/eager/work
Container Engine : singularity
Container : /data/acad/singularity/cache/eager-2.0.7.simg
Current home : /home/a1234567
Current user : a1234567
Current path : /fast/users/a1234567/nf/eager
Script dir : /home/a1234567/.nextflow/assets/nf-core/eager
Config Profile : test
Config Description: Minimal test dataset to check pipeline function
----------------------------------------------------
executor > slurm (12)
[6b/fb1bdf] process > makeBWAIndex (Mammoth_MT_Krause.fasta) [100%] 1 of 1 ✔
[df/836818] process > makeFastaIndex (Mammoth_MT_Krause.fasta) [100%] 1 of 1 ✔
[bb/f38cad] process > makeSeqDict (Mammoth_MT_Krause.fasta) [100%] 1 of 1 ✔
[- ] process > convertBam -
[53/c1a419] process > fastqc (JK2785_TGGCCGATCAACGA_L008) [100%] 2 of 2 ✔
[- ] process > fastp -
[a4/188323] process > adapter_removal (JK2785_TGGCCGATCAACGA_L008) [100%] 2 of 2 ✔
[f5/5bd7b8] process > fastqc_after_clipping (JK2785_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq) [100%] 2 of 2, failed: 2 ✘
[8d/dc6b58] process > bwa (JK2785_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq) [100%] 2 of 2, failed: 2 ✘
[- ] process > circulargenerator -
[- ] process > circularmapper -
[- ] process > bwamem -
[- ] process > samtools_flagstat -
[- ] process > samtools_filter -
[- ] process > strip_input_fastq -
[- ] process > samtools_flagstat_after_filter -
[- ] process > dedup -
[- ] process > preseq -
[- ] process > damageprofiler -
[- ] process > qualimap -
[- ] process > markDup -
[- ] process > pmdtools -
[- ] process > bam_trim -
[9e/008fbb] process > output_documentation [100%] 1 of 1 ✔
[- ] process > get_software_versions -
[- ] process > multiqc -
Execution cancelled -- Finishing pending tasks before exit
[nf-core/eager] Pipeline completed with errors
Error executing process > 'fastqc_after_clipping (JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq)'
Caused by:
Process `fastqc_after_clipping (JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq)` terminated with an error exit status (255)
Command executed:
fastqc -q JK2782_TGGCCGATCAACGA_L008_R1_001.fastq.gz.tengrand.fq.combined.fq.gz
Command exit status:
255
Command output:
(empty)
Command error:
ERROR : Image path /data/acad/singularity/cache/eager-2.0.7.simg doesn't exist: No such file or directory
ABORT : Retval = 255
Work dir:
/fast/users/a1234567/nf/eager/work/0e/cef4ab51625398d1941193105ac953
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
Bowen Chen
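When replaying a failed task as the tip at the end of the log suggests, it may help to first confirm whether the image is actually on disk, so that a missing image is not confused with a genuine task failure. A minimal sketch, assuming a hypothetical `check_image` helper (the name is illustrative, not an existing tool):

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check before replaying a failed task.
check_image() {
    local image="$1"
    if [ -f "$image" ]; then
        echo "image present: $image"
    else
        echo "image missing: $image" >&2
        return 1
    fi
}

# Usage sketch, with the paths taken from the log above:
#   cd /fast/users/a1234567/nf/eager/work/0e/cef4ab51625398d1941193105ac953
#   check_image /data/acad/singularity/cache/eager-2.0.7.simg && bash .command.run
```

If the check fails, the task would fail with the same "Image path ... doesn't exist" error regardless of the tool being run, which points at the image location rather than at the process itself.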
|
Do you have the same issue when running other nf-core/nextflow pipelines? If yes, maybe we can bump the issue to the wider nf-core community? Also, does the same issue occur if you specify the singularity profile (alongside test) in addition to I also see that you specify a custom nextflow config. Is the |
@jfy133 I tested I tested EAGER with |
@yassineS could you supply the command you used for deepvariant? I wonder if If you're not already, maybe you could go to the nf-core slack and join #eager, and describe exactly your setup and the commands you've tried. I think we need singularity/infrastructure experts... |
Normally the Would be great to see you in the nf-core slack and in channel #eager to discuss further there. It's much faster to debug and resolve that way. Pretty sure it's just an overriding of variables now and we'll get there quickly. Running the other deepvariant pipeline uses the centralized nf-core/configs, so if you just use eager this way:
nextflow pull nf-core/eager
nextflow run nf-core/eager -profile phoenix,test
it should then also work. Don't add options, just that above, and let's see what kind of error that creates :-) |
In addition: Maybe start with a fresh EAGER install: And please give us your entire log file |
@jfy133 that's exactly what I did, I removed all old instances of eager and even freshly re-installed nextflow. I didn't use the
First, the test doesn't run fully. Second, when I try to run through other test data:
I attached the log file. |
(is there a reason why you are not using the phoenix config? It would help us if you were using it, as then Alex and I exactly know what is going on with the configs you are using)
Could you try:
(this is our 'edge' version, which has a lot of bug fixes and changes).
I realise now this is mis-documented, and I will go fix that now. |
Note: I set up nf-core Slack, I'm Yassine Souilmi on there. Alright, I tried what you suggested above: 1- This crashes right away:
2- I'm afraid that didn't work either:
|
@npavlovikj kindly pointed us to another instance of this issue and seems to have found the cause, which was in Singularity and will be fixed from Singularity 3.6 onward. See her bug report in nf-core/rnaseq and also her link to the original issue + fix in the Singularity project :-) 🎉 |
@yassineS @uoabowen I am doing a clean-up of issues: as we've not had anyone else reporting this particular error with nf-core/eager, I am going to assume this is a Singularity configuration issue. Please see the message from Alex above about a possible solution. Feel free to open a new issue if you continue to have problems. |
Description
When running the workflow on Phoenix using the phoenix profile, it starts fine and automatically downloads the prebuilt Singularity image to work/singularity/nf-core-eager.img. However, some jobs require the image to have a different name: work/singularity/nfcore-eager-latest.img. I worked around that by copying the image and having it available with both names.
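The copy workaround described here could be scripted roughly as follows. This is a sketch only: `alias_image` is a hypothetical helper name, and the two file names are the ones quoted in the description.

```shell
#!/usr/bin/env bash
# Hypothetical helper: make the pulled image available under a second name.
alias_image() {
    local src="$1"
    local dst="$2"
    if [ ! -f "$src" ]; then
        echo "source image not found: $src" >&2
        return 1
    fi
    # cp -p preserves mode and timestamps; do nothing if the copy already exists
    [ -e "$dst" ] || cp -p "$src" "$dst"
}

# Usage sketch, with the names from the description:
#   alias_image work/singularity/nf-core-eager.img \
#               work/singularity/nfcore-eager-latest.img
```

A copy doubles the disk space used by the image. A symbolic link might serve the same purpose, but whether the container runtime on a given cluster resolves it correctly would need to be checked.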
Error: