Fix transcriptome staging issues on DNAnexus for rsem/prepareference #727
Comments
I am trying to reproduce this on AWS Batch at the moment but I suspect it will work there because our full-sized AWS tests work. Is this something you can help us to debug too please @GHAStVHenry? I'm sure @drejom would be happy to provide you with any info you need. |
@drejom would you mind dumping the contents of |
One way to debug this is to copy the .command.sh and .command.run and run them locally (provided you have |
I'm a bit stumped because the process that causes the error (e.g. SALMON_QUANT in the attached logs) only appears in the error message; there's no record of the job being submitted, so I can't retrieve the run folder or its contents. Not sure how to proceed. |
I found the problem...
I'm not familiar with
and
Is there a way to add to the RSEM_PREPAREREFERENCE process
@drpatelh can you add that |
Hi @GHAStVHenry ! Thanks for troubleshooting this! In principle, this sounds like a plausible explanation but I am a little confused as to how it is happening with the default parameters used by the pipeline:
So if I had to narrow it down based on your observation, I suspect that when rnaseq/modules/nf-core/modules/rsem/preparereference/main.nf Lines 60 to 65 in 964425e
it is somehow writing a file with the same name internally, but I still need to confirm this. |
Are you able to upload those files here, along with any timestamps, or check whether they differ at all, @GHAStVHenry?
|
The files have the same md5sums, so their contents are identical; let me know if you want me to upload them. There is a sequential write-time difference; the actual times are above. Hmmm, you are right re- default Actually, my first post put the blame firmly on rsem-prepare-reference; then I read the conditional for the sequential STAR and changed my mind, forgetting that that isn't even what happened in my test. I'm not familiar with rsem-prepare-reference, but it seems weird that it would write the same file twice... yet that seems to be what's happening. |
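The comparison described above (same file name, same md5sum, different write times) can be scripted. The following is a minimal sketch, not the commands used in this thread: the function names `md5sum` and `same_name_report` are hypothetical, and it assumes the files of interest have been copied under one local directory.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_name_report(root):
    """Map each file name that occurs more than once under `root` to a list of
    (path, md5, modification time) tuples, one per occurrence."""
    by_name = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_name[path.name].append(path)

    report = {}
    for name, paths in by_name.items():
        if len(paths) > 1:
            report[name] = [
                (str(p), md5sum(p), p.stat().st_mtime) for p in paths
            ]
    return report
```

Calling `same_name_report("some/local/copy")` (a placeholder path) returns only the duplicated names, so identical digests with differing modification times would match the situation reported here.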
Ok. Thanks! If they are the same then that may be less problematic otherwise we would have no way of picking which one to take (or telling NF to anyway). I have pushed a quick fix to the This just takes a hard-named file called Would you mind giving it a go with |
Alright, tried it... it kinda sorta worked... the
SALMON_QUANT's work folder didn't have the fasta in there... actually no inputs are there, but looking at other process work folders, it doesn't appear that inputs are saved/retained, so I can't confirm this. |
Alright, I think I understand the problem... it isn't rnaseq/modules/nf-core/modules/rsem/preparereference/main.nf Lines 25 to 28 in 964425e
outputs the transcripts fasta both in the I'm not sure which, but DNAnexus/Nextflow is capable of handling the multiple files of the same name as is evidenced by my test from above: I set up some tests, but I accidentally pushed my commits to the wrong remote: they went to the nf-core repo (tar branch) instead of my fork. If it doesn't work, or if you have a better way of fixing it, you can revert it. Will update with the result of the test... |
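The underlying hazard is two outputs sharing one base name, which can collide when both are staged into a single task directory. A minimal sketch of a check for that follows; it is an illustration only, `find_name_collisions` is a hypothetical helper, and the paths in the usage note are made up rather than taken from the pipeline.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_name_collisions(paths):
    """Group paths by base name and return only the names claimed by more
    than one path, i.e. the ones that would clash in a flat staging directory."""
    groups = defaultdict(list)
    for path in paths:
        groups[PurePosixPath(path).name].append(path)
    return {name: found for name, found in groups.items() if len(found) > 1}
```

For example, `find_name_collisions(["index/example.fa", "example.fa", "index/example.grp"])` reports `example.fa` with both of its locations and omits `example.grp`.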
Ok, after fighting with the glob output of EDIT: WORKED!!! |
Awesome! Great work! 🥳 Ok I pushed a last commit based on what you found. This is just so we don't mess with the default files generated by RSEM and pass them all along in the index. Would you mind running a last test using the |
The first test I tried was similar to that and didn't work... yours did though, at least it got past SALMON_QUANT, will update once it gets to the end! EDIT: WORKED!!! |
Worked for me too! Rippa!! Thanks @GHAStVHenry @drpatelh |
Rippa <- 🤣 Ok. Will leave this open until we properly push the fixes into the main pipeline. Thanks guys. |
Check Documentation
I have checked the following places for your error:
Description of the bug
Steps to reproduce
Steps to reproduce the behaviour:
When using the app v1.0.0-beta.6, the test profile runs successfully (
-profile test,docker -r tar --skip_bbsplit
), but I haven't managed a run otherwise. It fails pretty quietly somewhere around Star_Align. The log from that step shows some output files missing:
However, the log from the run shows an issue with SALMON_QUANT
The nextflow.log is stuck in an 'open' state, so I can't read, download, or attach it.
System