User provided --three_prime_adapter is not recognised #41

Closed
sirselim opened this issue Feb 24, 2020 · 15 comments · Fixed by #55
Labels
bug Something isn't working

Comments

@sirselim
Contributor

When running the pipeline on several small in-house RNA-seq data sets, no miRNAs are detected. The trimmed files are all on the order of ~80 KB, and the trimming logs show the cause: the adapter being used isn't the one provided by the user.

This is the command being used:

nextflow run nf-core/smrnaseq -r 1.0.0 --reads 'fastq/*.fastq.gz' -profile conda \
  --genome 'GRCh37' --saveReference -resume --min_length 17 \
  --three_prime_adapter AGATCGGAAGAGC

However the adapter being reported in the trimming output is TGGAATTCTCGGGTGCCAAGG - the adapter that is defined in the illumina protocol.

When I clone the pipeline and hack the illumina protocol adapter to be the one we want to use for trimming, everything works as expected:

nextflow run /tmp/smrnaseq/main.nf --reads 'fastq/*.fastq.gz' \
  -profile conda --protocol illumina --genome 'GRCh37' \
  --saveReference -resume --min_length 17

So for some reason the 'custom' user-defined adapter parameter isn't being assigned, or the illumina protocol is taking precedence. I haven't had time to dig into this any further but am happy to do some more testing if required.

@lpantano
Contributor

Hmm, can you try with the original code and --protocol custom --three_prime_adapter AGATCGGAAGAGC? I think the protocol defaults to illumina, so the adapter is ignored unless the protocol is set to something else. Let us know how it goes.
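
A minimal sketch of the precedence described here, assuming the illumina preset overwrites any user-supplied adapter with TGGAATTCTCGGGTGCCAAGG (the sequence reported in the trimming log above) and that only --protocol custom lets the user's value through. The function name resolve_adapter is hypothetical and this is not the pipeline's actual code; the other protocols are left out because their adapters are not given in this thread.

```shell
# Hypothetical helper illustrating the suspected behaviour, not pipeline code.
resolve_adapter() {
    protocol="$1"
    user_adapter="$2"
    case "$protocol" in
        illumina)
            # The preset wins: the user-supplied adapter is ignored.
            printf '%s\n' "TGGAATTCTCGGGTGCCAAGG"
            ;;
        custom)
            # Only here does the user-supplied adapter take effect.
            printf '%s\n' "$user_adapter"
            ;;
        *)
            echo "protocol '$protocol' is not covered by this sketch" >&2
            return 1
            ;;
    esac
}

resolve_adapter illumina AGATCGGAAGAGC   # prints the preset adapter
resolve_adapter custom AGATCGGAAGAGC     # prints the user adapter
```

If this reading is right, passing --three_prime_adapter without --protocol custom has no effect, which would match the trimming log in the original report.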

@sirselim
Contributor Author

hmm, I thought that had fixed it and had started making a pull request to suggest adding that information to the documentation (there is no indication that custom is an option). However, when I checked back in on the pipeline it had exited with errors.

Upon inspection it seems that --protocol custom isn't an option that is accepted by miRtrace:

    -p, --protocol         One of the following (read structure schematic in parens):
                               illumina (miRNA--3'-adapter--index) [DEFAULT]
                               qiaseq (miRNA--3'-adapter--UMI--3'-adapter--index)
                                   NOTE: Only the first (leftmost) 3' adapter should be specified.
                               cats (NNN--miRNA--poly-A--3'-adapter--index)
                                   NOTE: It's not possible to specify an adapter for -p cats.
                               nextflex (NNNN--miRNA--NNNN--3'-adapter--index)

I haven't come across miRtrace before (nice tool though), so I am unsure of its inner workings. The miRtrace documentation states that -p/--protocol is an optional argument, so I tested removing that line from STEP 7 in main.nf:

/*
 * STEP 7 - miRTrace
 */
process mirtrace {
     tag "$reads"
     publishDir "${params.outdir}/miRTrace", mode: 'copy'
      
     input:
     file reads from raw_reads_mirtrace.collect()

     output:
     file '*mirtrace' into mirtrace_results

     script:
     // cats does not accept an adapter, so pass --adapter only for the other protocols
     primer = (protocol=="cats") ? " " : " --adapter $three_prime_adapter "
     // NOTE: the "--protocol $protocol" argument has been removed from the mirtrace call below
     """
     for i in $reads
     do
         path=\$(realpath \${i})
         prefix=\$(echo \${i} | sed -e "s/.gz//" -e "s/.fastq//" -e "s/.fq//" -e "s/_val_1//" -e "s/_trimmed//" -e "s/_R1//" -e "s/.R1//")
         echo \$path","\$prefix
     done > mirtrace_config

     mirtrace qc \\
         --species $params.mirtrace_species \\
         $primer \\
         --config mirtrace_config \\
         --write-fasta \\
         --output-dir mirtrace \\
         --force
     """
 }

Running with the above modification and --protocol custom worked and completed as expected.

I am happy to make a pull request but thought it might be worth highlighting this first in case it has unexpected consequences.

@lpantano
Contributor

lpantano commented Feb 26, 2020 via email

@sirselim
Contributor Author

I'm not at all familiar with Nextflow, so I am probably doing something wrong, but your suggestion didn't work (most likely a syntax issue on my end):

N E X T F L O W  ~  version 20.01.0
Launching `/store/mbenton/smrnaseq/main.nf` [voluminous_swirles] - revision: 47b2ec65ce
WARN: It appears you have never run this project before -- Option `-resume` is ignored
Script compilation error
- file : /store/mbenton/smrnaseq/main.nf
- cause: expecting ')', found 'or' @ line 717, column 33.
        primer = (protocol==“cats” or protocol==“custom”) ? " " : " --adapter $three_prime_adapter "
                                   ^
1 error

I still struggle to see how the suggested fix would pass through miRtrace, since "custom" won't be recognised as a protocol, right?

@lpantano
Contributor

Sorry, you are right: we need a new line that passes the protocol option to mirtrace only when the protocol is not custom. I need to test it, but something like this:

protocol_opt = (protocol=="custom") ? " " : " --protocol $protocol "

and then:

mirtrace qc \\
         --species $params.mirtrace_species \\
         $primer \\
         $protocol_opt \\
         --config mirtrace_config \\
         --write-fasta \\
         --output-dir mirtrace \\
         --force
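
The same conditional logic can be sketched in plain shell to check which options end up on the mirtrace command line for each protocol. This is only an illustration under the assumptions stated in this thread (cats takes no adapter; custom is a pipeline-level value that mirtrace itself rejects); the function name mirtrace_opts is hypothetical.

```shell
# Sketch only: mirrors the primer / protocol_opt ternaries above.
mirtrace_opts() {
    protocol="$1"
    adapter="$2"
    opts=""
    # cats does not accept an adapter (see the mirtrace help text above).
    if [ "$protocol" != "cats" ]; then
        opts="$opts --adapter $adapter"
    fi
    # "custom" is not a mirtrace protocol, so leave --protocol out entirely.
    if [ "$protocol" != "custom" ]; then
        opts="$opts --protocol $protocol"
    fi
    # Strip the leading space before printing.
    printf '%s\n' "${opts# }"
}

mirtrace_opts custom AGATCGGAAGAGC
mirtrace_opts illumina TGGAATTCTCGGGTGCCAAGG
mirtrace_opts cats ""
```

With custom, only --adapter is emitted, so mirtrace falls back to its own default protocol (illumina, according to its help text).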

@sirselim
Contributor Author

Nice, I made those changes and have run a test set through - everything seems to be working as expected. Would you like me to make a pull request or would you prefer to do it and do some more testing?

@lpantano
Contributor

Yes, do the PR please. It will take some time to merge because we need to update the TEMPLATE branch to the latest version, but it is good to have the PR there. The tests will probably fail, but not because of your changes. :)

@sirselim
Contributor Author

sirselim commented Mar 1, 2020

OK, done - see pull request #42.

I also added a little more documentation making it clearer that --protocol custom is an option.

I think there is another 'issue' that needs addressing in terms of maybe just clearer documentation around user provided arguments. It's not clear that when using a defined --protocol other trimming arguments are not evaluated, i.e. if using --protocol illumina and also providing a different 3' adapter, the user defined adapter is ignored. Is this the desired behavior and that only --protocol custom should allow user defined arguments? If so then I believe that the documentation and maybe log/terminal messages should reflect this. Thoughts? Happy to open a new issue for further discussion, I just thought it might also fit within the current context. :)

@ewels ewels added the bug Something isn't working label Nov 7, 2020
ewels added a commit to ewels/nf-core-smrnaseq that referenced this issue Nov 7, 2020
@ewels
Member

ewels commented Nov 7, 2020

PR merged to dev: #42

Change also included in the bigger update PR: #55

@ewels ewels linked a pull request Nov 7, 2020 that will close this issue
6 tasks
@ewels ewels closed this as completed Nov 11, 2020
@JRodrigoF

I still get an error using the option "--protocol custom" when trying to use specific adapter sequences.

N E X T F L O W ~ version 21.04.3
singularity version 3.5.3
revision: 03333bf [1.1.0]

The following runs with no problem at all:
nextflow run nf-core/smrnaseq -r 1.1.0 -profile test,singularity

However the following stops due to an error:
nextflow run nf-core/smrnaseq -r 1.1.0 --input '~/test_custom/data/*.fq.gz' -profile singularity --genome 'GRCh37' --protocol 'custom' --three_prime_adapter 'AGATCGGAAGAGCACACGTCT'

I would also like to be able to supply multiple adapter sequences to be passed to cutadapt
(e.g. --three_prime_adapter GATCGGAAGAGCACACGTCTGAACTCCAGTCAC --three_prime_adapter GAGCACACGTCTGAACTCCAGTCAC),

but I am not succeeding even with one.

My workaround so far is to remove the adapters with cutadapt myself and then pass the already trimmed fastq files to smrnaseq v1.1.0 with no protocol option (default settings). Getting this option to work would be great, though. The error is:

Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'mirtrace
Process mirtrace .. terminated with an error exit status (255)
java -Xms45097156608 -Xmx45097156608 -jar $mirtracejar/mirtrace.jar --mirtrace-wrapper-name mirtrace qc
--species hsa
--adapter AGATCGGAAGAGCACACGTCT
--protocol custom
--config mirtrace_config
--write-fasta
--output-dir mirtrace
--force

Command error:
EXAMPLE CONFIG FILE:
path/sample1.fastq,sample 1 (control),TGGAATTC
path/sample2.fastq,sample 2 (+drug X),TGGAATTC

ERROR: Invalid --protocol argument: custom

@lpantano
Contributor

lpantano commented Aug 9, 2021

Thank you for reporting. I see this is actually a code issue. The problem is that I don't know how that tool will behave in this case. Do you know what the command line should look like for your case? Then I can try to work out the logic in the code.

@JRodrigoF

JRodrigoF commented Aug 12, 2021

Yes. I use cutadapt directly. I actually tune several parameters a bit, but my basic command line would look like:

cutadapt --adapter AGATCGGAAGAGCACACGTCT --adapter GATCGGAAGAGCACACGTCTGAACTCCAGTCAC -q 10,10 --minimum-length 15 <input.fq.gz>

(cutadapt can take more than one custom adapter sequence and trims whichever is present.)
According to the TrimGalore documentation, the line below would be the equivalent:

trim_galore --adapter AGATCGGAAGAGCACACGTCT --adapter GATCGGAAGAGCACACGTCTGAACTCCAGTCAC --quality 10 --length 15 <input.fq.gz>
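
As a rough sketch of how a list of adapters could be expanded into repeated --adapter flags, the helper below prints (rather than runs) a cutadapt command with the same settings as above. The function name build_cutadapt_cmd and the file names are hypothetical, and -o is cutadapt's output option, added here only so the command is complete.

```shell
# Sketch: print a cutadapt command with one --adapter flag per sequence.
build_cutadapt_cmd() {
    input="$1"
    output="$2"
    shift 2
    cmd="cutadapt"
    # Every remaining argument is treated as a 3' adapter sequence.
    for seq in "$@"; do
        cmd="$cmd --adapter $seq"
    done
    printf '%s\n' "$cmd -q 10,10 --minimum-length 15 -o $output $input"
}

build_cutadapt_cmd input.fq.gz trimmed.fq.gz \
    AGATCGGAAGAGCACACGTCT GATCGGAAGAGCACACGTCTGAACTCCAGTCAC
```

Printing the command first makes it easy to inspect the flags before running it on real data.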

@lpantano
Contributor

Thank you, that is good to know. Could you find a command line for mirtrace that will work with your trimmed files?

@klkeys
Contributor

klkeys commented Oct 6, 2021

late to the convo, but to shoehorn @JRodrigoF's case through mirtrace you might get away with --protocol illumina and --adapter INSERTYOURADAPTERSEQHERE if your reads follow the Illumina sequencing structure (see page 17 here)

Illumina adapters clip directly adjacent to their inserts, so if your kit's adapters do the same then they are structurally identical to the Illumina protocol, save for the different 3' adapter sequence
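
A small sketch of that workaround, assuming the reads really do follow the Illumina read structure: the pipeline-level value custom is translated to illumina before it reaches mirtrace, and the user's adapter is passed unchanged. The helper name mirtrace_protocol is hypothetical; the mirtrace options are the ones shown in the failing command earlier in this thread, and the command is only printed, not executed.

```shell
# Sketch: map the pipeline's "custom" protocol onto one that mirtrace accepts.
mirtrace_protocol() {
    case "$1" in
        # Only valid if the kit's read structure matches the Illumina layout.
        custom) printf '%s\n' "illumina" ;;
        *)      printf '%s\n' "$1" ;;
    esac
}

adapter="AGATCGGAAGAGCACACGTCT"
printf '%s\n' "mirtrace qc --species hsa --adapter $adapter --protocol $(mirtrace_protocol custom) --config mirtrace_config --write-fasta --output-dir mirtrace --force"
```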

@keenhl

keenhl commented Mar 31, 2022

New here, and having the same problems.

My quick workaround was just to edit the main.nf file (see below) and replace TGGAATTCTCGGGTGCCAAGG with my adapter sequence. There's probably a better way to do this.

if (params.protocol == "illumina"){
    clip_r1 = 0
    three_prime_clip_r1 = 0
    three_prime_adapter = "TGGAATTCTCGGGTGCCAAGG"

nschcolnicov pushed a commit that referenced this issue Oct 10, 2024
Important! Template update for nf-core/tools v2.4