Trim3p nextflex #386

Merged
merged 14 commits into from
Sep 7, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -5,6 +5,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## v2.4.0dev - 2024-XX-XX - X

- [[#365]](https://github.com/nf-core/smrnaseq/issues/365) by [[#386]](https://github.com/nf-core/smrnaseq/pull/386) - Fix Nextflex trimming support
- [[#332]](https://github.com/nf-core/smrnaseq/issues/332) by [[#361]](https://github.com/nf-core/smrnaseq/pull/361) - Fix documentation to use only single-end
- [[#349]](https://github.com/nf-core/smrnaseq/pull/349) - Fix [MIRTOP_QUANT conda issue](https://github.com/nf-core/smrnaseq/issues/347), change conda-base to conda-forge channel
- [[#350]](https://github.com/nf-core/smrnaseq/pull/350) - Fix [MIRTOP_QUANT conda issue](https://github.com/nf-core/smrnaseq/issues/347), set python version to 3.7 to fix pysam issue
26 changes: 25 additions & 1 deletion conf/modules.config
@@ -49,7 +49,7 @@ process {
ext.args = [ "",
params.trim_fastq ? "" : "--disable_adapter_trimming",
params.clip_r1 > 0 ? "--trim_front1 ${params.clip_r1}" : "", // Remove bp from the 5' end of read 1.
- params.three_prime_clip_r1 > 0 ? "--trim_tail1 ${params.three_prime_clip_r1}" : "", // Remove bp from the 3' end of read 1 AFTER adapter/quality trimming has been performed.
+ // params.three_prime_clip_r1 > 0 ? "--trim_tail1 ${params.three_prime_clip_r1}" : "", // Remove bp from the 3' end of read 1 AFTER adapter/quality trimming has been performed.
params.fastp_min_length > 0 ? "-l ${params.fastp_min_length}" : "",
params.fastp_max_length > 0 ? "--max_len1 ${params.fastp_max_length}" : "",
params.three_prime_adapter == "auto-detect" ? "" : "--adapter_sequence ${params.three_prime_adapter}"
@@ -73,6 +73,30 @@ process {
]
]
}
//
// FASTQ_FASTQC_UMITOOLS_FASTP
//
withName: '.*:FASTP3' {
ext.args = [ "",
"--disable_adapter_trimming",
"--disable_quality_filtering",
params.three_prime_clip_r1 > 0 ? "--trim_tail1 ${params.three_prime_clip_r1}" : "", // Remove bp from the 3' end of read 1 AFTER adapter/quality trimming has been performed.
params.fastp_min_length > 0 ? "-l ${params.fastp_min_length}" : "",
params.fastp_max_length > 0 ? "--max_len1 ${params.fastp_max_length}" : "",
].join(" ").trim()
publishDir = [
[
path: { "${params.outdir}/fastp/on_raw" },
mode: params.publish_dir_mode,
pattern: "*.{json,html}"
],
[
path: { "${params.outdir}/fastp/on_raw/log" },
mode: params.publish_dir_mode,
pattern: "*.log"
]
]
}
withName: '.*:FASTQ_FASTQC_UMITOOLS_FASTP:FASTQC_RAW' {
//the prefix is required for multiqc to pickup the files separately from the other fastqc instances
ext.prefix = { "${meta.id}.raw" }
33 changes: 33 additions & 0 deletions conf/test_no_genome_nextflex.config
@@ -0,0 +1,33 @@
/*
========================================================================================
Nextflow config file for running minimal tests
========================================================================================
Defines input files and everything required to run a fast and simple pipeline test.

Use as follows:
nextflow run nf-core/smrnaseq -profile test_no_genome_nextflex,<docker/singularity>

----------------------------------------------------------------------------------------
*/

params {
config_profile_name = 'Test profile'
config_profile_description = 'Minimal test dataset to check pipeline function'

// Limit resources so that this can run on GitHub Actions
max_cpus = 2
max_memory = '6.GB'
max_time = '6.h'

// Input data
input = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/samplesheet/v2.0/samplesheet_test_nextflex.csv'
mature = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/mature.fa'
hairpin = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/hairpin.fa'
mirna_gtf = 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/hsa.gff3'
mirtrace_species = 'hsa'

}

// Include nextflex config to run test without additional profiles

includeConfig 'protocol_nextflex.config'
42 changes: 42 additions & 0 deletions modules/local/trim3p.nf
@@ -0,0 +1,42 @@
process FASTP3 {
tag "$meta.id"
label 'process_medium'

conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/fastp:0.23.4--h5f740d0_0' :
'biocontainers/fastp:0.23.4--h5f740d0_0' }"

input:
tuple val(meta), path(reads)

output:
tuple val(meta), path('*fastp3.fastq.gz') , optional:true, emit: reads
tuple val(meta), path('*.json') , emit: json
tuple val(meta), path('*.html') , emit: html
tuple val(meta), path('*.log') , emit: log
path "versions.yml" , emit: versions

when:
task.ext.when == null || task.ext.when

script:
def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "${meta.id}"

"""
fastp \\
--in1 ${reads} \\
--out1 ${prefix}.fastp3.fastq.gz \\
--thread $task.cpus \\
--json ${prefix}.fastp3.json \\
--html ${prefix}.fastp3.html \\
$args \\
2> >(tee ${prefix}.fastp3.log >&2)

cat <<-END_VERSIONS > versions.yml
"${task.process}":
fastp: \$(fastp --version 2>&1 | sed -e "s/fastp //g")
END_VERSIONS
"""
}
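The `ext.args` list that `conf/modules.config` defines for `FASTP3` is collapsed into the single `$args` string used above via `.join(" ").trim()`. A rough Python sketch of the same assembly may help when checking which flags reach `fastp`. The function name `fastp3_args` and the parameter values in the usage line are hypothetical; only the flags themselves come from the diff.

```python
def fastp3_args(three_prime_clip_r1=0, fastp_min_length=0, fastp_max_length=0):
    """Mirror the FASTP3 ext.args list: two fixed flags plus optional ones.

    Hypothetical helper for illustration; the pipeline builds this string
    in Groovy inside conf/modules.config.
    """
    parts = [
        "",
        "--disable_adapter_trimming",   # adapters were already removed upstream
        "--disable_quality_filtering",  # this step should only clip and length-filter
        f"--trim_tail1 {three_prime_clip_r1}" if three_prime_clip_r1 > 0 else "",
        f"-l {fastp_min_length}" if fastp_min_length > 0 else "",
        f"--max_len1 {fastp_max_length}" if fastp_max_length > 0 else "",
    ]
    # Same as Groovy's .join(" ").trim(): join with spaces, strip the ends.
    return " ".join(parts).strip()


# Example values (assumed, not taken from the PR): clip 4 bp, keep reads >= 17 bp.
print(fastp3_args(three_prime_clip_r1=4, fastp_min_length=17))
# --disable_adapter_trimming --disable_quality_filtering --trim_tail1 4 -l 17
```

Options left at `0` contribute empty strings, so the joined result can contain runs of spaces in the middle; the shell ignores these when splitting the command line.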
2 changes: 1 addition & 1 deletion nextflow.config
@@ -247,7 +247,7 @@ profiles {
test_index { includeConfig 'conf/test_index.config' }
test_technical_repeats { includeConfig 'conf/test_technical_repeats.config' }
test_mirgenedb { includeConfig 'conf/test_mirgenedb.config' }

test_no_genome_nextflex { includeConfig 'conf/test_no_genome_nextflex.config' }


//Protocol specific profiles
9 changes: 9 additions & 0 deletions workflows/smrnaseq.nf
@@ -7,6 +7,7 @@ include { CAT_FASTQ } from '../modules/nf-core/cat/fastq/
include { CONTAMINANT_FILTER } from '../subworkflows/local/contaminant_filter'
include { FASTQC } from '../modules/nf-core/fastqc/main'
include { FASTQ_FASTQC_UMITOOLS_FASTP } from '../subworkflows/nf-core/fastq_fastqc_umitools_fastp'
include { FASTP3 } from '../modules/local/trim3p.nf'
include { FASTP as FASTP_LENGTH_FILTER } from '../modules/nf-core/fastp'
include { GENOME_QUANT } from '../subworkflows/local/genome_quant'
include { INDEX_GENOME } from '../modules/local/bowtie_genome'
@@ -110,6 +111,14 @@ workflow NFCORE_SMRNASEQ {

ch_fasta = params.fasta ? file(params.fasta): []
ch_reads_for_mirna = FASTQ_FASTQC_UMITOOLS_FASTP.out.reads
// Trim 3' end nucleotides after adapter is removed, otherwise they are not really trimmed
if (params.three_prime_clip_r1){
FASTP3(
ch_reads_for_mirna
)
ch_reads_for_mirna = FASTP3.out.reads
//trim_json = FASTP3.out.json
}

// even if bowtie index is specified, there still needs to be a fasta.
// without fasta, no genome analysis.
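The workflow comment on the new `FASTP3` call says the 3' bases must be trimmed after the adapter is removed, "otherwise they are not really trimmed". The toy Python sketch below illustrates why the order matters. It is not how fastp works internally: the helper names, the 12-base seed match, and the example sequences are all assumptions made for illustration.

```python
# Illustrative sequences only (assumed, not taken from the PR or its test data).
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"   # example 3' adapter
INSERT = "TAGCTTATCAGACTGATGTTGA"   # example small RNA insert
RANDOM_3P = "ACGT"                  # 4 random bases between insert and adapter


def remove_adapter(read, adapter=ADAPTER, seed=12):
    """Cut the read at the first match of the adapter's leading `seed` bases."""
    idx = read.find(adapter[:seed])
    return read if idx == -1 else read[:idx]


def clip_tail(read, n):
    """Drop n bases from the 3' end of the read."""
    return read[:-n] if n > 0 else read


read = INSERT + RANDOM_3P + ADAPTER

# Clip first: the 4 clipped bases come off the adapter, so the random bases stay.
clip_first = remove_adapter(clip_tail(read, 4))

# Adapter first: the clip now removes the 4 random bases next to the insert.
adapter_first = clip_tail(remove_adapter(read), 4)

print(clip_first == INSERT + RANDOM_3P)  # True
print(adapter_first == INSERT)           # True
```

Running the clip as a separate step after adapter removal, as `FASTP3` does on the output of `FASTQ_FASTQC_UMITOOLS_FASTP`, corresponds to the second ordering.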