famosab/wrrocmetatest

Introduction

famosab/wrrocmetatest is a bioinformatics pipeline that runs a minimal working example (fastp + megahit) on a small metagenome data set containing simulated Illumina reads from 15 microbial genomes to test the nf-prov plugin for metagenomics.

This pipeline was created as an example Nextflow pipeline for developing and testing the nf-prov Nextflow plugin. More information can be found in the BioHackathon Europe report.

Using the pipeline to test the nf-prov Nextflow plugin

The test data is a small metagenome data set containing simulated Illumina reads from 15 microbial genomes.

To run the pipeline locally on the test data, download the fq files and create the following samplesheet:

testsheet.csv:

sample,fastq_1,fastq_2
test,<path/to/>read1.fq.gz,<path/to/>read2.fq.gz
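
If you prefer to generate the samplesheet rather than write it by hand, the following is a minimal sketch. It assumes only the three-column header shown above; the helper name `samplesheet_text` and the example paths are hypothetical and not part of the pipeline.

```python
import csv
import io


def samplesheet_text(rows):
    """Return samplesheet CSV text for (sample, fastq_1, fastq_2) tuples."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header as used by the samplesheet shown above.
    writer.writerow(["sample", "fastq_1", "fastq_2"])
    for sample, fastq_1, fastq_2 in rows:
        writer.writerow([sample, fastq_1, fastq_2])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical location of the downloaded reads; replace with your own paths.
    rows = [("test", "/data/reads/read1.fq.gz", "/data/reads/read2.fq.gz")]
    with open("testsheet.csv", "w", newline="") as handle:
        handle.write(samplesheet_text(rows))
```

Using absolute paths in the rows keeps the sheet valid regardless of the directory from which Nextflow is launched.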

Then add the following config file:

testdata.config:

process {
    withName: FASTP {
        cpus = 8
        memory = 16.GB
    }
    withName: MEGAHIT {
        cpus = 8
        memory = 16.GB
        ext.args = { "--k-min 51 --k-max 71 --k-step 20" }
    }
}

// add the following lines to run it with nf-prov
plugins {
    id 'nf-prov@1.1.0'
}

prov {
    enabled = true
    formats {
        wrroc {
            file = "${params.outdir}/ro-crate-metadata.json"
            overwrite = true
            agent {
                name = "John Doe"
                orcid = "https://orcid.org/0000-0000-0000-0000"
            }
            license = "https://spdx.org/licenses/MIT"
            profile = "provenance_run_crate"
        }
    }
}

Then run the workflow with the following command:

nextflow run <path/to/>wrrocmetatest/main.nf -profile docker --input <path/to/>testsheet.csv --outdir results -c <path/to/>testdata.config

Depending on your resources, this test run takes around 5 minutes.
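
Once the run has finished, the crate written to the path configured above (`results/ro-crate-metadata.json` with `--outdir results`) can be given a quick sanity check. The sketch below assumes the file follows the usual RO-Crate JSON-LD layout, that is, a top-level `@graph` list whose entities carry an `@type`; the helper names are hypothetical, and the exact entity types nf-prov emits are not asserted here.

```python
import json
from collections import Counter


def count_entity_types(crate):
    """Count the @type values of all entities in an RO-Crate metadata document."""
    counts = Counter()
    for entity in crate.get("@graph", []):
        types = entity.get("@type", [])
        # "@type" may be a single string or a list of strings.
        if isinstance(types, str):
            types = [types]
        counts.update(types)
    return counts


def load_crate(path):
    """Load an ro-crate-metadata.json file from the given path."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Assumes the pipeline was run with --outdir results as in the command above.
    summary = count_entity_types(load_crate("results/ro-crate-metadata.json"))
    for entity_type, count in summary.most_common():
        print(f"{entity_type}: {count}")
```

An empty summary suggests that the plugin did not write a crate or that the file has a different structure than assumed.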

Installation of the nf-prov plugin

Clone the repository containing the current working version of the plugin to your local machine; in our case this is famosab/nf-prov. Check out the relevant branch, here workflow-run-crate, and then run make install. This adds nf-prov to .nextflow/plugins/, from which it can be used with any pipeline.
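
The three installation steps can be scripted as in the sketch below. The repository URL is an assumption (the text only names the repository as famosab/nf-prov), and the function names and the `clone_dir` default are hypothetical; `git` and `make` must be available on your PATH.

```python
import subprocess

# Assumption: the fork is hosted on GitHub under this URL.
REPO_URL = "https://github.com/famosab/nf-prov.git"
BRANCH = "workflow-run-crate"


def install_commands(clone_dir="nf-prov"):
    """Return (command, working directory) pairs for clone, checkout and install."""
    return [
        (["git", "clone", REPO_URL, clone_dir], None),
        (["git", "checkout", BRANCH], clone_dir),
        (["make", "install"], clone_dir),
    ]


def install(clone_dir="nf-prov"):
    """Run the steps in order; check=True stops at the first failing command."""
    for command, cwd in install_commands(clone_dir):
        subprocess.run(command, cwd=cwd, check=True)


if __name__ == "__main__":
    install()
```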

Usage

Note

If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

First, prepare a samplesheet with your input data that looks as follows:

samplesheet.csv:

sample,fastq_1,fastq_2
CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz

Each row represents a pair of fastq files (paired end).
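
Because every row must name both files of a pair, it can help to check the sheet before launching a run. The following is a minimal sketch, assuming that the three columns shown above are all required and must be non-empty; the helper `find_samplesheet_problems` is hypothetical and does not replace the pipeline's own input validation.

```python
import csv
import io

REQUIRED_COLUMNS = ["sample", "fastq_1", "fastq_2"]


def find_samplesheet_problems(text):
    """Return a list of human-readable problems found in samplesheet CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return ["missing column(s): " + ", ".join(missing)]
    problems = []
    # Data rows start on line 2 because line 1 holds the header.
    for line_number, row in enumerate(reader, start=2):
        for column in REQUIRED_COLUMNS:
            # Short rows yield None for the absent fields.
            if not (row[column] or "").strip():
                problems.append(f"line {line_number}: empty {column}")
    return problems


if __name__ == "__main__":
    with open("samplesheet.csv", newline="") as handle:
        for problem in find_samplesheet_problems(handle.read()):
            print(problem)
```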

Now, you can run the pipeline using:

nextflow run famosab/wrrocmetatest \
   -profile <docker/singularity/.../institute> \
   --input samplesheet.csv \
   --outdir <OUTDIR>

Warning

Please provide pipeline parameters via the CLI or Nextflow -params-file option. Custom config files including those provided by the -c Nextflow option can be used to provide any configuration except for parameters; see docs.

Credits

famosab/wrrocmetatest was originally written by Famke Bäuerle, Tom Tubbesing, Keiler Collier, Matt Burridge, Benedikt Osterholz, Alex Sczyrba, Neil Wipat and Sandy Rogers during the BioHackathon Europe 2024.

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

Citations

This pipeline uses code and infrastructure developed and maintained by the nf-core community, reused here under the MIT license.

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.