Nextflow pipeline for peak calling with MACS and IDR, with QC generation by deeptools. With the --saturation option, peaks are called repeatedly on increasing fractions of the total reads, ranging from 0.05 to 0.95.
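To illustrate the saturation idea, the sketch below lists read fractions and the approximate number of reads in each subset. It is not the pipeline's implementation: the 0.05 step, the total read count and the function name `saturation_subsets` are assumptions made for this example; only the 0.05 to 0.95 range comes from the description above.

```python
def saturation_subsets(total_reads, step_percent=5):
    """Return (fraction, read_count) pairs for fractions from 0.05 to 0.95.

    step_percent is a hypothetical step size; the pipeline may use another.
    """
    subsets = []
    # Work in integer percentages to avoid accumulating floating-point error.
    for percent in range(5, 96, step_percent):
        fraction = percent / 100
        subsets.append((fraction, total_reads * percent // 100))
    return subsets


# Hypothetical library of 2 million reads.
subsets = saturation_subsets(total_reads=2_000_000)
for fraction, reads in subsets:
    print(f"{fraction:.2f}\t{reads}")
```

Plotting the number of peaks called on each subset against the fraction of reads shows whether peak discovery levels off at the available sequencing depth.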
- Nextflow : for common installation procedures see the IARC-nf repository.
A conda recipe, as well as Docker and Singularity containers, are available with all the tools needed to run the pipeline (see "Usage").
Type | Description |
---|---|
--input_file | Tab-separated values (TSV) file with columns sample (sample name), tag (short name for figures), bam (BAM file path) and group (group). In chip mode, an input column is also required: 0 for normal samples and 1 for input samples |
Example:
sample | tag | bam | group | input |
---|---|---|---|---|
SAM015 | S15 | S15.bam | 1 | 0 |
SAM016 | S16 | S16.bam | 1 | 0 |
SAM010 | S10 | S10.bam | 1 | 1 |
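The input file shown above can also be written and checked programmatically. The following is a minimal sketch, not part of the pipeline: the helper names `check_rows` and `write_input_tsv` are hypothetical, and it assumes the file starts with a header row as in the example table. The rule that chip mode needs at least one row with input set to 1 is taken from the --mode description.

```python
import csv
import io

COLUMNS = ["sample", "tag", "bam", "group", "input"]


def check_rows(rows, mode="atac"):
    """Return a list of problems found in the rows (empty if none)."""
    # The input column is only required in chip mode.
    required = COLUMNS if mode == "chip" else COLUMNS[:4]
    problems = []
    for number, row in enumerate(rows, start=1):
        missing = [name for name in required if not row.get(name, "").strip()]
        if missing:
            problems.append(f"row {number}: missing {', '.join(missing)}")
        elif mode == "chip" and row["input"] not in ("0", "1"):
            problems.append(f"row {number}: input must be 0 or 1")
    if mode == "chip" and not any(row.get("input") == "1" for row in rows):
        problems.append("chip mode needs at least one input sample (input = 1)")
    return problems


def write_input_tsv(rows, handle):
    """Write the rows as tab-separated values with a header line."""
    writer = csv.DictWriter(
        handle, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


rows = [
    {"sample": "SAM015", "tag": "S15", "bam": "S15.bam", "group": "1", "input": "0"},
    {"sample": "SAM016", "tag": "S16", "bam": "S16.bam", "group": "1", "input": "0"},
    {"sample": "SAM010", "tag": "S10", "bam": "S10.bam", "group": "1", "input": "1"},
]

buffer = io.StringIO()
write_input_tsv(rows, buffer)
tsv_text = buffer.getvalue()
print(tsv_text)
print(check_rows(rows, mode="chip"))
```

To produce a real file, pass an open file handle (for example `open("input.tsv", "w", newline="")`) to `write_input_tsv` instead of the in-memory buffer.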
Name | Example value | Description |
---|---|---|
--ref | hg38 | Reference genome: hg19, hg38 or mm10 |
--gencode | gencode.bed | GENCODE annotation file |
Name | Default value | Description |
---|---|---|
--mode | atac | Pipeline mode: atac or chip. The chip mode requires at least one "input" sample |
--output_folder | bam2peaks | Output folder name |
--cpu | 16 | Number of CPUs |
--mem | 16 | Memory |
--extsize | 150 | MACS extsize: length to which reads are extended to obtain fixed-size fragments |
Flags are special parameters without value.
Name | Description |
---|---|
--help | Print usage and optional parameters |
--broad | Compute broad peaks instead of narrow peaks |
--ignoreDuplicates | Ignore duplicate reads |
--saturation | Run the saturation process |
To run the pipeline for ATAC, one can type:
nextflow run iarcbioinfo/bam2peaks-nf -r latest -profile singularity --input_file input.tsv --ref hg38 --gencode gencode.bed --output_folder output --ignoreDuplicates
To run the pipeline without Singularity, just remove "-profile singularity". Alternatively, one can run the pipeline using a Docker container (-profile docker) or the conda recipe containing all required dependencies (-profile conda).
To use the pipeline for ChIP-seq, set --mode chip:
nextflow run iarcbioinfo/bam2peaks-nf -r latest -profile singularity --input_file input.tsv --ref hg38 --gencode gencode.bed --output_folder output --mode chip --broad --extsize 320
Type | Description |
---|---|
bw/ | Outputs of bamCoverage in bigWig format |
Counts/ | With --saturation, the number of reads in each subset |
Peaks | Peaks computed by MACS |
Peaks_intersect | Peak intersections computed by IDR |
QCs | deeptools graphics |
Saturation_peaks | With --saturation, the peak files for each subset |
Name | Email | Description |
---|---|---|
Vincent Cahais | CahaisV@iarc.who.int | Developer to contact for support |
Claire Renard | Renardc@iarc.who.int | Developer |