Merge pull request #42 from nf-core/readd_quantification
Add alignment based quantification with Salmon
pinin4fjords authored Apr 8, 2024
2 parents 9f2403f + 5560996 commit fda4028
Showing 39 changed files with 3,475 additions and 11 deletions.
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -17,7 +17,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - [#35](https://github.com/nf-core/riboseq/pull/35) - Sortmerna: index once ([@pinin4fjords](https://github.com/pinin4fjords), review by [@maxulysse](https://github.com/maxulysse))
 - [#36](https://github.com/nf-core/riboseq/pull/36) - Bump bbsplit module to prevent index overwrites ([@pinin4fjords](https://github.com/pinin4fjords), review by [@maxulysse](https://github.com/maxulysse))
 - [#38](https://github.com/nf-core/riboseq/pull/38) - Important! Template update for nf-core/tools v2.13.1 ([@nf-core-bot](https://github.com/nf-core-bot), [@pinin4fjords](https://github.com/pinin4fjords))
-- [#40](https://github.com/nf-core/riboseq/pull/40) - Ribotricer orf prediction ([@pinin4fjords](https://github.com/pinin4fjords), review by )
+- [#40](https://github.com/nf-core/riboseq/pull/40) - Ribotricer orf prediction ([@pinin4fjords](https://github.com/pinin4fjords), review by [@maxulysse](https://github.com/maxulysse))
+- [#42](https://github.com/nf-core/riboseq/pull/42) - Add alignment based quantification with Salmon ([@pinin4fjords](https://github.com/pinin4fjords), review by [@maxulysse](https://github.com/maxulysse))

 Initial release of nf-core/riboseq, created with the [nf-core](https://nf-co.re/) template.

24 changes: 18 additions & 6 deletions conf/modules.config
@@ -623,36 +623,48 @@ if (!params.skip_alignment && params.aligner == 'star') {
         withName: '.*:QUANTIFY_STAR_SALMON:SALMON_QUANT' {
             ext.args = { params.extra_salmon_quant_args ?: '' }
             publishDir = [
-                path: { "${params.outdir}/${params.aligner}" },
+                path: { "${params.outdir}/quantification/salmon" },
                 mode: params.publish_dir_mode,
                 saveAs: { filename -> filename.equals('versions.yml') || filename.endsWith('_meta_info.json') ? null : filename }
             ]
         }

-        withName: '.*:QUANTIFY_STAR_SALMON:TX2GENE' {
+        withName: '.*:QUANTIFY_STAR_SALMON:CUSTOM_TX2GENE' {
             publishDir = [
-                path: { "${params.outdir}/${params.aligner}" },
+                path: { "${params.outdir}/quantification" },
                 mode: params.publish_dir_mode,
                 saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
             ]
         }

-        withName: '.*:QUANTIFY_STAR_SALMON:TXIMPORT' {
+        withName: '.*:QUANTIFY_STAR_SALMON:TXIMETA_TXIMPORT' {
             ext.prefix = { "${quant_type}.merged" }
             publishDir = [
-                path: { "${params.outdir}/${params.aligner}" },
+                path: { "${params.outdir}/quantification/salmon" },
                 mode: params.publish_dir_mode,
                 saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
             ]
         }

         withName: '.*:QUANTIFY_STAR_SALMON:SE_.*' {
             publishDir = [
-                path: { "${params.outdir}/${params.aligner}" },
+                path: { "${params.outdir}/quantification/salmon" },
                 mode: params.publish_dir_mode,
                 saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
             ]
         }
+        withName: '.*:QUANTIFY_STAR_SALMON:SE_GENE' {
+            ext.prefix = { "salmon.merged.gene" }
+        }
+        withName: '.*:QUANTIFY_STAR_SALMON:SE_GENE_LENGTH_SCALED' {
+            ext.prefix = { "salmon.merged.gene.length_scaled" }
+        }
+        withName: '.*:QUANTIFY_STAR_SALMON:SE_GENE_SCALED' {
+            ext.prefix = { "salmon.merged.gene.scaled" }
+        }
+        withName: '.*:QUANTIFY_STAR_SALMON:SE_TRANSCRIPT' {
+            ext.prefix = { "salmon.merged.transcript" }
+        }
     }

     if (params.with_umi) {
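The `publishDir` blocks in this hunk all follow the same pattern: a fixed output subdirectory plus a `saveAs` closure that returns `null` for files that should not be published. As a rough illustration only, the Python sketch below mimics that filtering logic outside Nextflow. The function name `publish_target` and its parameters are hypothetical and are not part of the pipeline or of Nextflow.

```python
from pathlib import PurePosixPath
from typing import Optional


def publish_target(
    outdir: str,
    subdir: str,
    filename: str,
    skip_meta_info: bool = False,
) -> Optional[str]:
    """Return the path a file would be published to, or None if it is filtered out.

    Hypothetical helper mirroring the saveAs closures in conf/modules.config:
    - versions.yml is never published;
    - *_meta_info.json is additionally skipped for SALMON_QUANT
      (modelled here by skip_meta_info=True).
    """
    if filename == "versions.yml":
        return None
    if skip_meta_info and filename.endswith("_meta_info.json"):
        return None
    return str(PurePosixPath(outdir) / subdir / filename)


# SALMON_QUANT-style rule: publish under quantification/salmon
salmon_kept = publish_target(
    "results", "quantification/salmon", "quant.sf", skip_meta_info=True
)
salmon_dropped = publish_target(
    "results", "quantification/salmon", "sample1_meta_info.json", skip_meta_info=True
)

# CUSTOM_TX2GENE-style rule: publish under quantification
tx2gene_kept = publish_target("results", "quantification", "tx2gene.tsv")
```

The sketch only models the filename filter and the target directory; it ignores `publish_dir_mode` and anything else Nextflow does when publishing.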
37 changes: 37 additions & 0 deletions docs/output.md
@@ -280,6 +280,43 @@ Read distribution metrics around annotated protein coding regions or based on al
   - `*_psite_offsets.txt`: If the P-site offsets are not provided, txt file containing the derived relative offsets.
 </details>

+## Quantification
+
+Quantification is done by passing transcriptome-level alignment BAM files to Salmon, producing the following outputs:
+
+<details markdown="1">
+<summary>Output files</summary>
+
+- `quantification/`
+  - `tx2gene.tsv`: Tab-delimited file containing transcript-to-gene ID mappings.
+- `quantification/salmon/`
+  - `salmon.merged.gene_counts.tsv`: Matrix of gene-level raw counts across all samples.
+  - `salmon.merged.gene_tpm.tsv`: Matrix of gene-level TPM values across all samples.
+  - `salmon.merged.gene.rds`: RDS object, loadable in R, containing a [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) container with the TPM (`abundance`), estimated counts (`counts`) and transcript length (`length`) in the assays slot for genes.
+  - `salmon.merged.gene_counts_scaled.tsv`: Matrix of gene-level library size-scaled counts across all samples.
+  - `salmon.merged.gene__scaled.rds`: RDS object, loadable in R, containing a [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) container with the TPM (`abundance`), estimated library size-scaled counts (`counts`) and transcript length (`length`) in the assays slot for genes.
+  - `salmon.merged.gene_counts_length_scaled.tsv`: Matrix of gene-level length-scaled counts across all samples.
+  - `salmon.merged.gene_length_scaled.rds`: RDS object, loadable in R, containing a [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) container with the TPM (`abundance`), estimated length-scaled counts (`counts`) and transcript length (`length`) in the assays slot for genes.
+  - `salmon.merged.transcript_counts.tsv`: Matrix of isoform-level raw counts across all samples.
+  - `salmon.merged.transcript_tpm.tsv`: Matrix of isoform-level TPM values across all samples.
+  - `salmon.merged.transcript.rds`: RDS object, loadable in R, containing a [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) container with the TPM (`abundance`), estimated isoform-level raw counts (`counts`) and transcript length (`length`) in the assays slot for transcripts.
+</details>
+
+Raw outputs from Salmon are available for each sample:
+
+<details markdown="1">
+<summary>Output files</summary>
+
+- `quantification/salmon/<SAMPLE>/`
+  - `aux_info/`: Auxiliary info, e.g. versions and number of mapped reads.
+  - `cmd_info.json`: Information about the Salmon quantification command, version and options.
+  - `lib_format_counts.json`: Number of fragments assigned, unassigned and incompatible.
+  - `libParams/`: Contains the file `flenDist.txt` for the fragment length distribution.
+  - `logs/`: Contains the file `salmon_quant.log` giving a record of Salmon's quantification.
+  - `quant.genes.sf`: Salmon _gene_-level quantification of the sample, including feature length, effective length, TPM, and number of reads.
+  - `quant.sf`: Salmon _transcript_-level quantification of the sample, including feature length, effective length, TPM, and number of reads.
+</details>
+
 ### MultiQC

 <details markdown="1">
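The per-sample `quant.sf` file described in this hunk is a tab-delimited table with the columns `Name`, `Length`, `EffectiveLength`, `TPM` and `NumReads`. A minimal sketch of reading it with the Python standard library follows. The helper `read_quant_sf` and the two example rows are made up for illustration; in practice the handle would come from opening a file under `quantification/salmon/<SAMPLE>/`.

```python
import csv
import io

# Synthetic stand-in for a quant.sf file; the values are invented.
EXAMPLE_QUANT_SF = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1500\t1300.0\t600000.0\t30.0\n"
    "tx2\t900\t700.0\t400000.0\t10.0\n"
)


def read_quant_sf(handle):
    """Parse Salmon's quant.sf columns into a list of dicts with numeric values."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {
            "name": row["Name"],
            "length": float(row["Length"]),
            "effective_length": float(row["EffectiveLength"]),
            "tpm": float(row["TPM"]),
            "num_reads": float(row["NumReads"]),
        }
        for row in reader
    ]


records = read_quant_sf(io.StringIO(EXAMPLE_QUANT_SF))
total_reads = sum(r["num_reads"] for r in records)
total_tpm = sum(r["tpm"] for r in records)
print(f"{len(records)} transcripts, {total_reads:.1f} reads, TPM sum {total_tpm:.0f}")
```

Summing `TPM` is a cheap sanity check, since TPM values within one sample are expected to add up to roughly one million.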
27 changes: 26 additions & 1 deletion modules.json
@@ -25,6 +25,11 @@
                         "git_sha": "1b0ffa4e5aed5b7e3cd4311af31bd3b2c8345051",
                         "installed_by": ["modules"]
                     },
+                    "custom/tx2gene": {
+                        "branch": "master",
+                        "git_sha": "ec155021a9104441bf6a9bae3b55d1b5b0bfdb3a",
+                        "installed_by": ["quantify_pseudo_alignment"]
+                    },
                     "cutadapt": {
                         "branch": "master",
                         "git_sha": "6618151ed69274863dc6fe6d2920afa90abaca1f",
@@ -70,6 +75,11 @@
                         "git_sha": "de5811dd9ca15af1e131806001bcaae909e42021",
                         "installed_by": ["modules"]
                     },
+                    "kallisto/quant": {
+                        "branch": "master",
+                        "git_sha": "de5811dd9ca15af1e131806001bcaae909e42021",
+                        "installed_by": ["quantify_pseudo_alignment"]
+                    },
                     "multiqc": {
                         "branch": "master",
                         "git_sha": "b7ebe95761cd389603f9cc0e0dc384c0f663815a",
@@ -108,7 +118,7 @@
                     "salmon/quant": {
                         "branch": "master",
                         "git_sha": "03a8562231d575c313266c193a980594b941e3ea",
-                        "installed_by": ["fastq_subsample_fq_salmon"]
+                        "installed_by": ["fastq_subsample_fq_salmon", "quantify_pseudo_alignment"]
                     },
                     "samtools/flagstat": {
                         "branch": "master",
@@ -150,11 +160,21 @@
                         "git_sha": "a21faa6a3481af92a343a10926f59c189a2c16c9",
                         "installed_by": ["modules"]
                     },
+                    "summarizedexperiment/summarizedexperiment": {
+                        "branch": "master",
+                        "git_sha": "31751460f9ce9552846e13fdeec6953dcb47132d",
+                        "installed_by": ["quantify_pseudo_alignment"]
+                    },
                     "trimgalore": {
                         "branch": "master",
                         "git_sha": "d2c5e76f291379f3dd403e48e46ed7e6ba5da744",
                         "installed_by": ["fastq_fastqc_umitools_trimgalore"]
                     },
+                    "tximeta/tximport": {
+                        "branch": "master",
+                        "git_sha": "c275c3baac6df8f0c7c500760a0cf014ce7b525d",
+                        "installed_by": ["quantify_pseudo_alignment"]
+                    },
                     "umitools/dedup": {
                         "branch": "master",
                         "git_sha": "3bd4f34e3093c2a16e6a8eefc22242b9b94641db",
@@ -219,6 +239,11 @@
                         "git_sha": "8aa7040ca55a511ee4bc079803a8446db00f34c8",
                         "installed_by": ["subworkflows"]
                     },
+                    "quantify_pseudo_alignment": {
+                        "branch": "master",
+                        "git_sha": "bca4985339b3ba879f457565806deb2377873b83",
+                        "installed_by": ["subworkflows"]
+                    },
                     "utils_nextflow_pipeline": {
                         "branch": "master",
                         "git_sha": "5caf7640a9ef1d18d765d55339be751bb0969dfa",
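Each `modules.json` entry records which component pulled it in via `installed_by`, so the modules brought in by the `quantify_pseudo_alignment` subworkflow can be listed mechanically. The Python sketch below walks any nested JSON structure and collects matching entries. The helper name `components_installed_by` is hypothetical, and the excerpt is a flattened subset of the entries shown in the diff, not the full file layout.

```python
import json

# Flattened excerpt of entries from the diff; the real file nests these deeper.
EXCERPT = """
{
  "custom/tx2gene": {
    "branch": "master",
    "git_sha": "ec155021a9104441bf6a9bae3b55d1b5b0bfdb3a",
    "installed_by": ["quantify_pseudo_alignment"]
  },
  "salmon/quant": {
    "branch": "master",
    "git_sha": "03a8562231d575c313266c193a980594b941e3ea",
    "installed_by": ["fastq_subsample_fq_salmon", "quantify_pseudo_alignment"]
  },
  "trimgalore": {
    "branch": "master",
    "git_sha": "d2c5e76f291379f3dd403e48e46ed7e6ba5da744",
    "installed_by": ["fastq_fastqc_umitools_trimgalore"]
  }
}
"""


def components_installed_by(node, owner):
    """Recursively collect names of entries whose installed_by list contains owner."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if isinstance(value, dict) and "installed_by" in value:
                if owner in value["installed_by"]:
                    found.append(name)
            else:
                # Descend into intermediate levels (repos, URLs, namespaces).
                found.extend(components_installed_by(value, owner))
    return sorted(found)


pulled_in = components_installed_by(json.loads(EXCERPT), "quantify_pseudo_alignment")
print(pulled_in)
```

Because the walk is recursive, the same function should also work on a full `modules.json` loaded with `json.load`, without hard-coding the nesting levels above each entry.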
9 changes: 9 additions & 0 deletions modules/nf-core/custom/tx2gene/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

36 changes: 36 additions & 0 deletions modules/nf-core/custom/tx2gene/main.nf

65 changes: 65 additions & 0 deletions modules/nf-core/custom/tx2gene/meta.yml