diff --git a/CHANGELOG.md b/CHANGELOG.md
index 931e6374d..4a308dbb3 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -21,6 +21,7 @@ Piellorieppe is one of the main massif in the Sarek National Park.
 - [#164](https://github.com/nf-core/sarek/pull/164) - Add `--no_gatk_spark` params and tests
 - [#167](https://github.com/nf-core/sarek/pull/167) - Add `--markdup_java_options` documentation
 - [#169](https://github.com/nf-core/sarek/pull/169) - Add `RELEASE_CHECKLIST.md` document
+- [#174](https://github.com/nf-core/sarek/pull/174) - Add `variant_calling.md` documentation
 
 ### Changed - [2.6dev]
 
@@ -47,9 +48,8 @@ Piellorieppe is one of the main massif in the Sarek National Park.
 - [#141](https://github.com/nf-core/sarek/pull/141) - Update `VEP` databases to `99`
 - [#143](https://github.com/nf-core/sarek/pull/143) - Revert `snpEff` cache version to `75` for `GRCh37`
 - [#143](https://github.com/nf-core/sarek/pull/143) - Revert `snpEff` cache version to `86` for `GRCh38`
-- [#152](https://github.com/nf-core/sarek/pull/152), [#158](https://github.com/nf-core/sarek/pull/158) - Update docs
+- [#152](https://github.com/nf-core/sarek/pull/152), [#158](https://github.com/nf-core/sarek/pull/158), [#164](https://github.com/nf-core/sarek/pull/164), [#174](https://github.com/nf-core/sarek/pull/174) - Update docs
 - [#164](https://github.com/nf-core/sarek/pull/164) - Update `gatk4-spark` from `4.1.4.1` to `4.1.6.0`
-- [#164](https://github.com/nf-core/sarek/pull/164) - Update docs
 
 ### Fixed - [2.6dev]
 
diff --git a/README.md b/README.md
index 6cc5d6f8e..1e1028e6f 100644
--- a/README.md
+++ b/README.md
@@ -68,6 +68,7 @@ The nf-core/sarek pipeline comes with documentation about the pipeline, found in
     * [Input files documentation](docs/input.md)
     * [Documentation about containers](docs/containers.md)
 4. [Output and how to interpret the results](docs/output.md)
+    * [Extra documentation on variant calling](docs/variant_calling.md)
     * [Complementary information about ASCAT](docs/ascat.md)
     * [Extra documentation on annotation](docs/annotation.md)
 5. [Troubleshooting](https://nf-co.re/usage/troubleshooting)
diff --git a/docs/README.md b/docs/README.md
index 43e1c6005..0c72bcadb 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -14,6 +14,7 @@ The nf-core/sarek documentation is split into the following files:
     * [Input files documentation](input.md)
     * [Documentation about containers](containers.md)
 4. [Output and how to interpret the results](output.md)
+    * [Extra documentation on variant calling](variant_calling.md)
     * [Complementary information about ASCAT](ascat.md)
     * [Extra documentation on annotation](annotation.md)
 5. [Troubleshooting](https://nf-co.re/usage/troubleshooting)
diff --git a/docs/output.md b/docs/output.md
index 8f16f9e21..1d1cbb8c5 100644
--- a/docs/output.md
+++ b/docs/output.md
@@ -135,7 +135,7 @@ For all samples:
 
 ## Variant Calling
 
-All the results regarding Variant Calling are collected in this directory.
+All the results regarding Variant Calling are collected in this directory. If some results from a variant caller do not appear here, please check out the [Variant calling](./variant_calling.md) documentation.
 
 Recalibrated BAM files can also be used as an input to start the Variant Calling, for more information see [TSV files output information](#tsv-files)
 
@@ -143,7 +143,7 @@ Recalibrated BAM files can also be used as an input to start the Variant Calling
 
 #### FreeBayes
 
-[FreeBayes](https://github.com/ekg/freebayes) is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs, indels, MNPs, and complex events smaller than the length of a short-read sequencing alignment..
+[FreeBayes](https://github.com/ekg/freebayes) is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs, indels, MNPs, and complex events smaller than the length of a short-read sequencing alignment.
 
 For further reading and documentation see the [FreeBayes manual](https://github.com/ekg/freebayes/blob/master/README.md#user-manual-and-guide).
 
diff --git a/docs/usage.md b/docs/usage.md
index a288a6e0c..da2538be2 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -328,8 +328,17 @@ Available: `mapping`, `recalibrate`, `variantcalling` and `annotate`
 
 ### --tools
 
-Use this to specify the tools to run:
-Available: `ASCAT`, `ControlFREEC`, `FreeBayes`, `HaplotypeCaller`, `Manta`, `mpileup`, `MSIsensor`, `Mutect2`, `Strelka`, `TIDDIT`
+Use this parameter to specify the variant calling and annotation tools to be used. For example:
+
+```bash
+--tools 'Strelka,mutect2,SnpEff'
+```
+
+Available variant callers: `ASCAT`, `ControlFREEC`, `FreeBayes`, `HaplotypeCaller`, `Manta`, `mpileup`, `MSIsensor`, `Mutect2`, `Strelka`, `TIDDIT`.
+
+> `/!\` Not all variant callers are available for both germline and somatic variant calling. For more details please check the [variant calling](variant_calling.md) extra documentation.
+
+Available annotation tools: `VEP`, `SnpEff`, `merge`. For more details, please check the [annotation](annotation.md) extra documentation.
 
 ### --sentieon
 
@@ -471,8 +480,8 @@ The syntax for this reference configuration is as follows:
 params {
   genomes {
     'GRCh38' {
-      ac_loci = ''
-      ac_loci_gc = ''
+      ac_loci = ''
+      ac_loci_gc = ''
       bwa = ''
       chr_dir = ''
       chr_length = ''
@@ -481,11 +490,11 @@ params {
       dict = ''
       fasta = ''
       fasta_fai = ''
-      germline_resource = ''
-      germline_resource_index = ''
+      germline_resource = ''
+      germline_resource_index = ''
       intervals = ''
-      known_indels = ''
-      known_indels_index = ''
+      known_indels = ''
+      known_indels_index = ''
       snpeff_db = ''
       species = ''
       vep_cache_version = ''
diff --git a/docs/variant_calling.md b/docs/variant_calling.md
new file mode 100644
index 000000000..ad752e3e9
--- /dev/null
+++ b/docs/variant_calling.md
@@ -0,0 +1,56 @@
+# Variant calling
+
+- [Germline variant calling](#germline-variant-calling)
+- [Somatic variant calling with tumor - normal pairs](#somatic-variant-calling-with-tumor---normal-pairs)
+- [Somatic variant calling with tumor only samples](#somatic-variant-calling-with-tumor-only-samples)
+
+## Germline variant calling
+
+Sarek always performs germline variant calling when a selected variant calling tool supports it.
+You can specify the variant caller to use with the `--tools` parameter (see [usage](./usage.md) for more information).
+
+Germline variant calling can currently only be performed with the following variant callers:
+
+- HaplotypeCaller
+- Manta
+- mpileup
+- Sentieon (check the specific [sentieon](sentieon.md) documentation)
+- Strelka
+- TIDDIT
+
+For more information on the individual variant callers, and where to find the variant calling results, check the [output](output.md) documentation.
+
+## Somatic variant calling with tumor - normal pairs
+
+Sarek performs somatic variant calling if your input TSV file contains tumor / normal pairs (see [input](input.md) documentation for more information).
+Different samples belonging to the same patient, where at least one is marked as normal (`0` in the `Status` column) and at least one is marked as tumor (`1` in the `Status` column), are treated as tumor / normal pairs.
+
+If tumor-normal pairs are provided, both germline variant calling and somatic variant calling will be performed, provided that the selected variant caller allows for it.
+If the selected variant caller allows only for somatic variant calling, then only somatic variant calling results will be generated.
+
+Here is a list of the variant calling tools that support somatic variant calling:
+
+- ASCAT (check the specific [ASCAT](ascat.md) documentation)
+- ControlFREEC
+- FreeBayes
+- Manta
+- MSIsensor
+- Mutect2
+- Sentieon (check the specific [sentieon](sentieon.md) documentation)
+- Strelka
+
+For more information on the individual variant callers, and where to find the variant calling results, check the [output](output.md) documentation.
+
+## Somatic variant calling with tumor only samples
+
+Somatic variant calling with only tumor samples (no matching normal sample) is not recommended by the GATK best practices.
+It is supported by only a limited number of variant callers.
+
+Here is a list of the variant calling tools that support tumor-only somatic variant calling:
+
+- Manta
+- mpileup
+- Mutect2
+- TIDDIT
+
+For more information on the individual variant callers, and where to find the variant calling results, check the [output](output.md) documentation.