improve variant calling documentation #174

Merged
merged 22 commits into from
Apr 2, 2020
4 changes: 2 additions & 2 deletions CHANGELOG.md
Expand Up @@ -21,6 +21,7 @@ Piellorieppe is one of the main massif in the Sarek National Park.
- [#164](https://github.com/nf-core/sarek/pull/164) - Add `--no_gatk_spark` params and tests
- [#167](https://github.com/nf-core/sarek/pull/167) - Add `--markdup_java_options` documentation
- [#169](https://github.com/nf-core/sarek/pull/169) - Add `RELEASE_CHECKLIST.md` document
- [#174](https://github.com/nf-core/sarek/pull/174) - Add `variant_calling.md` documentation

### Changed - [2.6dev]

Expand All @@ -47,9 +48,8 @@ Piellorieppe is one of the main massif in the Sarek National Park.
- [#141](https://github.com/nf-core/sarek/pull/141) - Update `VEP` databases to `99`
- [#143](https://github.com/nf-core/sarek/pull/143) - Revert `snpEff` cache version to `75` for `GRCh37`
- [#143](https://github.com/nf-core/sarek/pull/143) - Revert `snpEff` cache version to `86` for `GRCh38`
- [#152](https://github.com/nf-core/sarek/pull/152), [#158](https://github.com/nf-core/sarek/pull/158) - Update docs
- [#152](https://github.com/nf-core/sarek/pull/152), [#158](https://github.com/nf-core/sarek/pull/158), [#164](https://github.com/nf-core/sarek/pull/164), [#174](https://github.com/nf-core/sarek/pull/174) - Update docs
- [#164](https://github.com/nf-core/sarek/pull/164) - Update `gatk4-spark` from `4.1.4.1` to `4.1.6.0`
- [#164](https://github.com/nf-core/sarek/pull/164) - Update docs

### Fixed - [2.6dev]

1 change: 1 addition & 0 deletions README.md
Expand Up @@ -68,6 +68,7 @@ The nf-core/sarek pipeline comes with documentation about the pipeline, found in
* [Input files documentation](docs/input.md)
* [Documentation about containers](docs/containers.md)
4. [Output and how to interpret the results](docs/output.md)
* [Extra documentation on variant calling](docs/variant_calling.md)
* [Complementary information about ASCAT](docs/ascat.md)
* [Extra documentation on annotation](docs/annotation.md)
5. [Troubleshooting](https://nf-co.re/usage/troubleshooting)
1 change: 1 addition & 0 deletions docs/README.md
Expand Up @@ -14,6 +14,7 @@ The nf-core/sarek documentation is split into the following files:
* [Input files documentation](input.md)
* [Documentation about containers](containers.md)
4. [Output and how to interpret the results](output.md)
* [Extra documentation on variant calling](variant_calling.md)
* [Complementary information about ASCAT](ascat.md)
* [Extra documentation on annotation](annotation.md)
5. [Troubleshooting](https://nf-co.re/usage/troubleshooting)
4 changes: 2 additions & 2 deletions docs/output.md
Expand Up @@ -135,15 +135,15 @@ For all samples:

## Variant Calling

All the results regarding Variant Calling are collected in this directory.
All the results regarding Variant Calling are collected in this directory. If some results from a variant caller do not appear here, please check out the [Variant calling](./variant_calling.md) documentation.

Recalibrated BAM files can also be used as input to start Variant Calling. For more information, see [TSV files output information](#tsv-files).

### SNVs and small indels

#### FreeBayes

[FreeBayes](https://github.com/ekg/freebayes) is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs, indels, MNPs, and complex events smaller than the length of a short-read sequencing alignment..
[FreeBayes](https://github.com/ekg/freebayes) is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs, indels, MNPs, and complex events smaller than the length of a short-read sequencing alignment.

For further reading and documentation see the [FreeBayes manual](https://github.com/ekg/freebayes/blob/master/README.md#user-manual-and-guide).

25 changes: 17 additions & 8 deletions docs/usage.md
Expand Up @@ -328,8 +328,17 @@ Available: `mapping`, `recalibrate`, `variantcalling` and `annotate`

### --tools

Use this to specify the tools to run:
Available: `ASCAT`, `ControlFREEC`, `FreeBayes`, `HaplotypeCaller`, `Manta`, `mpileup`, `MSIsensor`, `Mutect2`, `Strelka`, `TIDDIT`
Use this parameter to specify the variant calling and annotation tools to be used. For example:

```bash
--tools 'Strelka,mutect2,SnpEff'
```

Available variant callers: `ASCAT`, `ControlFREEC`, `FreeBayes`, `HaplotypeCaller`, `Manta`, `mpileup`, `MSIsensor`, `Mutect2`, `Strelka`, `TIDDIT`.

> `/!\` Not all variant callers are available for both germline and somatic variant calling. For more details please check the [variant calling](variant_calling.md) extra documentation.

Available annotation tools: `VEP`, `SnpEff`, `merge`. For more details, please check the [annotation](annotation.md) extra documentation.
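
As a rough illustration of how a `--tools` value breaks down, the Python sketch below splits the comma-separated string and sorts each entry into variant callers or annotation tools, using the two lists above. The case-insensitive matching is an assumption suggested by the mixed-case example (`Strelka,mutect2,SnpEff`), and the function name `split_tools` is hypothetical; neither is taken from the Sarek code.

```python
# Tool names copied from the lists above, stored in lowercase for matching.
VARIANT_CALLERS = {
    "ascat", "controlfreec", "freebayes", "haplotypecaller", "manta",
    "mpileup", "msisensor", "mutect2", "strelka", "tiddit",
}
ANNOTATION_TOOLS = {"vep", "snpeff", "merge"}


def split_tools(tools_param):
    """Split a --tools string into (variant callers, annotation tools, unknown)."""
    callers, annotators, unknown = [], [], []
    for raw in tools_param.split(","):
        name = raw.strip().lower()  # assumption: names are case-insensitive
        if not name:
            continue  # ignore empty entries such as a trailing comma
        if name in VARIANT_CALLERS:
            callers.append(name)
        elif name in ANNOTATION_TOOLS:
            annotators.append(name)
        else:
            unknown.append(name)
    return callers, annotators, unknown


callers, annotators, unknown = split_tools("Strelka,mutect2,SnpEff")
print("variant callers:", callers)
print("annotation tools:", annotators)
print("unrecognised:", unknown)
```

Reporting unrecognised names separately, rather than silently dropping them, makes a typo in the tool list easier to spot before a long run starts.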

### --sentieon

Expand Up @@ -471,8 +480,8 @@ The syntax for this reference configuration is as follows:
params {
genomes {
'GRCh38' {
ac_loci = '<path to the acLoci file>'
ac_loci_gc = '<path to the acLociGC file>'
ac_loci = '<path to the ac_loci file>'
ac_loci_gc = '<path to the ac_loci_gc file>'
bwa = '<path to the bwa indexes>'
chr_dir = '<path to the chromosomes folder>'
chr_length = '<path to the chromosomes length file>'
Expand All @@ -481,11 +490,11 @@ params {
dict = '<path to the dict file>'
fasta = '<path to the fasta file>'
fasta_fai = '<path to the fasta index>'
germline_resource = '<path to the germlineResource file>'
germline_resource_index = '<path to the germlineResource index>'
germline_resource = '<path to the germline_resource file>'
germline_resource_index = '<path to the germline_resource index>'
intervals = '<path to the intervals file>'
known_indels = '<path to the knownIndels file>'
known_indels_index = '<path to the knownIndels index>'
known_indels = '<path to the known_indels file>'
known_indels_index = '<path to the known_indels index>'
snpeff_db = '<version of the snpEff DB>'
species = '<species>'
vep_cache_version = '<version of the VEP cache>'
56 changes: 56 additions & 0 deletions docs/variant_calling.md
@@ -0,0 +1,56 @@
# Variant calling

- [Germline variant calling](#germline-variant-calling)
- [Somatic variant calling with tumor - normal pairs](#somatic-variant-calling-with-tumor---normal-pairs)
- [Somatic variant calling with tumor only samples](#somatic-variant-calling-with-tumor-only-samples)

## Germline variant calling

With Sarek, germline variant calling is always performed when a selected variant calling tool supports it.
You can specify the variant callers to use with the `--tools` parameter (see [usage](./usage.md) for more information).

Germline variant calling can currently only be performed with the following variant callers:

- HaplotypeCaller
- Manta
- mpileup
- Sentieon (check the specific [sentieon](sentieon.md) documentation)
- Strelka
- TIDDIT

For more information on the individual variant callers, and where to find the variant calling results, check the [output](output.md) documentation.

## Somatic variant calling with tumor - normal pairs

With Sarek, somatic variant calling is performed if your input TSV file contains tumor / normal pairs (see the [input](input.md) documentation for more information).
Different samples belonging to the same patient, where at least one is marked as normal (`0` in the `Status` column) and at least one is marked as tumor (`1` in the `Status` column), are treated as tumor / normal pairs.

If tumor-normal pairs are provided, both germline variant calling and somatic variant calling will be performed, provided that the selected variant caller allows for it.
If the selected variant caller allows only for somatic variant calling, then only somatic variant calling results will be generated.
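
The pairing rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the rows are simplified to `(patient, status, sample)` tuples rather than the full TSV layout, the function name `tumor_normal_pairs` is hypothetical, and pairing every normal with every tumor of the same patient is an assumption about how multiple samples are combined.

```python
from collections import defaultdict


def tumor_normal_pairs(rows):
    """Return (patient, normal_sample, tumor_sample) tuples.

    `rows` holds (patient, status, sample) tuples, where status is 0 for
    normal and 1 for tumor. This is a simplified stand-in for the input TSV.
    """
    normals = defaultdict(list)
    tumors = defaultdict(list)
    for patient, status, sample in rows:
        if status == 0:
            normals[patient].append(sample)
        elif status == 1:
            tumors[patient].append(sample)
        else:
            raise ValueError(f"unexpected status {status!r} for sample {sample}")

    pairs = []
    for patient in sorted(normals):
        for normal in normals[patient]:
            # Assumption: each normal is paired with each tumor of the patient.
            for tumor in tumors.get(patient, []):
                pairs.append((patient, normal, tumor))
    return pairs


rows = [
    ("patient1", 0, "normal1"),
    ("patient1", 1, "tumor1"),
    ("patient1", 1, "relapse1"),
    ("patient2", 1, "tumor2"),  # no matching normal, so no pair is formed
]
for pair in tumor_normal_pairs(rows):
    print(pair)
```

In this sketch, `patient2` yields no pair because it has no normal sample, which corresponds to the tumor-only case covered later in this document.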

Here is a list of the variant calling tools that support somatic variant calling:

- ASCAT (check the specific [ASCAT](ascat.md) documentation)
- ControlFREEC
- FreeBayes
- Manta
- MSIsensor
- Mutect2
- Sentieon (check the specific [sentieon](sentieon.md) documentation)
- Strelka

For more information on the individual variant callers, and where to find the variant calling results, check the [output](output.md) documentation.

## Somatic variant calling with tumor only samples

Somatic variant calling with only tumor samples (no matching normal sample) is not recommended by the GATK best practices.
It is supported by only a limited number of variant callers.

Here is a list of the variant calling tools that support tumor-only somatic variant calling:

- Manta
- mpileup
- Mutect2
- TIDDIT

For more information on the individual variant callers, and where to find the variant calling results, check the [output](output.md) documentation.
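
The three lists in this document can be combined into a single lookup, which may help when checking why a caller produced no results for a given sample type. The Python sketch below is an illustration only: the tool names are copied from the lists above, while the mode labels (`germline`, `somatic_paired`, `somatic_tumor_only`) and the function name `supported_modes` are hypothetical and not part of Sarek.

```python
# Support lists copied from the three sections of this document.
SUPPORT = {
    "germline": {
        "HaplotypeCaller", "Manta", "mpileup", "Sentieon", "Strelka", "TIDDIT",
    },
    "somatic_paired": {
        "ASCAT", "ControlFREEC", "FreeBayes", "Manta", "MSIsensor",
        "Mutect2", "Sentieon", "Strelka",
    },
    "somatic_tumor_only": {"Manta", "mpileup", "Mutect2", "TIDDIT"},
}

# Lowercased copy so that lookups do not depend on capitalisation.
SUPPORT_LOWER = {
    mode: {tool.lower() for tool in tools} for mode, tools in SUPPORT.items()
}


def supported_modes(tool):
    """Return the variant calling modes listed in this document for `tool`."""
    name = tool.strip().lower()
    return [mode for mode, tools in SUPPORT_LOWER.items() if name in tools]


for tool in ("Manta", "ASCAT", "mpileup"):
    print(tool, supported_modes(tool))
```

An empty list means the tool appears in none of the lists above, either because it is not a variant caller or because its name is misspelled.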