PG 2017.1
Moore, Ben edited this page Jun 8, 2017 · 3 revisions
- Includes new 'hybrid' truthsets for NA12878 which merge Genome in a Bottle v3.3.2 calls with PG 2017.1 using k-mer validation
- Improved confident regions:
  - STRs are masked from confident regions if they are only partially covered
  - Low density confident regions are filtered via sliding window
- Bug fixes and tweaks:
  - All truthset records are now normalised (left-shifted and reference trimmed)
  - Truthset insertions have an additional confidence base added for proper hap.py counting
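As an illustration of what "left-shifted and reference trimmed" means, here is a minimal sketch of the commonly used normalisation procedure for a single biallelic record. It is not the code used to build this release; the function name `normalise` and the 0-based coordinate convention are assumptions made for the example.

```python
def normalise(ref_seq, pos, ref, alt):
    """Left-shift and trim one biallelic variant (illustrative sketch only).

    ref_seq: reference sequence of the contig (string)
    pos:     0-based start of the REF allele in ref_seq (assumed convention)
    ref/alt: allele strings
    Returns the normalised (pos, ref, alt).
    """
    while True:
        if ref and alt and ref[-1] == alt[-1]:
            # Trim a shared trailing base.
            ref, alt = ref[:-1], alt[:-1]
        elif (not ref or not alt) and pos > 0:
            # An allele became empty: extend both one base to the left.
            pos -= 1
            base = ref_seq[pos]
            ref, alt = base + ref, base + alt
        else:
            # Nothing left to trim or extend (a variant at the very start
            # of the contig is not handled by this sketch).
            break
    # Trim shared leading bases, keeping at least one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# A CA deletion reported at the right end of a CA repeat moves to the left end.
reference = "GGCACACATT"
print(normalise(reference, 5, "ACA", "A"))  # (1, 'GCA', 'G')
```

After normalisation, equivalent representations of the same indel collapse to one record, which is what allows truth and query calls to be compared position by position.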
The method for merging the Genome in a Bottle and Platinum Genomes call sets will be described in a future publication. In brief:
- GiaB-exclusive truth variants were identified and merged into the PG 2017.1 NA12878 VCF
- A modified version of k-mer filtering (the validation technique introduced with PG 2016.1) was applied to this merged VCF to ensure that haplotype sequences were present in the original alignment data. Modifications were:
  - K-mers were tested on alignments from the lower pedigree only
  - Unphased input records were tested in all possible phasing combinations with nearby PG variants, with passing k-mers phased as appropriate
- Validated phased variants were included in the hybrid truth set and added to PG 2017.1 confident regions
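The phasing test in the steps above can be pictured with a toy sketch. It is not the Platinum Genomes implementation, which has not yet been published. It assumes SNVs only, a single unphased heterozygous candidate, exact k-mer matching against read sequences, and hypothetical helper names (`apply_snvs`, `kmers_covering`, `read_kmers`, `validate_phasings`).

```python
def apply_snvs(ref_seq, snvs):
    """Return ref_seq with (0-based pos, alt base) substitutions applied."""
    seq = list(ref_seq)
    for pos, alt in snvs:
        seq[pos] = alt
    return "".join(seq)


def kmers_covering(seq, pos, k):
    """All k-mers of seq that overlap position pos."""
    start = max(0, pos - k + 1)
    end = min(pos, len(seq) - k)
    return [seq[i:i + k] for i in range(start, end + 1)]


def read_kmers(reads, k):
    """Set of every k-mer observed in the reads."""
    return {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}


def validate_phasings(ref_seq, phased, candidate, reads, k):
    """Try the candidate SNV on each haplotype in turn.

    phased:    two lists of already phased SNVs, one list per haplotype
    candidate: (pos, alt) for the unphased record
    Returns the haplotype indices on which every k-mer covering the
    candidate is present in the reads.
    """
    observed = read_kmers(reads, k)
    passing = []
    for hap in (0, 1):
        hap_seq = apply_snvs(ref_seq, phased[hap] + [candidate])
        if all(km in observed for km in kmers_covering(hap_seq, candidate[0], k)):
            passing.append(hap)
    return passing


# Toy data: haplotype 0 carries a phased SNV at position 2, and the reads
# support the candidate at position 5 on that same haplotype.
reference = "ACGTACGTAC"
phased = [[(2, "T")], []]
reads = ["ACTTAAGTAC", "ACGTACGTAC"]
print(validate_phasings(reference, phased, (5, "A"), reads, k=4))  # [0]
```

In this toy case only one phasing yields k-mers that all occur in the reads, so the candidate would be phased onto haplotype 0. A candidate that passes on neither haplotype would fail validation.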
Download this release from the releases tab.
For more information or to cite Platinum Genomes, see:
- Eberle, MA et al. (2017) A reference data set of 5.4 million phased human variants validated by genetic inheritance from sequencing a three-generation 17-member pedigree. Genome Research, 27:157-164. doi:10.1101/gr.210500.116