PG 2017.1

Moore, Ben edited this page Jun 8, 2017 · 3 revisions

Release notes

  • Includes new 'hybrid' truthsets for NA12878 which merge Genome in a Bottle v3.3.2 calls with PG 2017.1 using k-mer validation
  • Improved confident regions:
    • STRs are masked from confident regions if they are only partially covered
    • Low-density confident regions are filtered out using a sliding window
  • Bug fixes and tweaks:
    • All truthset records are now normalised (left-shifted and reference trimmed)
    • Truthset insertions have an additional confidence base added for proper hap.py counting
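
As an illustration of what "left-shifted and reference trimmed" means for a truthset record, the sketch below applies the widely used left-align-and-trim procedure to a single biallelic record. It is not the code used to build this release: the function name `normalise`, its arguments, and the toy reference sequence are assumptions made for the example.

```python
def normalise(pos, ref, alt, reference):
    """Left-shift and trim one biallelic record.

    pos is the 1-based VCF position and reference is the chromosome
    sequence as a plain string. Hypothetical helper for illustration only.
    """
    if ref == alt:
        raise ValueError("REF and ALT are identical; not a variant")
    while True:
        changed = False
        # Drop a shared trailing base from both alleles.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both one base to the left.
        # (A record at position 1 cannot be extended and is not handled here.)
        if (not ref or not alt) and pos > 1:
            pos -= 1
            base = reference[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True
        if not changed:
            break
    # Drop shared leading bases while both alleles keep at least one base.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


toy_reference = "GGGCACACACAGGG"
# A CA deletion written in the middle of the repeat moves to its leftmost position.
print(normalise(7, "ACA", "A", toy_reference))
# A SNP padded with matching flanking bases is trimmed to a single base.
print(normalise(4, "CACA", "CATA", toy_reference))
```

Multi-allelic records would need the same steps applied to all alleles together, which this sketch does not attempt.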

Details

Hybrid truthset

The method for merging the Genome in a Bottle and Platinum Genomes call sets will be described in a future publication. In brief:

  1. GiaB-exclusive truth variants were identified and merged into the PG 2017.1 NA12878 VCF
  2. A modified version of k-mer filtering (the validation technique introduced with PG 2016.1) was applied to this merged VCF to ensure that haplotype sequences were present in the original alignment data. Modifications were:
    • K-mers were tested on alignments from the lower pedigree only
    • Unphased input records were tested in all possible phasing combinations with nearby PG variants, with passing k-mers phased as appropriate
  3. Validated phased variants were included in the hybrid truth set and added to PG 2017.1 confident regions
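
The idea of testing an unphased record in every phasing combination can be sketched on a toy example as follows. This is a simplified illustration under stated assumptions, not the Platinum Genomes implementation: the function names, the variant representation (0-based position, REF, ALT), and the choice to test every k-mer of the window are all hypothetical, and `read_kmers` stands in for k-mers extracted from the alignments.

```python
from itertools import product


def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def apply_variants(reference, variants):
    """Apply non-overlapping (pos0, ref, alt) variants to a reference window."""
    out, cursor = [], 0
    for pos0, ref, alt in sorted(variants):
        assert reference[pos0:pos0 + len(ref)] == ref, "REF allele mismatch"
        out.append(reference[cursor:pos0])
        out.append(alt)
        cursor = pos0 + len(ref)
    out.append(reference[cursor:])
    return "".join(out)


def passing_phasings(reference, phased, unphased, read_kmers, k):
    """Try every haplotype assignment of the unphased heterozygous variants.

    phased is a list of (variant, haplotype) pairs with haplotype 0 or 1.
    An assignment passes if every k-mer of both resulting haplotype
    sequences is present in read_kmers.
    """
    passing = []
    for assignment in product((0, 1), repeat=len(unphased)):
        haplotypes = [[], []]
        for variant, hap in phased:
            haplotypes[hap].append(variant)
        for variant, hap in zip(unphased, assignment):
            haplotypes[hap].append(variant)
        sequences = [apply_variants(reference, h) for h in haplotypes]
        if all(kmers(s, k) <= read_kmers for s in sequences):
            passing.append(assignment)
    return passing


# Toy data: the true sample carries both SNPs on haplotype 0.
window = "AACCGGTTACGT"
known = ((3, "C", "T"), 0)   # already phased onto haplotype 0
candidate = (8, "A", "G")    # unphased record to be tested
k = 7
true_hap0 = apply_variants(window, [known[0], candidate])
read_kmers = kmers(true_hap0, k) | kmers(window, k)

print(passing_phasings(window, [known], [candidate], read_kmers, k))
```

In this toy case only the assignment that places the candidate on the same haplotype as the known variant is supported by the k-mers, so the record could be phased accordingly. The number of combinations doubles with each unphased record, so a sketch like this is only practical for small local groups of variants.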

Download

Download this release from the releases tab.