VariantContext subcontexting- possible on alleles? #1646
-
@makenzi-nzau Unfortunately there isn't a simple way to do this that is completely correct.

The big issue is that many of the INFO and FORMAT fields refer to the alleles and need to be updated in non-trivial ways when the allele list changes. Standard fields like "PL" are one example: their values have to be recomputed when alleles are subset. For the standard fields there ideally would be support in htsjdk (although I don't believe there is right now, unfortunately). Another, more intractable issue is that arbitrary custom fields may need to change, and there's no way for htsjdk to handle those correctly.

There's a bunch of code for handling a number of common cases in GATK's AlleleSubsettingUtils. You could either use that as a starting point for implementing the cases you care about, or use GATK directly as a library (although it is admittedly a pretty heavyweight addition...).

As a side note, VariantContext objects are designed to be immutable(ish), so you can't directly remove an allele from one. You can make a VariantContextBuilder initialized with your existing VC to start with a copy of it, and then make changes based on that. For instance, you can make 2 copies, each with a different ALT allele list (although you have to fix up the attributes you care about yourself). You probably know which attributes you care about in your VCFs, so you can split those as appropriate.

Sorry there isn't an easy solution available. It would definitely be a good thing for us to add if we ever have extra time.
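To make the "PL" point concrete, here is a minimal sketch in plain Java (no htsjdk calls) of what projecting diploid PL values onto REF plus a single ALT could look like. The class and method names are hypothetical and are not part of htsjdk or GATK. It assumes diploid genotypes in the VCF specification's ordering, where genotype j/k (j <= k) sits at index k*(k+1)/2 + j, and it rescales the result so the smallest value is 0, which is a convention assumed here rather than something htsjdk does for you.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not an htsjdk or GATK class.
final class PlSubsetSketch {

    // Index of diploid genotype j/k in a PL array, per the VCF spec ordering.
    static int plIndex(int j, int k) {
        int lo = Math.min(j, k);
        int hi = Math.max(j, k);
        return hi * (hi + 1) / 2 + lo;
    }

    // Keeps only the genotypes made of REF (allele 0) and one ALT (1-based altIndex),
    // then shifts the values so the minimum is 0 (an assumed convention).
    static int[] subsetPlToSingleAlt(int[] pl, int altIndex) {
        int[] out = {
            pl[plIndex(0, 0)],               // REF/REF
            pl[plIndex(0, altIndex)],        // REF/ALT
            pl[plIndex(altIndex, altIndex)]  // ALT/ALT
        };
        int min = Math.min(out[0], Math.min(out[1], out[2]));
        for (int i = 0; i < out.length; i++) {
            out[i] -= min;
        }
        return out;
    }

    public static void main(String[] args) {
        // One REF and two ALTs: PL order is 0/0, 0/1, 1/1, 0/2, 1/2, 2/2.
        int[] pl = {40, 0, 35, 60, 20, 90};
        System.out.println(Arrays.toString(subsetPlToSingleAlt(pl, 1)));
        System.out.println(Arrays.toString(subsetPlToSingleAlt(pl, 2)));
    }
}
```

Note that the record for the second ALT is not just a slice of the original array: the kept entries are non-contiguous and have to be rescaled. Other allele-indexed fields (for example AD or AC) each need their own handling, and genotypes that referenced a dropped allele need a decision of their own, which is why a generic one-call solution is hard.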
-
Hi!
I'm interested in using HTSJDK 3.0.4 to parse VCF files containing multiallelic 'ALT'-variants.
My application requires me to split or 'project' these multiallelic rows into monoallelic ones.
Which way would the VariantContext API fit this task the best?
More specifically, there is VariantContext.subContextFromSamples(...). Is there an equivalent method for selecting alleles?
The VariantContext documentation mentions the class being 'highly validating' and 'dynamically typed'. Does this imply that a VariantContext object balances itself if an Allele is removed from it?
Thank you for your time!