Do VQSR for HaplotypeCaller calls #89
I am playing with VQSR (Variant Quality Score Recalibration, https://gatk.broadinstitute.org/hc/en-us/articles/360036734411-VariantRecalibrator ), which is available for HaplotypeCaller calls. The procedure marks variants as PASS in the FILTER column of the VCF, classifying germline variants as high or low confidence. What I can see is that we start from ~5M raw germline calls:
and snpEff leaves them intact:
on the other hand, VEP simply ignores those lines where there is no annotation available:
If I run the recalibration on the VEP results, more than half of those are thrown away:
Without recalibration there are ~600 "high-impact" germline variants:
Recalibration throws away ~75% of these (supposedly false-positive) high-impact variants:
Actual script used
Super cool work @szilvajuhos! :-)
I have to correct myself: a colleague (Teresita) pointed out that we (as in Sarek) are using VEP settings that do not print common variants. So VEP works just fine as asked; it is simply not printing the common ones. This implies we should do VQSR before annotation (as planned). I only have to find time to do the actual implementation/testing.
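To make the "VQSR before annotation" idea concrete, here is a minimal sketch in plain Python that tallies the FILTER column of a recalibrated VCF and keeps only PASS records, so that only those would be handed to snpEff/VEP. The helper names (`open_vcf`, `filter_counts`, `keep_pass`) are made up for illustration and are not part of Sarek or GATK; the only assumption about the data is the standard VCF layout, where FILTER is the seventh tab-separated column.

```python
import gzip
from collections import Counter
from typing import Iterable, Iterator


def open_vcf(path: str):
    """Open a plain-text or gzip/bgzip-compressed VCF in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def filter_counts(lines: Iterable[str]) -> Counter:
    """Count variant records per FILTER value (7th VCF column)."""
    counts: Counter = Counter()
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        counts[fields[6]] += 1
    return counts


def keep_pass(lines: Iterable[str]) -> Iterator[str]:
    """Yield all header lines plus the records whose FILTER is PASS."""
    for line in lines:
        if line.startswith("#"):
            yield line
        elif line.strip() and line.split("\t")[6] == "PASS":
            yield line
```

Running `filter_counts(open_vcf(path))` on the recalibrated file, with `path` being whatever VCF the recalibration step produced, would show how many calls survive before any annotation is attempted.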
Right now VQSR is working for WGS samples, but I will need help adding some GATK bundle files to iGenomes and to the config in general. The script I can use for VQSR is below. The comparison is at dev...szilvajuhos:vsqr
Issue by @malinlarsson, moved from SciLifeLab#513
Useful comment by @apeltzer