I'm running into an issue where Clair3 produces invalid .g.vcf output resulting in errors when used with downstream tooling, e.g. BCFtools:
Incorrect number of FORMAT/PL values at chr2:14862711, cannot merge. The tag is defined as Number=G, but found
4 values and 3 alleles. See also http://samtools.github.io/bcftools/howtos/FAQ.html#incorrect-nfields
Example output (unfortunately I can't share the input due to its sensitive nature):
pileup.vcf.gz:
chr2 14862711 . C . 7.03 RefCall P GT:GQ:DP:AD:AF:PL 0/0:7:2:1:0.5000:2970
full_alignment.vcf.gz:
chr2 14862711 . C N 7.97 PASS F GT:GQ:DP:AD:AF:PL 0/1:7:3:1,1:0.3333:2970
merge_output.vcf.gz:
chr2 14862711 . C N 7.97 PASS F GT:GQ:DP:AD:AF:PL 0/1:7:3:1,1:0.3333:2970
output.g.vcf.gz:
chr2 14862711 . C N,<NON_REF> 7.97 PASS F GT:GQ:DP:AD:AF:PL 0/1:7:3:1,1,0:0.3333:2970,990,990,990
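To make the mismatch concrete, here is a minimal sketch of the count a Number=G tag is expected to carry, checked against the records above. The helper names `expected_g_count` and `check_pl` are hypothetical and are not part of Clair3 or BCFtools; the sketch assumes a single-sample record and takes the ploidy from the GT field.

```python
import re
from math import comb


def expected_g_count(n_alleles: int, ploidy: int = 2) -> int:
    """Number of possible genotypes: C(n_alleles + ploidy - 1, ploidy)."""
    return comb(n_alleles + ploidy - 1, ploidy)


def check_pl(record: str) -> tuple:
    """Return (found, expected) PL counts for a single-sample VCF data line."""
    # Real VCF lines are tab-separated; whitespace splitting is enough here
    # because none of the fields in these records contain spaces.
    fields = record.split()
    alt = fields[4]
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
    # REF plus every ALT allele; "." in ALT means there is no ALT allele.
    n_alleles = 1 + (0 if alt == "." else len(alt.split(",")))
    ploidy = len(re.split(r"[/|]", sample["GT"]))
    found = len(sample["PL"].split(","))
    return found, expected_g_count(n_alleles, ploidy)


gvcf = "chr2 14862711 . C N,<NON_REF> 7.97 PASS F GT:GQ:DP:AD:AF:PL 0/1:7:3:1,1,0:0.3333:2970,990,990,990"
full = "chr2 14862711 . C N 7.97 PASS F GT:GQ:DP:AD:AF:PL 0/1:7:3:1,1:0.3333:2970"

print(check_pl(gvcf))  # (4, 6)
print(check_pl(full))  # (1, 3)
```

With three alleles (C, N, <NON_REF>) and a diploid genotype, six PL values are expected, but the .g.vcf record carries four. The full_alignment.vcf.gz record already carries a single PL value where three are expected for two alleles.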
I think the error is in the PL computation in case the alternate allele value is N, caused by https://github.com/HKU-BAL/Clair3/blob/v1.0.0/clair3/CallVariants.py#L1395? Since the VCF field PL is of type G, the field must still have one value for each possible genotype according to the VCF specification.

Possibly the following was intended in full_alignment.vcf.gz?