different bcftools consensus versions produce different results at multiple ALT variants #1708

Closed
MarieLataretu opened this issue Apr 25, 2022 · 3 comments

@MarieLataretu

Hi there,

With two different bcftools versions installed via conda (bcftools 1.11 using htslib 1.11, and bcftools 1.14 using htslib 1.14), I get different results for the same command and input (a SARS-CoV-2 sample):

bcftools consensus \
    -I \
    -o ${vcf.baseName}.iupac_consensus.tmp \
    -f ${reference} \
    -m ${mask_bed} \
    --sample ${name} \
    ${vcf}

At this position, bcftools 1.11 applies the deletion, whereas bcftools 1.14 applies the substitution:

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	sample
MN908947.3	11073	.	TC	T,TT	45167.2	PASS	AB=0.155887,0.841349;ABP=1863.62,1833.85;AC=1,1;AF=0.5,0.5;AN=2;AO=282,1522;CIGAR=1M1D1M,1M1X1M;DP=1809;DPB=1724.67;DPRA=0,0;EPP=11.9118,36.8465;EPPR=9.52472;GTI=0;LEN=1,1;MEANALT=4,4;MQM=59.9504,59.9382;MQMR=60;NS=1;NUMALT=2;ODDS=76.3545;PAIRED=0.985816,0.979632;PAIREDR=1;PAO=9.66667,9.66667;PQA=321.333,321.333;PQR=321.333;PRO=9.66667;QA=9034,50976;QR=102;RO=3;RPL=66,473;RPP=176.266,476.363;RPPR=9.52472;RPR=216,1049;RUN=1,1;SAF=186,1021;SAP=65.3824,388.796;SAR=96,501;SRF=0;SRP=9.52472;SRR=3;TYPE=del,snp;technology.ILLUMINA=1,1	GT:DP:AD:RO:QR:AO:QA:GL	1/2:1809:3,282,1522:3:102:282,1522:9034,50976:-4846.67,-4119.65,-4043.04,-719.418,0,-269.525

[Screenshot; from top: reference, 1.11, 1.14]

Variants are called with freebayes and normalized with bcftools.

Is that behavior expected? I haven't found anything matching in the changelog.

@MarieLataretu
Author

After some testing, it seems that 1.14 always chooses the longer variant, independent of FORMAT/GT.

@pd3
Member

pd3 commented May 2, 2022

The difference was probably introduced by commit 89d61e6.
Also check the difference between -I and -H I: the former applies REF+ALT regardless of the genotype, while the latter uses FORMAT/GT.

@pd3 pd3 closed this as completed May 2, 2022
@MarieLataretu
Author

I see, FORMAT/GT is used only with --sample and -H.
And the correct IUPAC code for that example with -I is TT.
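
To see the distinction between -I and -H I on your own data, a side-by-side run along these lines may help. This is only a sketch: ref.fa, calls.vcf.gz and the sample name are placeholders (not from this thread), the VCF is assumed to be bgzipped and indexed, and the behavior described in the comments is the one reported above for bcftools 1.14, which other versions may not share.

```shell
#!/bin/sh
# Placeholder inputs (assumptions, not taken from the thread).
ref=ref.fa
vcf=calls.vcf.gz
sample=sample

# -I: IUPAC codes built from REF+ALT, regardless of the genotype
# (as described in this thread for bcftools 1.14).
cmd_all_alleles="bcftools consensus -I --sample $sample -f $ref -o consensus.I.fa $vcf"

# -H I together with --sample: IUPAC codes derived from FORMAT/GT.
cmd_genotype="bcftools consensus -H I --sample $sample -f $ref -o consensus.HI.fa $vcf"

if command -v bcftools >/dev/null 2>&1 && [ -f "$ref" ] && [ -f "$vcf" ]; then
    # Run both modes and show where the consensus sequences differ.
    $cmd_all_alleles
    $cmd_genotype
    diff consensus.I.fa consensus.HI.fa || true
else
    # Dry run: bcftools or the inputs are missing, so only print the commands.
    echo "$cmd_all_alleles"
    echo "$cmd_genotype"
fi
```

Any line reported by diff marks a site where the genotype changes which allele ends up in the consensus.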
