Malformed VCF due to R reference base #124
Comments
Hello @dennishendriksen, in cuteSV the content of the REF column is extracted from the input reference genome, so the R nucleotides come from the reference genome. For the three questions mentioned above:
Hope it will help. Best,
Thank you for your quick reply.
I can confirm that this is indeed what is stored in the reference genome. The
In order to produce valid VCF you might consider the following for a proper fix:
From https://samtools.github.io/hts-specs/VCFv4.3.pdf page 8:
Check:
Alternative check:
Additional examples:
This seems like a cuteSV issue? Working around this issue is tricky, because the
That makes perfect sense. I'll do some postprocessing afterwards.
We are still struggling with working around this issue. Any thoughts?
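A minimal sketch of what such postprocessing could look like, assuming a plain-text, tab-separated VCF. The helper names `concrete_base` and `sanitize_ref` are hypothetical and not part of cuteSV. The mapping assumes the VCF v4.3 rule of reducing an ambiguity code to the alphabetically first base it represents (so R becomes A). It only rewrites the REF column; ALT alleles that repeat the reference base would need their own handling.

```python
# Hypothetical post-processing helper, not part of cuteSV.
# Each IUPAC ambiguity code maps to the alphabetically first base it represents.
IUPAC_TO_BASE = {
    "R": "A",  # A or G
    "Y": "C",  # C or T
    "S": "C",  # C or G
    "W": "A",  # A or T
    "K": "G",  # G or T
    "M": "A",  # A or C
    "B": "C",  # C, G or T
    "D": "A",  # A, G or T
    "H": "A",  # A, C or T
    "V": "A",  # A, C or G
}


def concrete_base(base: str) -> str:
    """Map one base to a concrete base, preserving upper/lower case."""
    replacement = IUPAC_TO_BASE.get(base.upper())
    if replacement is None:
        return base  # A, C, G, T, N (or anything unexpected) is left alone
    return replacement if base.isupper() else replacement.lower()


def sanitize_ref(line: str) -> str:
    """Return a VCF line with ambiguity codes in the REF column replaced."""
    if line.startswith("#"):
        return line  # header lines are passed through unchanged
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 4:
        return line  # not a data line this sketch understands
    fields[3] = "".join(concrete_base(b) for b in fields[3])
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(fields) + newline
```

Applied line by line to the cuteSV output (for example while copying from one file handle to another), this leaves headers and all other columns untouched.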
Hello @tjiangHIT, @Meltpinkg and cuteSV developers,
cuteSV v2.0.3 can produce malformed VCF output containing R nucleotides in the REF column. These are not allowed according to the VCF v4.2 specification: REF - reference base(s): Each base must be one of A,C,G,T,N (case insensitive). The VCF v4.3 specification additionally mentions: IUPAC ambiguity codes should be converted to a concrete base. Downstream tools such as HTSJDK throw an error correctly stating that the VCF is malformed. For our use case this results in an analysis that cannot complete.
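To illustrate the rule quoted above, here is a small sketch that flags records whose REF column contains anything other than A, C, G, T or N (case insensitive). The function name `find_invalid_ref` is hypothetical, and the input is assumed to be an iterable of tab-separated VCF lines; this is not how HTSJDK implements its check.

```python
import re

# REF must consist only of A, C, G, T or N, in either case.
VALID_REF = re.compile(r"[ACGTN]+", re.IGNORECASE)


def find_invalid_ref(lines):
    """Return (line_number, REF) pairs for records with an invalid REF."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # too short to carry a REF column
        if not VALID_REF.fullmatch(fields[3]):
            bad.append((number, fields[3]))
    return bad
```

An empty result means every REF value passes this particular check; it says nothing about the validity of other columns.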
Example:
Command:
We've replaced Sniffles2 with cuteSV which suffers from what seems to be the same issue including the REF_MISMATCH warnings mentioned in that issue. Actually this issue was the primary reason for the switch.
Greetings,
@dennishendriksen