-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Output VRS_Error INFO field for VCF sites where VRS computation fails #301
Conversation
Add try/except to top-level VCF record iterator Add error message generation to except block Re-raise errors in _get_vrs_object()
Separated out each test to make fixing errors easier Tightened passing criteria by asserting "all played" on the cassette
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@korikuzma let me know if you have time to review and merge or if I should go ahead and do it. |
@larrybabb I'll take a look now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks good to me. I made a small comment that can be accepted or rejected
VCF_ESCAPE_MAP = { | ||
"%": "%25", | ||
":": "%3A", | ||
";": "%3B", | ||
"=": "%3D", | ||
",": "%2C", | ||
"\r": "%0D", | ||
"\n": "%0A", | ||
"\t": "%09" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:
, =
, and \r
are not used. Can we remove them? If so, then instead of
for search in [ "%", ",", ";", "\t", "\n" ]:
err_msg = err_msg.replace(search, self.VCF_ESCAPE_MAP[search])
we can do
VCF_ESCAPE_MAP = {
"%": "%25",
";": "%3B",
",": "%2C",
"\n": "%0A",
"\t": "%09"
}
VCF_ESCAPE_MAP_KEYS = VCF_ESCAPE_MAP.keys()
...
for search in VCF_ESCAPE_MAP_KEYS:
err_msg = err_msg.replace(search, self.VCF_ESCAPE_MAP[search])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ehclark do you agree with this feedback? If so, please submit the change if not, please explain why and I'll do the merge.
@larrybabb You're good to merge |
Update the implementation that annotates a VCF with VRS IDs such that when an error occurs, a
VRS_Error
INFO field is added to the site instead of the VRS INFO simply being left blank.The existing implementation caught and suppressed many exceptions at the translation step. These exceptions are still caught and logged, but then re-thrown to propagate to the top-level VCF site iterator.
With the new behavior, every VCF site will be modified according to the following:
VRS_Error
INFO field will be added with the relevant error messageVRS_Allele_IDs
INFO field will be added with the VRS ID for the alleles supported by the available translators and blank for those alleles not supportedPercent, newline, tab, semicolon and comma characters in the error messages are escaped with URL (percent) encoding according to the VCF spec.
The unit tests for VCF annotation have been split out into individual tests to make modifications to the VCR cassette fixtures easier to make and debug. Additional VCR configuration and assertions have been added to enforce a requirement that the cassettes must match exactly what is expected.
See also #297