Fixed support for newer Gencode GTF versions #8351
@jamesemery How "future-proof" is this PR? That is, how likely is it that future releases of Gencode will break the parser again? Has the parser been relaxed to the point where it will tolerate the addition of new fields, etc.?
@droazen I tried to relax the parser as far as I could, but stopped short of relaxing absolutely everything. I have a test verifying that any of the optional fields we previously special-cased can now hold any arbitrary value (which was the problem that sank us here). However, if they make DRASTIC changes in future Gencode releases (like adding new top-level transcript types or inventing a reference orientation other than + or -), then all bets are off...
...roadinstitute/hellbender/tools/funcotator/dataSources/gencode/GencodeFuncotationFactory.java
Looks pretty good. A few questions and change requests.
...a/org/broadinstitute/hellbender/tools/funcotator/dataSources/gencode/GencodeFuncotation.java
src/main/java/org/broadinstitute/hellbender/utils/codecs/gtf/GencodeGtfCodec.java
...roadinstitute/hellbender/tools/funcotator/dataSources/gencode/GencodeFuncotationFactory.java
src/main/java/org/broadinstitute/hellbender/utils/codecs/gtf/GencodeGtfFeature.java
src/main/java/org/broadinstitute/hellbender/utils/codecs/gtf/GencodeGtfFeature.java
…from GencodeGTFFeature (which had a bug with ordering before)
58d2262 to 125aa5a
@jonn-smith Responded to comments; back to you. I fully got rid of the old unparsed "anonymousOptionalFields" string, which was occasionally relevant for non-blessed fields in Gencode. There is an enum listing the known optional fields for Gencode, but everything now gets parsed into the same key-value list, which should make phase 2 easier to finish.
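For readers following along, here is a rough sketch of the "everything goes into one key-value list" idea. It is not the code from this PR: the class name `GtfAttributeSketch`, the enum `KnownOptionalField` and its three members, and the method `parseAttributes` are all hypothetical, and the sketch assumes the usual GTF attribute layout of `key value;` pairs with optionally quoted values.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not the actual GATK classes.
class GtfAttributeSketch {

    // Hypothetical subset of "known" optional keys. Unknown keys are still
    // parsed; this enum only lets callers ask whether a key is recognized.
    enum KnownOptionalField {
        GENE_TYPE, TAG, LEVEL;

        static boolean isKnown(String key) {
            for (KnownOptionalField f : values()) {
                if (f.name().equalsIgnoreCase(key)) {
                    return true;
                }
            }
            return false;
        }
    }

    // Parses the attribute column into an ordered list of key-value pairs.
    // A list (not a map) is used because keys such as "tag" may repeat.
    // Limitation: this naive split would break on a ';' inside a quoted value.
    static List<Map.Entry<String, String>> parseAttributes(String column) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        for (String raw : column.split(";")) {
            String field = raw.trim();
            if (field.isEmpty()) {
                continue;
            }
            int space = field.indexOf(' ');
            if (space < 0) {
                // Key with no value: keep it rather than failing.
                out.add(new SimpleImmutableEntry<>(field, ""));
                continue;
            }
            String key = field.substring(0, space);
            String value = field.substring(space + 1).trim();
            // Strip one pair of surrounding double quotes, if present.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            out.add(new SimpleImmutableEntry<>(key, value));
        }
        return out;
    }

    public static void main(String[] args) {
        String column = "gene_id \"ENSG00000223972.5\"; level 2; tag \"basic\"; "
                + "tag \"CCDS\"; brand_new_field \"anything\";";
        for (Map.Entry<String, String> e : parseAttributes(column)) {
            System.out.println(e.getKey() + " = " + e.getValue()
                    + " (known: " + KnownOptionalField.isKnown(e.getKey()) + ")");
        }
    }
}
```

The point of the design is that an unrecognized key such as `brand_new_field` lands in the same list as the known ones instead of causing a parse failure, so new optional fields in a future release would not need special handling.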
Looks good!
Fixes #7166 #7385
This doesn't contain any of the necessary work to support Gencode GFF3 files yet #. That will (probably) come in a subsequent PR, as it requires a much more substantial refactoring of the Gencode data sources code.