-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SIGNOR 3.0 parser. Updated to map SIGNOR phenotypes to GO terms when … #251
base: master
Are you sure you want to change the base?
Conversation
…available from their mappings.
Common/data_sources.py
Outdated
@@ -78,7 +79,8 @@ | |||
PLANT_GOA: ("parsers.GOA.src.loadGOA", "PlantGOALoader"), | |||
REACTOME: ("parsers.Reactome.src.loadReactome", "ReactomeLoader"), | |||
SCENT: ("parsers.scent.src.loadScent", "ScentLoader"), | |||
SGD: ("parsers.SGD.src.loadSGD", "SGDLoader"), | |||
SGD: ("parsers.SIGNOR.src.loadSIGNOR", "SIGNORLoader"), | |||
SIGNOR: ("parsers.SGD.src.loadSGD", "SGDLoader"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry just lurking but noticed the constants might be reversed (signor values in sgd and the other way around)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh my bad!! Thanks for catching this.
Sorry I don't know enough about your project to review, I am a colleague of @DnlRKorn from an entirely different KG land.. 😂 good luck with your work! |
Currently, it cannot be run as written.
Please add files and define variable |
self.data_path is actually set in the init of the SourceDataLoader interface, which all parsers inherit from, and should call with super().init so that is not an issue. Maybe not the easiest to read code design but that's how it works for all them. ORION/Common/loader_interface.py Lines 32 to 37 in b124dfb
|
Fixed issue where SIGNOR parsing fails due to filename not being included in self.data_files list.
Deleted one unnecessary line.
get_data is a function that needs to be called, not a property, so this doesn't work
the typo on the website is "Mechansims" it just happened to work anyway because the mechanism download was the default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we can really say "affects" here. This default predicate would be used for 3 edges in the data where the effect and mechanism are unknown (the values of which define all the predicates in the data). Therefore, we don't have any information except that these two entities are related. We could imply that they affect each other, but I was trying to avoid implications.
I think you stumbled on a bigger problem. I completely agree with you on this example. But take a look at this: {"subject": "UniProtKB:Q8WXG6", "predicate": "biolink:related_to", "object": "UniProtKB:P20336", "primary_knowledge_source": "infores:signor", "knowledge_level": "knowledge_assertion", "agent_type": "manual_agent", "publications": ["PMID:11809763"], "description": ["Rab3A, a member of the Rab3 small G protein family, regulates Ca(2+)-dependent exocytosis of neurotransmitter. The cyclical activation and inactivation of Rab3A are essential for the Rab3A action in exocytosis. GDP-Rab3A is activated to GTP-Rab3A by Rab3 GDP/GTP exchange protein (Rab3 GEP), and GTP-Rab3A is inactivated to GDP-Rab3A by Rab3 GTPase-activating protein (Rab3 GAP)."], "species_context_qualifier": ["NCBITaxon:10116"], "anatomical_context_qualifier": ["BTO:0000142"], "subject_part_qualifier": null, "object_part_qualifier": null} Some how this row of data made two edges. Something else is going on here. The second edge is the one I would expect. |
parsers/SIGNOR/src/loadSIGNOR.py
Outdated
if effect in effect_mapping.keys(): | ||
# Handle edge from mechanism | ||
if mechanism: | ||
self.create_and_parse_edge(row, extractor, mechanism=mechanism) | ||
|
||
for predicate in effect_mapping[effect].keys(): | ||
edge_properties = effect_mapping[effect][predicate] | ||
|
||
# Final edge creation | ||
self.create_and_parse_edge(row, extractor, predicate=predicate, | ||
edge_properties=edge_properties) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks like it has the potential to create multiple edges from the same row
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just submitted a commit that reverses the "related_to" -> "affects" change and also handles this section so that edges are not duplicated. Now when I parse the source it only creates 3 "biolink:related_to" edges and the duplicate case we found only creates 1 edge, as it should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this source should be good to go for inclusion in the graph now.
…cate edge creation.
Updated to map SIGNOR phenotypes to GO terms when available from their mappings. A similar mapping process can be repeated for the complexes, protein families, and stimuli terms if SIGNOR ever updates their mappings.