Generate graphs using cell sets as unifying concept #24
def visualize_rdf_graph(self):
    nx_graph = rdflib_to_networkx_multidigraph(self.graph)
    # Plot the NetworkX instance of the RDF graph
    pos = nx.spring_layout(nx_graph, scale=2, k=2)
    edge_labels = nx.get_edge_attributes(nx_graph, "r")
    nx.draw_networkx_edge_labels(nx_graph, pos, edge_labels=edge_labels)
    # Reuse the same layout so the nodes line up with the edge labels
    nx.draw(nx_graph, pos, with_labels=True)
    plt.show()
This is just a placeholder; I have used OBASK to visualize graphs and examine them for validation purposes.
This is a place where oaklib could really help. Worth talking to @anitacaron about how she uses it for visualising validation.
I had a conversation with Anita yesterday in the office. From what I gathered, it seems that visualising the relations and neighbours of a set of terms is necessary. However, I am unsure if this is something we actually need.
I didn't know the context. I can have a closer look into oaklib.
There's the OboGraph Interface
cl_namespace = Namespace("http://purl.obolibrary.org/obo/CL_")
for curie, label in self.cell_type_dict.items():
    resource = cl_namespace[curie.split(":")[-1]]
    self.graph.add((resource, RDFS.label, Literal(label)))
    for s, _, _ in self.graph.triples((None, self.ns["cell_type"], Literal(label))):
        self.graph.add((s, self.ns["consists_of"], resource))
# add subClassOf between terms in CL enrichment
for _, row in self.enriched_df.iterrows():
    for s, _, _ in self.graph.triples((None, RDFS.label, Literal(row["s_label"]))):
        for o, _, _ in self.graph.triples((None, RDFS.label, Literal(row["o_label"]))):
            self.graph.add((s, RDFS.subClassOf, o))
I'm not entirely certain if any methods, other than simple_enrichment, contribute additional information to the graph. This is because those methods may involve CL terms from a subset and context that are not utilized in annotations.
We should talk about this.
By definition all enrichment methods link to terms related to those used in annotation.
Looking at this again, I think it is clear that I have underspecified. I think the challenge is how we deal with flattening. In the pipelines Anita has worked on, we use ROBOT or Souffle to strip redundancy from the flattened graph. If we're sticking with pure Python, we will need something similar to the Souffle redundancy-stripping algorithm here, which will require some thought.
For this PR I'd suggest an MVP of building a graph based on co-annotation first. We can then move folding in the enrichment graph to a second ticket/PR.
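For reference, a minimal pure-Python sketch of the redundancy stripping mentioned above, under the assumption that it amounts to a transitive reduction of the subClassOf edges and that those edges form a DAG. The function name, the (child, parent) tuple representation, and the example labels are illustrative and not taken from this PR; it is not a port of the Souffle rules.

```python
from collections import defaultdict


def transitive_reduction(edges):
    """Return the (child, parent) edges that are not implied by a longer path.

    Assumption: the edges form a DAG, as a subClassOf hierarchy should.
    """
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)

    def ancestors(node, seen=None):
        # Collect every node reachable from `node` via one or more edges.
        seen = set() if seen is None else seen
        for p in parents.get(node, ()):
            if p not in seen:
                seen.add(p)
                ancestors(p, seen)
        return seen

    kept = set()
    for child, direct in parents.items():
        # A direct parent is redundant when another direct parent reaches it.
        implied = set()
        for p in direct:
            implied |= ancestors(p)
        kept.update((child, p) for p in direct if p not in implied)
    return kept


# Illustrative input: the direct "T cell" -> "leukocyte" edge is redundant.
edges = {
    ("T cell", "lymphocyte"),
    ("lymphocyte", "leukocyte"),
    ("T cell", "leukocyte"),
}
reduced = transitive_reduction(edges)
```

If a NetworkX dependency is acceptable, `networkx.transitive_reduction` does the same job for a directed acyclic graph, so a hand-rolled version like this is only needed if we want to avoid building an intermediate `DiGraph`.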
a221a81 to bb04f1e
* Merged from main
* Updated anndata_analyzer.py
* Removed state and state.l2 from free-text annotations
* Refactored visualize_rdf_graph method
* Refactored save_rdf_graph, visualize_rdf_graph method and added transitive_reduction method
* Format changes in co_annotation_report
* Added state and state.l2 to free-text annotations
This reverts commit 625b237.
Resolves #26
TODO:
Currently, we use enriched_df to add cell type terms to the graph. However, a cell type that has no subClassOf relation to any other cell type is missing from the graph. To address this, it would be better to use the co_annotation report to add the cell type terms.
I use the obs attribute of the anndata object to generate a cell type dictionary of cell type IDs and labels, which I then use to add the cell type terms to the graph. We can then use enriched_df specifically to incorporate the subClassOf relations between those terms.
The root cause of the missing cell terms in the neo4j UI is the way I currently add the cell terms. To address this, I will update the enrich_rdf_graph method.
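A rough sketch of the two-step approach described in this TODO, kept dependency-free for illustration: plain tuples stand in for rdflib triples and lists of dicts stand in for the obs table and enriched_df. The column names `cell_type_id` and `cell_type_label`, the string predicates, and both function names are assumptions made for the example; only `s_label` and `o_label` come from the diff above.

```python
CL_BASE = "http://purl.obolibrary.org/obo/CL_"
RDFS_LABEL = "rdfs:label"              # stand-in for rdflib's RDFS.label
RDFS_SUBCLASS_OF = "rdfs:subClassOf"   # stand-in for rdflib's RDFS.subClassOf


def cell_type_dict_from_obs(obs_rows):
    """Collect the distinct CURIE -> label pairs found in the obs records."""
    # Hypothetical column names; the real obs keys may differ.
    return {row["cell_type_id"]: row["cell_type_label"] for row in obs_rows}


def build_triples(cell_type_dict, enriched_rows):
    triples = set()
    iri_by_label = {}
    # Step 1: add every annotated term, whether or not it has a parent.
    for curie, label in cell_type_dict.items():
        iri = CL_BASE + curie.split(":")[-1]
        iri_by_label[label] = iri
        triples.add((iri, RDFS_LABEL, label))
    # Step 2: use the enrichment rows only for subClassOf links
    # between terms that were added in step 1.
    for row in enriched_rows:
        s = iri_by_label.get(row["s_label"])
        o = iri_by_label.get(row["o_label"])
        if s and o:
            triples.add((s, RDFS_SUBCLASS_OF, o))
    return triples


# Illustrative data: "neuron" has no subClassOf row but must still appear.
obs = [
    {"cell_type_id": "CL:0000084", "cell_type_label": "T cell"},
    {"cell_type_id": "CL:0000542", "cell_type_label": "lymphocyte"},
    {"cell_type_id": "CL:0000540", "cell_type_label": "neuron"},
    {"cell_type_id": "CL:0000084", "cell_type_label": "T cell"},
]
enriched = [{"s_label": "T cell", "o_label": "lymphocyte"}]
triples = build_triples(cell_type_dict_from_obs(obs), enriched)
```

Because the terms are added before the enrichment rows are consulted, a term with no subClassOf relation still ends up in the graph, which is the case that currently goes missing.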