Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add method to query cl slims #8

Merged
merged 6 commits into from
Jun 15, 2023
Merged

Conversation

ubyndr
Copy link
Collaborator

@ubyndr ubyndr commented Jun 14, 2023

I've added slim_list attribute to keep track of slim lists. It is initialised with a list of ontology names, the default value is ["Cell Ontology"]. I've also added added a custom exception class for invalid slim names. minimal_slim_enrichment and full_slim_enrichment raises that exception in case they are called with invalid slim names.
I've added missing anndata dependency.

@ubyndr ubyndr requested a review from dosumis June 14, 2023 21:05
@ubyndr ubyndr linked an issue Jun 14, 2023 that may be closed by this pull request
@ubyndr ubyndr linked an issue Jun 14, 2023 that may be closed by this pull request
@ubyndr ubyndr requested a review from hkir-dev June 15, 2023 09:19
def __init__(self, invalid_slim_list: List[str], valid_slim_list: list[dict[str, str]]):
self.message = (
f"The following slim names are invalid: "
f"{', '.join([slim for slim in invalid_slim_list])}. "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we directly do ', '.join(invalid_slim_list)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦

@@ -15,15 +17,19 @@ def __init__(
file_path: str,
cell_type_field: Optional[str] = "cell_type_ontology_term_id",
context_field: Optional[str] = "tissue_ontology_term_id",
ontology_list_for_slims: Optional[List[str]] = ["Cell Ontology"],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should validate if all ontology titles are valid and raise an exception otherwise.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will having a Enum for ontology titles help users to find the correct ontology names?

from enum import Enum

class OntologyNames(Enum):
    CL = "Cell Ontology"
    ENVO = "The Environment Ontology"
    DPO= "Drosophila Phenotype Ontology (DPO)"
    ....

Then users can provide the list as [OntologyNames.CL.value, OntologyNames.ENVO.value]. If ubergraph is extended, users can still use the string values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this particular case, I think we only care about CL but want to leave an option to over-ride.

Validation should be in pandasaurus itself. Pandasaurus also needs more support for letting users know what is possible. We could have a method that lists all ontologies (referencing that in docstring of method for listing slims.)

It looks like your method takes dc:title of ontology. Easy to retrieve a list of these (although be careful about consistency dc namespace).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like your method takes dc:title of ontology. Easy to retrieve a list of these (although be careful about consistency dc namespace).

This is the query I'm using to retrieve slim details; https://api.triplydb.com/s/lFTW8trIj.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have created INCATools/PandaSaurus#21 to keep track

@@ -102,3 +115,19 @@ def set_enricher_property_list(self, property_list: List[str]):
property_list (List[str]): The list of properties to include in the enrichment analysis.
"""
self.__enricher = Query(self.__seed_list, property_list)

def is_invalid_slim_list(self, slim_list):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is_invalid_slim_list makes me expect this function to return a boolean value. Should we rename it to validate_slim_list ?

@ubyndr ubyndr requested a review from hkir-dev June 15, 2023 11:23
@ubyndr ubyndr merged commit 0761971 into main Jun 15, 2023
@ubyndr ubyndr deleted the 6-add-method-to-query-cl-slims branch June 15, 2023 12:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add method to query CL slims anndata depedency needs specifying?
3 participants