SchemaView.induced_enum does not induce permissible values from reachable_from #2343

Open
sneakers-the-rat opened this issue Oct 2, 2024 · 2 comments
Labels
bug Something that should work but isn't, with an example and a test case. schemaview

Comments

@sneakers-the-rat
Collaborator

Describe the bug
from discussion in #2303
related to, but seems distinct from: #1690

this is described in the docs as being possible to do, but as far as i can tell it only happens once in the biolink ontology:

enums:
  AnatomicalContextQualifierEnum:
    reachable_from:
      source_ontology: bioregistry:uberon
      source_nodes:
        - UBERON:0001062
      is_direct: false
      relationship_types:
        - rdfs:subClassOf

but schemaview doesn't actually do that, and i'm not sure if the logic to actually compute a reachability query exists anywhere within linkml? i'd actually really like to see it because the description in the metamodel sounds cool.
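for reference, a rough sketch of what a reachability query like the one above might compute, assuming the ontology is already available in memory as a list of (subject, predicate, object) edges. the `reachable_from` function, the edge format, and the `EX:` terms are all made up for illustration; this is not linkml or oaklib code, and it assumes the default direction (descendants of the source nodes) and that source nodes themselves are not included.

```python
from collections import defaultdict, deque

# Hypothetical toy graph: (subject, predicate, object) triples, not real UBERON data.
EDGES = [
    ("EX:b", "rdfs:subClassOf", "EX:a"),
    ("EX:c", "rdfs:subClassOf", "EX:b"),
    ("EX:d", "EX:partOf", "EX:a"),
]


def reachable_from(edges, source_nodes, relationship_types, is_direct=False):
    """Collect nodes that reach any source node via the allowed predicates.

    With is_direct=True only one-hop neighbours are returned; otherwise the
    walk continues transitively.
    """
    # Index object -> subjects, keeping only the allowed relationship types.
    children = defaultdict(set)
    for subj, pred, obj in edges:
        if pred in relationship_types:
            children[obj].add(subj)

    found = set()
    queue = deque(source_nodes)
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in found:
                found.add(child)
                if not is_direct:
                    # Keep walking only for transitive queries.
                    queue.append(child)
    return found


transitive = reachable_from(EDGES, ["EX:a"], ["rdfs:subClassOf"])
direct = reachable_from(EDGES, ["EX:a"], ["rdfs:subClassOf"], is_direct=True)
```

here `transitive` picks up both `EX:b` and `EX:c`, while `direct` stops at `EX:b`; `EX:d` is skipped either way because `EX:partOf` is not in the allowed relationship types. the hard part in practice is getting the edges, not the traversal.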

The docs mention this package (which i also haven't seen before and also looks cool): https://github.com/INCATools/ontology-access-kit and vskit seems to be this? https://github.com/INCATools/ontology-access-kit/blob/main/src/oaklib/utilities/subsets/value_set_expander.py

but i can't start on an impl for linkml because the enum in the biolink model lists source_ontology as bioregistry:uberon and i can't find the bioregistry prefix.

However I can create an expanded schema with vskit, and that yields an enum with 16,121 possible values (!!!!! i understand why this feature was needed !!!!!).

So it seems like the ability to expand a schema needs to be ported over from oaklib, but it has ~30 top-level dependencies so we probably don't want to have linkml-runtime depend on it. That, or we can emit a warning that someone needs to expand the schema before generating from it when reachable_from is encountered in a schema.
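a minimal sketch of the warning option, assuming the schema has already been loaded into a plain dict (e.g. from YAML). `warn_on_dynamic_enums` is a hypothetical helper, not an existing linkml-runtime function, and the wording of the warning and the `ExampleStaticEnum` entry are placeholders:

```python
import warnings

# Schema as a plain dict; the first enum mirrors the biolink example in this issue,
# the second is a hypothetical static enum for contrast.
SCHEMA = {
    "enums": {
        "AnatomicalContextQualifierEnum": {
            "reachable_from": {
                "source_ontology": "bioregistry:uberon",
                "source_nodes": ["UBERON:0001062"],
                "is_direct": False,
                "relationship_types": ["rdfs:subClassOf"],
            }
        },
        "ExampleStaticEnum": {"permissible_values": {"a": {}, "b": {}}},
    }
}


def warn_on_dynamic_enums(schema):
    """Warn once per enum that declares reachable_from but lists no permissible values."""
    flagged = []
    for name, enum in (schema.get("enums") or {}).items():
        enum = enum or {}
        if "reachable_from" in enum and not enum.get("permissible_values"):
            warnings.warn(
                f"Enum {name!r} uses reachable_from and has not been expanded; "
                "expand the schema (for example with vskit from oaklib) "
                "before generating from it.",
                UserWarning,
                stacklevel=2,
            )
            flagged.append(name)
    return flagged


flagged = warn_on_dynamic_enums(SCHEMA)
```

this keeps linkml-runtime free of the oaklib dependency: it only detects the unexpanded enum and points at the external tool.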

idk i don't really have strong feelings about what needs to happen here, i just wanted to raise the issue as a partial "sorry here i'll do some work to pay back how much of your time i took up in the graphql pr" and stop hounding it lol.

Version of LinkML you are using

HEAD@main

Please provide a schema (and if applicable, a data file) that replicates the issue
(biolink schema in repo)

@sneakers-the-rat sneakers-the-rat added the bug Something that should work but isn't, with an example and a test case. label Oct 2, 2024
@cmungall
Member

cmungall commented Oct 2, 2024

Individual generators shouldn't be concerned with dynamic enums(*); falling back to string should be sufficient.
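As a rough illustration of that fallback, assuming a generator that sees each enum as a plain dict: `python_type_for_enum` below is a hypothetical helper, not part of any LinkML generator, and the `Literal[...]` string is just one possible target representation.

```python
def python_type_for_enum(enum):
    """Return a type annotation string for an enum definition.

    Static enums become a Literal of their permissible value names; enums with
    no materialized values (e.g. only reachable_from) fall back to str.
    """
    values = (enum or {}).get("permissible_values") or {}
    if values:
        return "Literal[" + ", ".join(repr(v) for v in values) + "]"
    # Dynamic enum that was never expanded: accept any string.
    return "str"


static_type = python_type_for_enum({"permissible_values": {"a": {}, "b": {}}})
dynamic_type = python_type_for_enum(
    {"reachable_from": {"source_nodes": ["UBERON:0001062"]}}
)
```

Under these assumptions `static_type` is `"Literal['a', 'b']"` and `dynamic_type` is `"str"`, so the generator needs no knowledge of how the dynamic enum would be materialized.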

Dynamic enums are documented at a high level here:
https://linkml.io/linkml/schemas/enums.html#tooling-to-support-dynamic-enums

Some additional context in #274

The challenge here is there are potentially thousands of vocabulary servers, formats, APIs, databases that could serve as a source of dynamic enums, so materialization is considered application/domain specific. OAK deals with a pretty large chunk of the most common ones but there is a longer tail.

Longer term the approach would be to come up with an API standard along the lines of what @hsolbrig originally proposed (then called TCCM), plus reference servers/shim layers (easy to implement for the 95% case with OAK). Runtime would only need to speak that standard, not the long tail of bespoke formats. This is also related to the NWB/NERD proposal, cc @rly.

*Note there was some thought at first that we might have this compile down to a standard set of extensions for JSON-Schema since a lot of the JSON-Schema community have rolled their own bespoke extensions that speak to OLS etc, but it looks like the JSON-Schema community will not adopt this:
https://github.com/orgs/json-schema-org/discussions/142

@sneakers-the-rat
Collaborator Author

> Individual generators shouldn't be concerned with dynamic enums

definitely agreed :)

> materialization is considered application/domain specific.

makes sense. so until there is some api standard, would it make sense for schemaview to emit a warning when it encounters a reachable_from, directing users to oaklib to expand the enum before generating? or what would your desired behavior for linkml be here?
