Proposal: allow bijective functions to be applied to S and O prior to applying mapping predicate #61

Open
cmungall opened this issue Jun 8, 2021 · 13 comments

Comments

@cmungall
Contributor

cmungall commented Jun 8, 2021

There are many cases where equivalence/exact is not appropriate, but we want to be more precise than close, etc.

I propose to add two optional columns {sub,ob}ject_transform_function (SF, OF) such that a mapping is read as

SF(S) P OF(O)

E.g.

  1. encoded_by(uniprot:P12345) exactMatch HGNC:2345
  2. measurement_of(dwc:depth_in_meters) exactMatch measurement_of(foo:depth_in_cm)
  3. has_species_neutral_form(zfa:heart) exactMatch has_species_neutral_form(ma:heart)
  4. chebi:citric_acid exactMatch has_conjugate_acid(kegg:citrate)
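A minimal sketch of how such a row might be read, assuming the two proposed columns are spelled `subject_transform_function` and `object_transform_function`. `render_mapping` is a hypothetical helper for illustration only, not part of any SSSOM tooling:

```python
def render_mapping(row):
    """Render a mapping row as 'SF(S) P OF(O)'; an absent function leaves the term bare."""
    def apply(fn, term):
        return f"{fn}({term})" if fn else term

    subject = apply(row.get("subject_transform_function"), row["subject_id"])
    obj = apply(row.get("object_transform_function"), row["object_id"])
    return f"{subject} {row['predicate_id']} {obj}"


rows = [
    {"subject_id": "uniprot:P12345", "predicate_id": "exactMatch",
     "object_id": "HGNC:2345", "subject_transform_function": "encoded_by"},
    {"subject_id": "chebi:citric_acid", "predicate_id": "exactMatch",
     "object_id": "kegg:citrate", "object_transform_function": "has_conjugate_acid"},
]
rendered = [render_mapping(r) for r in rows]
```

A consumer that ignores both optional columns would still see a plain `S P O` row, which is the behaviour the proposal relies on.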

However, this has the undesirable property of losing the 1:1 nature of exact/equivalent/etc.

It may be preferable to use 1:1 functions with the option to include an argument for building the function, e.g.

  1. no change
  2. measurement_in(m)(dwc:depth_in_meters) exactMatch measurement_in(cm)(foo:depth_in_cm)
  3. has_species_neutral_form(Drer)(zfa:heart) exactMatch has_species_neutral_form(Mmus)(ma:heart)
  4. no change

measurement_in(UNIT: u) is a function builder that returns a 1:1 function, e.g.

measurement_in(m) => F, F(10m) = 10, F(100cm) = 1, ...
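The builder idea can be sketched as a closure. This is only an illustration: the unit table, its symbols, and the `(value, unit)` calling convention are assumptions, not something specified in the proposal:

```python
# Illustrative unit table: how many millimetres one of each unit contains.
TO_MM = {"m": 1000, "cm": 10, "mm": 1}


def measurement_in(unit):
    """Build a function F that expresses a (value, unit) quantity as a number in `unit`."""
    def F(value, value_unit):
        return value * TO_MM[value_unit] / TO_MM[unit]
    return F


F = measurement_in("m")
in_m = F(10, "m")       # 10 m expressed in metres
from_cm = F(100, "cm")  # 100 cm expressed in metres
```

For a fixed target unit, each physical length maps to exactly one number and back, which is the 1:1 behaviour the builder is meant to guarantee.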

This preserves 1:1ness of exactMatch, and allows users who don't care about semantic precision to see all 1:1 mappings, at the same time as preserving the precise semantics

The exact specification of the functions may be out of scope for SSSOM itself, but could be handled by ROBOT templates

This does complicate the mapping to OWL, particularly where a logical axiom type is used as the predicate. I am OK with simply saying these should not be translated without a force option

This may seem to be getting away from the Simple in SSSOM, but I would argue this keeps the mapping format simple and usable while dealing with genuinely tricky cases in a way that doesn't sacrifice precision.

@matentzn
Collaborator

matentzn commented Jun 9, 2021

This is very close to what I was thinking, but I called it "pattern" or "complex" matches. This needs to be discussed in a meeting though; it is too complicated for a ticket. I think it's time to organise an SSSOM consortium meeting.

@matentzn
Collaborator

matentzn commented Aug 20, 2021

In the workshop we need to decide whether this is out of scope or in scope.

The transformation function is basically a non-standard string that may or may not coincide with a transformation function in another dataset. That is risky, so these would have to be enums or some such, and there are potentially 1000s.

EDIT, I removed my previous concerns a bit, although it's still hard to see how the transformation function should be applied automatically if it's unclear what constitutes an instance of dwc:depth_in_meters.

@matentzn
Collaborator

EDIT, I removed some of my previous concerns now.

@matentzn
Collaborator

Ok after my conversation with @cmungall it seems that what he calls "bijective functions" here really is a structural pattern-based system - so it does fall under the same sort of logic. If we wanted to keep it entirely general we can add a subject_mapping_pattern property that works by namespace, like upheno:abnormalAnatomicalEntity and then a mapping_pattern_system which is DOSDP - or whatever other system you want to use (OTTR, ROBOT template).

@matentzn
Collaborator

matentzn commented Sep 1, 2021

There is still some disconnect between the complex term level mapping and transformation functions induced over instances of a term represented by subject/object.

Pattern level:

| subject_id | object_id | predicate_id | object_mapping_template | mapping_pattern_system |
| --- | --- | --- | --- | --- |
| ZFA:heart | Uberon:heart\|NCITaxon:zebrafish | owl:equivalentClass | uberon:ssae.yaml | dosdp |

This basically means that you can apply the template named in object_mapping_template (uberon:ssae.yaml) to the fillers in object_id to obtain the expression that is equivalent to subject_id.
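As a rough sketch of that expansion: the contents of uberon:ssae.yaml are not shown in this thread, so the template text below is a made-up stand-in, and real DOSDP processing would go through DOSDP tooling rather than string formatting. `expand_object` is a hypothetical helper:

```python
# Hypothetical stand-in for the pattern text; the real uberon:ssae.yaml is not reproduced here.
SSAE_TEMPLATE = "{entity} and ('part of' some {taxon})"


def expand_object(object_id, template=SSAE_TEMPLATE):
    """Split a '|'-separated object_id into its two fillers and fill the template."""
    entity, taxon = object_id.split("|")
    return template.format(entity=entity, taxon=taxon)


row = {
    "subject_id": "ZFA:heart",
    "object_id": "Uberon:heart|NCITaxon:zebrafish",
    "predicate_id": "owl:equivalentClass",
}
axiom = f"{row['subject_id']} EquivalentTo: {expand_object(row['object_id'])}"
```

The point is only that the row carries the fillers and names the pattern, while the pattern system supplies the logical structure.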

Instance level

| subject_id | object_id | predicate_id | object_mapping_template | mapping_pattern_system |
| --- | --- | --- | --- | --- |
| O1:length_in_cm | O2:length_in_m | skos:exactMatch | ? | ? |

How will you represent this in a computable fashion? Let's say you use a measurement pattern: how will you express the 100× conversion factor using OWL? And how useful is this exactly? In the workshop, we need to decide whether these cases are really in scope here or not.

@matentzn
Collaborator

matentzn commented Sep 2, 2021

Related: #36

@dosumis

dosumis commented Sep 5, 2021

Simpler suggestion for cross-species mappings: just use a specific predicate for them, as a subproperty of related. Cross-species mappings can never really be exact and need to be easily separable from mappings that are. We could potentially further specialise cross-species mappings with subproperties (less sure about these):

related
. cross-species
. . cross-species-equivalent ? # Use for identical logical def apart from clause in_taxon some X ?
. . cross-species broad ?
. . cross-species narrow ?
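A small sketch of why a dedicated predicate makes these mappings easy to separate. The predicate names are just the tentative ones from the hierarchy above (none are registered terms), and the second mapping row uses placeholder IDs:

```python
# Tentative predicate hierarchy from the suggestion above; child -> parent.
PARENT = {
    "cross-species": "related",
    "cross-species-equivalent": "cross-species",
    "cross-species-broad": "cross-species",
    "cross-species-narrow": "cross-species",
}


def is_a(predicate, ancestor):
    """True if `predicate` equals `ancestor` or is a (transitive) subproperty of it."""
    while predicate is not None:
        if predicate == ancestor:
            return True
        predicate = PARENT.get(predicate)
    return False


mappings = [
    ("FBbt:00004936", "cross-species-equivalent", "CL:0000017"),
    ("X:1", "exactMatch", "Y:1"),  # placeholder IDs for a same-species mapping
]
within_species = [m for m in mappings if not is_a(m[1], "cross-species")]
```

A consumer mapping between annotated datasets can then keep or drop the whole cross-species branch with one check.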

I see the major use case here as being consumption of mappings by users who want to map between annotated datasets.

Any user that wants to re-derive the OWL would presumably be sophisticated enough to work out how to do this based on the source ontologies. We should encourage source ontologies to declare taxon specificity.

In the case of mappings between species-general and species-specific anatomy ontologies maintained under OBO, I think OWL files or some standard templating system should be the master, with SSSOM files - aimed at consumers working with annotated datasets - being derived from these.

It might be worth considering similar strategies for other complex mappings here.

@dosumis

dosumis commented Sep 5, 2021

CC @balhoff @gouttegd

@gouttegd
Contributor

gouttegd commented Sep 6, 2021

@dosumis Regarding "mappings between species-general and species-specific anatomy ontologies", this looks good to me, but just to check that I understood correctly what you meant, if you can bear with me.

Currently, for e.g. the FBbt-to-CL mappings, we have a bridge file (src/ontology/bridge/cl-bridge-to-fbbt.obo, in Uberon repo) that looks like this:

[Term]
id: FBbt:00004936
property_value: IAO:00000589 "spermatocyte (drosophila)"
intersection_of: CL:0000017 ! spermatocyte
intersection_of: part_of NCBITaxon:7229

(this bridge being generated from the xrefs found in FBbt)

What you're suggesting is to have instead something like the following:

[Term]
id: FBbt:00004936
property_value: IAO:00000589 "spermatocyte (drosophila)"
relationship: cross_species CL:0000017 ! spermatocyte

(with cross_species being a new RO term?).

That file being either hand-crafted or, preferably, generated from a hand-crafted CSV table (but in any case not generated from xrefs extracted from one of the ontologies being mapped). That same CSV table would then also be used to generate a SSSOM file like the following:

subject_id	predicate_id	object_id	match_type	subject_label	object_label
FBbt:00004936	RO:cross_species	CL:0000017	HumanCurated	spermatocyte	spermatocyte
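To make the "one table, two outputs" idea concrete, here is a rough sketch using only the Python standard library (not SSSOM-py or ROBOT). The function names are hypothetical, `RO:cross_species` is the not-yet-existing term floated above, and the stanza omits the IAO label annotation:

```python
import csv
import io

# One curated row; in practice this would be read from the hand-crafted CSV table.
curated = [
    {"subject_id": "FBbt:00004936", "object_id": "CL:0000017",
     "subject_label": "spermatocyte", "object_label": "spermatocyte"},
]

COLUMNS = ["subject_id", "predicate_id", "object_id", "match_type",
           "subject_label", "object_label"]


def to_sssom_tsv(rows, predicate="RO:cross_species", match_type="HumanCurated"):
    """Serialise curated rows as a tab-separated table with the columns shown above."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({**row, "predicate_id": predicate, "match_type": match_type})
    return out.getvalue()


def to_bridge_stanza(row, relation="cross_species"):
    """Emit the OBO stanza form of the same row."""
    return "\n".join([
        "[Term]",
        f"id: {row['subject_id']}",
        f"relationship: {relation} {row['object_id']} ! {row['object_label']}",
    ])


tsv = to_sssom_tsv(curated)
stanza = to_bridge_stanza(curated[0])
```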

Did I get that approximately right?

@matentzn
Collaborator

matentzn commented Sep 6, 2021

The general modelling looks right @gouttegd but these are not alternatives - the SSSOM mapping files and the normal bridge files will live side by side and fulfill different use cases!

@gouttegd
Contributor

gouttegd commented Sep 6, 2021

The SSSOM file will live side by side with the bridge file that uses the new cross_species relationship, all right.

But that bridge file will replace the current form of bridge file that is currently using intersection_of: part_of <taxon id>, right? Or are you saying that we should keep both types of bridge files (in addition to the SSSOM)?

@dosumis

dosumis commented Sep 6, 2021

We should keep the current bridge file, or move to generating an identical one using a template (ROBOT is the most suitable, I think). The SSSOM file, for sharing simple mappings, could be generated from the same template.

@dosumis

dosumis commented Sep 6, 2021

Would be good to have some generic tooling for this. Maybe a standard ROBOT template + some SSSOM-py code to generate mapping files from this?

I don't think the SSSOM file is a massively high priority for FBbt - but no harm in having it and loading this into OxO.
