Proposal: allow bijective functions to be applied to S and O prior to applying mapping predicate #61

cmungall · 2021-06-08T17:06:18Z

There are many cases where equivalence/exact is not appropriate but we want to be more precise than close, etc.

E.g.

uniprot gene-centric reference protein to a gene (cc @sierra-moxon)
two properties representing measurements in different units (see "mapping terms with different syntax for values" #52)
species-neutral to species-specific (uberon, upheno)
mapping between chemical entities in a way that is stereochemistry- or charge-neutral cc @balhoff

I propose to add two optional columns {sub,ob}ject_transform_function (SF, OF) such that a mapping is read as

SF(S) P OF(O)

E.g.

encoded_by(uniprot:P12345) exactMatch HGNC:2345
measurement_of(dwc:depth_in_meters) exactMatch measurement_of(foo:depth_in_cm)
has_species_neutral_form(zfa:heart) exactMatch has_species_neutral_form(ma:heart)
chebi:citric_acid exactMatch has_conjugate_acid(kegg:citrate)
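
To make the `SF(S) P OF(O)` reading concrete, here is a minimal Python sketch that renders a mapping row with the two optional columns. The column names `subject_transform_function` and `object_transform_function` come from the proposal above; the `render_mapping` helper and the row layout are assumptions for illustration, not part of SSSOM or any existing tool.

```python
from typing import Optional


def render_mapping(
    subject_id: str,
    predicate_id: str,
    object_id: str,
    subject_transform_function: Optional[str] = None,
    object_transform_function: Optional[str] = None,
) -> str:
    """Render one mapping row in the 'SF(S) P OF(O)' reading (hypothetical helper)."""

    def wrap(function_name: Optional[str], term: str) -> str:
        # An absent transform column means the term is used as-is.
        return f"{function_name}({term})" if function_name else term

    subject = wrap(subject_transform_function, subject_id)
    obj = wrap(object_transform_function, object_id)
    return f"{subject} {predicate_id} {obj}"


# Two rows taken from the examples in the proposal.
rows = [
    {
        "subject_id": "uniprot:P12345",
        "predicate_id": "exactMatch",
        "object_id": "HGNC:2345",
        "subject_transform_function": "encoded_by",
    },
    {
        "subject_id": "chebi:citric_acid",
        "predicate_id": "exactMatch",
        "object_id": "kegg:citrate",
        "object_transform_function": "has_conjugate_acid",
    },
]

rendered = [render_mapping(**row) for row in rows]
for line in rendered:
    print(line)
```

A consumer that ignores both transform columns still sees a plain `S P O` row, which is the backwards-compatible reading.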

However, this has the undesirable property of losing the 1:1 nature of exact/equivalent/etc.

It may be preferable to use 1:1 functions with the option to include an argument for building the function, e.g.

no change (encoded_by example)
measurement_in(m)(dwc:depth_in_meters) exactMatch measurement_in(cm)(foo:depth_in_cm)
has_species_neutral_form(Drer)(zfa:heart) exactMatch has_species_neutral_form(Mmus)(ma:heart)
no change (has_conjugate_acid example)

measurement_in(UNIT: u) is a function builder that returns a 1:1 function, e.g.

measurement_in(m) => F, F(10m) = 10, F(100cm) = 1, ...
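
As a rough illustration of the function-builder idea, the sketch below implements `measurement_in` in Python as a closure. The unit table and the `(value, unit)` calling convention are assumptions made for this example; the proposal does not say how quantities would be represented.

```python
from typing import Callable

# Assumed conversion table for the sketch: how many of each unit make one metre.
UNITS_PER_METER = {"m": 1, "cm": 100}


def measurement_in(unit: str) -> Callable[[float, str], float]:
    """Build a function that expresses a length as a bare number in `unit`."""
    target_per_meter = UNITS_PER_METER[unit]

    def convert(value: float, value_unit: str) -> float:
        # Convert to metres first, then to the target unit.
        return value / UNITS_PER_METER[value_unit] * target_per_meter

    return convert


F = measurement_in("m")
print(F(10, "m"))    # 10 m expressed in metres
print(F(100, "cm"))  # 100 cm expressed in metres
```

For a fixed target unit, each physical length corresponds to exactly one number and vice versa, which is the 1:1 property the proposal relies on.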

This preserves 1:1ness of exactMatch, and allows users who don't care about semantic precision to see all 1:1 mappings, at the same time as preserving the precise semantics

The exact specification of the functions may be out of scope for SSSOM itself, but could be handled by robot templates

This does complicate the mapping to OWL, particularly where a logical axiom type is used as the predicate. I am OK with simply saying these should not be translated without a force option

This may seem to be getting away from the Simple in SSSOM, but I would argue this keeps the mapping format simple and usable while dealing with genuinely tricky cases in a way that doesn't sacrifice precision.

matentzn · 2021-06-09T07:46:49Z

This is very close to what I was thinking, but I called it "pattern" or "complex" matches. This needs to be discussed in a meeting though, as it is too complicated for a ticket. I think it's time to organise an SSSOM consortium meeting.

matentzn · 2021-08-20T11:02:24Z

In the workshop we need to decide whether this is out of scope or in scope.

The transformation function is basically a non-standard string that may or may not coincide with a transformation function in another dataset. That is risky, so these would have to be enums or some such, and there are potentially 1000s.
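
One way to picture the enum idea is a closed vocabulary that rejects free-text function names. The Python sketch below is only an illustration: the member names are lifted from the examples earlier in this thread and are not a proposed controlled vocabulary, and `parse_transform` is a hypothetical helper.

```python
from enum import Enum


class TransformFunction(Enum):
    """Closed set of transform function names (illustrative values only)."""

    ENCODED_BY = "encoded_by"
    MEASUREMENT_OF = "measurement_of"
    HAS_SPECIES_NEUTRAL_FORM = "has_species_neutral_form"
    HAS_CONJUGATE_ACID = "has_conjugate_acid"


def parse_transform(value: str) -> TransformFunction:
    """Return the enum member for `value`, or raise a descriptive ValueError."""
    try:
        return TransformFunction(value)
    except ValueError:
        allowed = ", ".join(member.value for member in TransformFunction)
        raise ValueError(
            f"unknown transform function {value!r}; expected one of: {allowed}"
        ) from None


print(parse_transform("encoded_by"))
```

With a check like this, a variant spelling such as `encodedBy` fails at load time instead of silently becoming a second, incompatible function name.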

EDIT, I removed my previous concerns a bit, although it's still hard to see how the transformation function should be applied automatically if it's unclear what constitutes an instance of dwc:depth_in_meters.

matentzn · 2021-08-20T11:07:20Z

EDIT, I removed some of my previous concerns now.

matentzn · 2021-08-30T17:47:17Z

Ok after my conversation with @cmungall it seems that what he calls "bijective functions" here really is a structural pattern-based system - so it does fall under the same sort of logic. If we wanted to keep it entirely general we can add a subject_mapping_pattern property that works by namespace, like upheno:abnormalAnatomicalEntity and then a mapping_pattern_system which is DOSDP - or whatever other system you want to use (OTTR, ROBOT template).

matentzn · 2021-09-01T13:36:30Z

There is still some disconnect between the complex term level mapping and transformation functions induced over instances of a term represented by subject/object.

Pattern level:

This basically means that you can apply the object_mapping_template to the uberon:ssae.yaml to obtain the expression that is equivalent to subject_id.

Instance level

How will you represent this in a computable fashion? Let's say you use a measurement pattern - how will you express the 10* using OWL? And how useful is this exactly? In the workshop, we need to decide whether these cases are really in scope here or not.

matentzn · 2021-09-02T12:18:49Z

Related: #36

dosumis · 2021-09-05T09:23:48Z

Simpler suggestion for X species mappings: just use a specific predicate for it as a subproperty of related. Cross-species mappings could never really be exact and need to be easily separable from mappings that are. We could potentially further specialise cross-species mappings with subproperties (less sure about these):

related
. cross-species
. . cross-species-equivalent ? # Use for identical logical def apart from clause in_taxon some X ?
. . cross-species broad ?
. . cross-species narrow ?
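
To show how such a hierarchy would keep cross-species mappings "easily separable", here is a small Python sketch that walks a subproperty table. The predicate names are hyphenated versions of the tentative names in the list above, and the second mapping row uses made-up `EX:` identifiers; none of this is an agreed vocabulary.

```python
from typing import Optional

# Child predicate -> parent predicate, mirroring the tentative hierarchy above.
PARENT = {
    "cross-species": "related",
    "cross-species-equivalent": "cross-species",
    "cross-species-broad": "cross-species",
    "cross-species-narrow": "cross-species",
}


def is_subproperty_of(predicate: str, ancestor: str) -> bool:
    """True if `predicate` equals `ancestor` or descends from it in PARENT."""
    current: Optional[str] = predicate
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


mappings = [
    ("FBbt:00004936", "cross-species-equivalent", "CL:0000017"),
    ("EX:0000001", "exactMatch", "EX:0000002"),  # made-up identifiers
]

cross_species = [m for m in mappings if is_subproperty_of(m[1], "cross-species")]
same_species = [m for m in mappings if not is_subproperty_of(m[1], "cross-species")]
print(cross_species)
print(same_species)
```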

I see the major use case here as being consumption of mappings by users who want to map between annotated datasets.

Any user that wants to re-derive the OWL would presumably be sophisticated enough to work out how to do this based on the source ontologies. We should encourage source ontologies to declare taxon specificity.

In the case of mappings between species-general and species-specific anatomy ontologies maintained under OBO, I think OWL files or some standard templating system should be the master, with SSSOM files - aimed at consumers working with annotated datasets - being derived from these.

It might be worth considering similar strategies for other complex mappings here.

dosumis · 2021-09-05T09:26:06Z

CC @balhoff @gouttegd

gouttegd · 2021-09-06T14:19:35Z

@dosumis Regarding "mappings between species-general and species-specific anatomy ontologies", this looks good to me, but just to check that I understood correctly what you meant, if you can bear with me.

Currently, for e.g. the FBbt-to-CL mappings, we have a bridge file (src/ontology/bridge/cl-bridge-to-fbbt.obo, in Uberon repo) that looks like this:

[Term]
id: FBbt:00004936
property_value: IAO:00000589 "spermatocyte (drosophila)"
intersection_of: CL:0000017 ! spermatocyte
intersection_of: part_of NCBITaxon:7229

(this bridge being generated from the xrefs found in FBbt)

What you're suggesting is to have instead something like the following:

[Term]
id: FBbt:00004936
property_value: IAO:00000589 "spermatocyte (drosophila)"
relationship: cross_species CL:0000017 ! spermatocyte

(with cross_species being a new RO term?).

That file being either hand-crafted or, preferably, generated from a hand-crafted CSV table (but in any case not generated from xrefs extracted from one of the ontologies being mapped). That same CSV table would then also be used to generate a SSSOM file like the following:

subject_id	predicate_id	object_id	match_type	subject_label	object_label
FBbt:00004936	RO:cross_species	CL:0000017	HumanCurated	spermatocyte	spermatocyte

Did I get that approximately right?
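
For what it's worth, the CSV-to-SSSOM step described above could be sketched in Python with only the standard library. The layout of the hand-crafted CSV is an assumption; the output columns, the `RO:cross_species` predicate (which does not exist yet) and the `HumanCurated` match type are copied from the example table.

```python
import csv
import io

# Assumed layout of the hand-crafted table; only the SSSOM header below
# is taken from the example above.
CURATED_CSV = """subject_id,subject_label,object_id,object_label
FBbt:00004936,spermatocyte,CL:0000017,spermatocyte
"""

SSSOM_COLUMNS = [
    "subject_id", "predicate_id", "object_id",
    "match_type", "subject_label", "object_label",
]


def curated_to_sssom(curated_csv: str) -> str:
    """Turn the curated CSV text into SSSOM-style TSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=SSSOM_COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in csv.DictReader(io.StringIO(curated_csv)):
        writer.writerow({
            "subject_id": row["subject_id"],
            "predicate_id": "RO:cross_species",  # hypothetical new relation
            "object_id": row["object_id"],
            "match_type": "HumanCurated",
            "subject_label": row["subject_label"],
            "object_label": row["object_label"],
        })
    return out.getvalue()


sssom_tsv = curated_to_sssom(CURATED_CSV)
print(sssom_tsv, end="")
```

The same curated table could feed a second writer that emits the bridge-file stanzas, so both outputs stay in sync with one source.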

matentzn · 2021-09-06T14:22:22Z

The general modelling looks right @gouttegd but these are not alternatives - the SSSOM mapping files and the normal bridge files will live side by side and fulfill different use cases!

gouttegd · 2021-09-06T15:04:16Z

The SSSOM file will live side by side with the bridge file that uses the new cross_species relationship, all right.

But that bridge file will replace the current form of bridge file that is currently using intersection_of: part_of <taxon id>, right? Or are you saying that we should keep both types of bridge files (in addition to the SSSOM)?

dosumis · 2021-09-06T15:05:07Z

We should keep the current bridge file or move to generating an identical one using a template (ROBOT most suitable, I think). The SSSOM file, for sharing simple mappings, could be generated from the same template.

dosumis · 2021-09-06T15:06:54Z

Would be good to have some generic tooling for this. Maybe a standard ROBOT template + some SSSOM-py code to generate mapping files from this?

I don't think the SSSOM file is a massively high priority for FBbt - but no harm in having it and loading this into OxO.

matentzn added the discussion and priority labels Jun 9, 2021

This was referenced Aug 20, 2021

mapping terms with different syntax for values #52 (Open)

best practice for mapping two measurement concepts that take different units #56 (Closed)

matentzn added the workshop label Aug 20, 2021

matentzn mentioned this issue Sep 1, 2021

Representing data model elements and their values in SSSOM #43 (Open)

gouttegd mentioned this issue Sep 6, 2021

Bi-directionality of FBbt-CL mappings obophenotype/cell-ontology#1199 (Closed)

matentzn mentioned this issue Nov 26, 2021

Mapping involving post-coordinated subjects or objects #108 (Open)

gouttegd mentioned this issue Nov 29, 2021

Provide a framework to automatically manage a “mappings” component INCATools/ontology-development-kit#500 (Closed)

gouttegd mentioned this issue May 18, 2022

New properties for cross-species equivalents information-artifact-ontology/ontology-metadata#107 (Closed)
