Remove inverses unless there's a compelling reason #506

Closed
rjyounes opened this issue Jul 8, 2021 · 25 comments · Fixed by #813
Labels: impact: minor (new, backward-compatible functionality; does not change inferences; e.g., adding a term), topic: design principles, topic: inverses

Comments

@rjyounes
Collaborator

rjyounes commented Jul 8, 2021

As per the decision in #294, we should purge gist of inverse properties that don't have a compelling reason to exist. In general, the guidance is that if there's a tree-like structure, keep the property that goes up rather than down the tree, since cardinality generally will be greater going downward. If there's no tree-like structure, it's dependent on individual context.

This will be a minor change if we deprecate the inverses.

[Added]
From Issue #551
precedesDirectly has inverse followsDirectly, but precedes has no inverse follows. Should be consistent one way or another.

@rjyounes
Collaborator Author

We currently define 17 inverses.

We have to decide which ones to keep. @uscholdm and @mdfeickert will take a pass at a proposal and present.

Rebecca: use directionality guidelines in style guide.

Peter: If we don't provide inverses, people will be rolling their own. We should put recommendations in place for when it's appropriate.

Peter: could also have an optional import of the inverses.

Michael: or add an annotation with suggestions of how to name the inverses.

Rebecca: Why do we care what other people name properties in their own namespaces?

Rebecca: We don't need both the annotation and the import. The latter seems better.

Borislav: Remove all inverses from gist and put in a separate file.

Michael: There are some inverses I use all the time that I would like to keep, but I wouldn't want to use the whole file. So each person would have to pick and choose the ones they want. This is inconvenient, and argues for keeping the commonly used ones in gistCore.

Michael: May also propose bringing a few back.

@rjyounes
Collaborator Author

Consider in conjunction with #551.

@JonathonGist

Glad for these decisions--especially that all of them are to be available in an imports file. And I'm glad that the special ones, used often, are being preserved in gistCore.

@rjyounes
Collaborator Author

rjyounes commented Jul 20, 2022

I've come across another compelling reason to remove all inverses from gistCore: when running inferencing on data with a heavy use of predicates with inverses, memory can be overloaded. Therefore keeping "special" ones - whatever that might mean - is still problematic.

This isn't incompatible with adding the inverses to a separate file, but I'd like to ask what the motivation would be. People can use inverse paths in SPARQL and SHACL. You can use owl:inverseOf to state restrictions. I don't see the need for any inverses in any file.
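
As a rough illustration of the inverse-path point, the sketch below stores only one direction (`gist:isPartOf`) and answers the "has part" question by reading the same triples backwards, which is what the SPARQL property path `^gist:isPartOf` expresses. The sample data and the helper names `objects_of` and `subjects_of` are invented for illustration; this is plain Python, not a triple-store API.

```python
# Hypothetical sample data: only the child-to-parent direction is stored.
TRIPLES = {
    ("ex:engine", "gist:isPartOf", "ex:car"),
    ("ex:wheel", "gist:isPartOf", "ex:car"),
    ("ex:piston", "gist:isPartOf", "ex:engine"),
}


def objects_of(subject, predicate, triples=TRIPLES):
    """Forward traversal: like `subject predicate ?o` in SPARQL."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def subjects_of(obj, predicate, triples=TRIPLES):
    """Inverse traversal: like `obj ^predicate ?s` in SPARQL."""
    return sorted(s for s, p, o in triples if o == obj and p == predicate)


# "What is the engine part of?" uses the stored direction.
whole = objects_of("ex:engine", "gist:isPartOf")
# "What parts does the car have?" needs no hasPart triples at all.
parts = subjects_of("ex:car", "gist:isPartOf")
```

Both questions are answered from the same three triples, so no second predicate has to be asserted or inferred.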

@JonathonGist

It seems that with the availability of inverse paths in SPARQL and SHACL, there is limited or no need for adding inverse predicates to an ontology. My argument for keeping them in a separate file has evaporated.

@rjyounes rjyounes assigned marksem and unassigned uscholdm and mdfeickert Sep 8, 2022
@rjyounes
Collaborator Author

rjyounes commented Sep 8, 2022

@marksem will provide a candidate list of removals and run it by the group. Probably will not make it into September release, but keeping open the possibility for now.

@uscholdm
Contributor

uscholdm commented Sep 8, 2022

The inverse issue: what is the rationale for choosing which direction to keep? I wrote this up in a blog post and in my book. See the whitepaper "Quantum Entanglement, Flipping Out and Inverse Properties" (semanticarts.com) and look for "preferred perspective".
In this whitepaper, we take a deep dive into the pragmatic issues regarding the use of inverse properties when creating OWL ontologies.

@rjyounes
Collaborator Author

rjyounes commented Sep 8, 2022

@marksem Please note that issue #551 should be considered in tandem. I've re-assigned that one to you as well.

@rjyounes
Collaborator Author

rjyounes commented Oct 13, 2022

This is a high priority on one of my projects, where inferencing over the has(Direct)Part / is(Direct)PartOf inverse pairs is causing memory blowout, and we have had to resort to defining our own predicates without the inverses. This is not a desirable solution.

Specifically, of gist predicates that have inverses, the most commonly used in this client's data are (with counts after inferencing):

gist:isPartOf           5,860,775
gist:isDirectPartOf     5,860,775

gist:hasDirectPart      5,860,775
gist:hasPart            5,860,775

gist:isBasedOn          2,933,883
gist:isBasisFor         2,933,883

gist:isDescribedIn      1,568,573
gist:isAbout            1,568,573
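
Matching counts like these are what one would expect when inverse triples are materialized: each asserted triple gains a mirror triple. The toy sketch below shows the doubling on invented data. The `materialize_inverses` function is a hand-rolled illustration and makes no claim about how any particular reasoner works internally.

```python
# Inverse pairs named in this thread; the data below is invented.
INVERSE_OF = {
    "gist:isDirectPartOf": "gist:hasDirectPart",
    "gist:isBasedOn": "gist:isBasisFor",
}


def materialize_inverses(triples, inverse_of=INVERSE_OF):
    """Return the input plus one mirrored triple per inverse-bearing triple."""
    # Make the mapping symmetric so either direction triggers the other.
    both_ways = dict(inverse_of)
    both_ways.update({v: k for k, v in inverse_of.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in both_ways:
            result.add((o, both_ways[p], s))
    return result


asserted = {
    ("ex:piston", "gist:isDirectPartOf", "ex:engine"),
    ("ex:engine", "gist:isDirectPartOf", "ex:car"),
    ("ex:order42", "gist:isBasedOn", "ex:orderTemplate"),
}
inferred = materialize_inverses(asserted)
```

With three asserted triples the materialized set holds six, which is the same pairing pattern as in the counts above.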

Related: we should remove gist:isAspectOf and replace it with the logical inverse gist:hasAspect. For querying, you generally want to know the aspects of a particular thing, not all the things that have a particular aspect. E.g., you would rarely want to ask which things have length, vs which aspects does this thing have.

@Jamie-SA
Contributor

Jamie-SA commented Oct 13, 2022

Choosing gist:hasAspect, and other properties with similar directionality (gist:isCategorizedBy & gist:isCharacterizedBy), I think also aligns with the reduced cardinality preference.

Similarly, I think that would suggest use of gist:isPartOf and gist:isDirectPartOf over their inverses.

Now, if you really want to reduce triple explosion, get rid of those transitive/non-transitive pairs. Mostly kidding. But I do wonder how important others find them to be in their work. (warning... completely separate issue)
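
To put a rough number on the transitive/non-transitive remark, here is a small sketch that assumes a direct (non-transitive) property whose triples also feed a transitive one. The chain of parts is hypothetical and `transitive_closure` is a deliberately naive implementation written for this example.

```python
def transitive_closure(direct_edges):
    """Naive closure: keep joining pairs until nothing new appears."""
    closure = set(direct_edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# A hypothetical chain of 6 nested parts: p0 is part of p1, ..., p4 of p5.
direct = {(f"ex:p{i}", f"ex:p{i + 1}") for i in range(5)}
closed = transitive_closure(direct)
```

For a chain of n items there are n - 1 direct links but n(n - 1)/2 pairs in the closure, so the transitive property grows quadratically with nesting depth while the direct one grows linearly.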

@rjyounes
Collaborator Author

rjyounes commented Oct 13, 2022

Proposal:

  • Select one of a pair of inverses to remain in gistCore, the other to be moved to a separate file gistInverses.ttl that can optionally be imported by users.
  • General consideration for which predicate to keep in gistCore is cardinality, for query performance. For example, we generally expect fewer relationships in the isBasedOn direction than isBasisFor, or isPartOf than hasPart (a template is the basis for many instantiations, but the instantiation is typically based on a single template; an engine is part of only one car, but a car has many parts).
  • Some cases will not be clear, such as isAbout and isDescribedIn. Choose your favorite.

@rjyounes
Collaborator Author

Another motivation for removing inverses: I'm seeing clients generating RDF sometimes using one inverse, sometimes another. You could say this is just bad practice, but the existence of the inverse sets us up for that practice. Then if inferencing is not being run, you have to know which predicate to query for in which cases.

@rjyounes
Collaborator Author

rjyounes commented Oct 18, 2022

Addition to above proposal: In some cases the way we think about and use concepts is in the opposite direction. isIdentifiedBy is a good example: We generally ask "What is X's SSN?" rather than "Whose SSN is this?" So we should keep isIdentifiedBy in gistCore.

So my revised proposal is:

  • Select one of a pair of inverses to remain in gistCore.ttl, the other to be moved to a separate file gistInverses.ttl that can optionally be imported by users.
  • A general consideration for which predicate to keep in gistCore is cardinality, for query performance. For example, we generally expect fewer relationships in the isBasedOn direction than isBasisFor, or isPartOf than hasPart (a template is the basis for many instantiations, but the instantiation is typically based on a single template; an engine is part of only one car, but a car has many parts).
  • However, there are cases where the way we think of and use the predicates is overwhelmingly in the opposite direction. For example, we generally ask "What is X's SSN?" rather than "Whose SSN is this?" So isIdentifiedBy should remain in gistCore.
  • Some cases will not be clear according to either guideline, such as isAbout and isDescribedIn. Choose your favorite.
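
The cardinality guideline in this proposal could be checked against real data along the following lines. This is only a sketch: the function names `mean_fan_out` and `lower_cardinality_direction` and the sample pairs are assumptions made up for the example, and the usage-based exception (as with isIdentifiedBy) would still need human judgment.

```python
from collections import defaultdict


def mean_fan_out(pairs):
    """Average number of distinct objects per subject for (subject, object) pairs."""
    targets = defaultdict(set)
    for s, o in pairs:
        targets[s].add(o)
    return sum(len(v) for v in targets.values()) / len(targets)


def lower_cardinality_direction(pairs, forward_name, inverse_name):
    """Pick the property name whose direction has the smaller mean fan-out."""
    forward = mean_fan_out(pairs)
    backward = mean_fan_out([(o, s) for s, o in pairs])
    return forward_name if forward <= backward else inverse_name


# Invented data: three parts belong to one car (part -> whole).
part_of = [("ex:engine", "ex:car"), ("ex:wheel", "ex:car"), ("ex:door", "ex:car")]
choice = lower_cardinality_direction(part_of, "gist:isPartOf", "gist:hasPart")
```

Here each part points at one whole while the whole would point at three parts, so the guideline favors the part-to-whole direction. The sketch assumes a non-empty list of pairs.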

@marksem
Collaborator

marksem commented Oct 24, 2022

I like the idea of no inverses to avoid confusion. I also applaud the pragmatism of letting overwhelming use modify the general consideration.

@Jamie-SA
Contributor

@philblackwood offered to take a look at this.

@philblackwood
Contributor

There are many good reasons to eliminate inverses; see Michael Uschold's document for a good discussion. His comment above is from Sep 8, and the link is:
https://www.semanticarts.com/wp-content/uploads/2018/10/QuantumEntanglementInverseProps041516MFUnewtemplate.pdf

Attached is a straw proposal for which ones to keep. The three main criteria are:

  1. Ease of understanding, e.g. hasSuperCategory is contrary to common usage (subClass, subTask, sub...)
  2. Does the property usually describe the subject? (e.g. isIdentifiedBy)
  3. Is it usually many-to-one, or functional vs. inverse functional? (isPartOf)

In the attached, each property on the "keep" list has highlights to show which of these 3 criteria is met in most cases of expected usage. Some of these might be debatable, so let's pool our experiences to refine the attached view.

inverses.xlsx

@rjyounes
Collaborator Author

rjyounes commented Jan 31, 2023

@philblackwood We have defined a rule of thumb that the retained properties should go from "child" to "parent" for query performance and also to express the dependency relationship. These both contradict your first conclusion: (1) a subcategory generally has only one supercategory (except in polyhierarchies, which we don't often use), while a supercategory may have many subcategories. (2) The subcategory is dependent on the supercategory but not the reverse. Also, in this particular case I don't see any issue of ease of understanding. hasSubTask should be reversed to hasSuperTask, in fact, by the reasoning I've given. Though independently I've suggested we don't need it since we can use isPartOf; see issue #733.

It might be useful to look at the proposals and discussion above.

@rjyounes
Collaborator Author

rjyounes commented Feb 9, 2023

Result of gist dev team meeting discussion:

Problems with inverses:

  • They increase cognitive load during ontology development, querying (you have to query with both predicates), and ETL.
  • They blow up inferencing engines like RDFox and Stardog.

Agreed on:

  • Remove all inverses from gistCore.
  • For some of the inverses in Phil's spreadsheet, agreed on which inverse to keep.
  • Define as best practice, to include in documentation, a white paper, etc.

Still open:

  • Provide a file of the inverses removed from gistCore? Some argue no, on the grounds that it doesn't make sense to document a best practice and at the same time provide a means to avoid it.
  • For some predicates, still need to agree on which to keep. Some argue that providing a separate file of the removed inverses would make it easier to agree on.

Attached is the in-progress spreadsheet, with column C indicating decisions that have been made.

inverses.xlsx

@rivettp

rivettp commented Feb 9, 2023

Firstly, I agree with the decision.
But I'm wondering if you considered adding an annotation property to retain the name to be used (e.g. in user interfaces, or SHACL shapes) for the inverse direction. I recall that a few other ontologies have done this (but cannot recall which ones!).
Another potential annotation property might be the expected cardinality of the inverse direction.

@rjyounes
Collaborator Author

rjyounes commented Mar 23, 2023

Form working group to handle "still open" items from Feb 9. Spreadsheet from that meeting is the up-to-date version of what we've agreed on.

Volunteers:

  • Michael
  • Dylan (lead)
  • Rebecca

Dylan to call meeting.

@rjyounes rjyounes assigned dylan-sa and unassigned marksem Mar 23, 2023
@marksem
Collaborator

marksem commented Mar 23, 2023

^^ @dylan-sa, can you put me as optional on that meeting? I'd like to weigh in if I have time.

@dylan-sa
Contributor

dylan-sa commented Mar 28, 2023

Notes from working group meeting (3/28/2023):

  • @marksem, @rjyounes, and @uscholdm in attendance
  • I will create PR:
    • Remove all inverses (now have a list; table below)
    • Change relevant axioms that contain removed inverses
    • Add new property isSubCategoryOf; remove hasSubCategory and its inverse, hasSuperCategory (4/6 update: keep hasSuperCategory)
  • Decision on keeping inverses in a separate file: Inverses to be removed and not maintained in a separate file
  • I will create separate issue to consider whether occupiesGeographically and occupiesGeographicallyPermanently should be removed, along with their respective inverses.

Current proposal (updated after 4/6 working group meeting):

| Property Kept | Inverse Property Removed |
| --- | --- |
| containsGeographically | isGeographicallyContainedIn |
| hasDirectPart | isDirectPartOf |
| hasDirectSubTask | isDirectSubTaskOf |
| hasMember | isMemberOf |
| hasNavigationalParent | hasNavigationalChild |
| hasPart | isPartOf |
| hasSubTask | isSubTaskOf |
| hasSuperCategory | hasSubCategory |
| isAbout | isDescribedIn |
| isAffectedBy | affects |
| isBasedOn | isBasisFor |
| isGovernedBy | governs |
| isIdentifiedBy | identifies |
| isRecognizedBy | recognizes |
| occupiesGeographically | isGeographicallyOccupiedBy |
| occupiesGeographicallyPermanently | isGeographicallyPermanentlyOccupiedBy |
| precedes | follows |
| precedesDirectly | followsDirectly |
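
For data that already uses the removed properties, one possible migration is to swap subject and object and substitute the kept property. The sketch below is a minimal illustration under that assumption: the mapping copies only a few rows of the table, the `gist:` prefix and `ex:` data are placeholders, and `rewrite_removed_inverses` is a hypothetical helper, not part of any gist tooling.

```python
# Removed inverse -> kept property, copied from a few rows of the table above.
KEPT_FOR_REMOVED = {
    "gist:isPartOf": "gist:hasPart",
    "gist:isDirectPartOf": "gist:hasDirectPart",
    "gist:isMemberOf": "gist:hasMember",
    "gist:isBasisFor": "gist:isBasedOn",
    "gist:identifies": "gist:isIdentifiedBy",
    "gist:follows": "gist:precedes",
}


def rewrite_removed_inverses(triples, mapping=KEPT_FOR_REMOVED):
    """Swap subject and object wherever a removed inverse is used."""
    rewritten = set()
    for s, p, o in triples:
        if p in mapping:
            # These are object properties, so the object can become a subject.
            rewritten.add((o, mapping[p], s))
        else:
            rewritten.add((s, p, o))
    return rewritten


legacy = {
    ("ex:engine", "gist:isPartOf", "ex:car"),
    ("ex:car", "gist:hasPart", "ex:engine"),  # same fact, other direction
    ("ex:ssn123", "gist:identifies", "ex:alice"),
    ("ex:alice", "ex:label", "Alice"),
}
migrated = rewrite_removed_inverses(legacy)
```

Because the result is a set, a fact that was asserted in both directions collapses to a single triple after the rewrite.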

@rjyounes
Collaborator Author

rjyounes commented Mar 28, 2023

> Decision on keeping inverses in a separate file: Inverses to be removed and not maintained in a separate file

I was under the impression that this was still an open question. @dylan-sa , are you still abstaining?

Also to add:

  • @marksem proposed, and @rjyounes agreed, that class axioms should not be modified in a minor change. Since two of the changes in the upcoming release involve manipulating class axioms (e.g., restrictions referencing an inverse that is removed would have to be changed), we feel that this is too invasive, and we recommend that the upcoming release be 12.0.0. (This also involves a slight change to our new deprecation policy.) We also agreed that, with the inverse and the Project/Task/Event changes, we have enough significant updates for a release, and should wrap up other issues, move In Triage items to the next release, and get this release out as early in April as possible. We'll discuss at the next meeting on April 13.

@dylan-sa
Contributor

@rjyounes I'll go ahead and cast my vote in favor of complete removal so we've got a proper majority.

@rjyounes
Collaborator Author

Must be accompanied with documentation changes - see #810.

9 participants