Semantic Metadata API can break when labels are changed (e.g. Contact -> Point of Contact) #8590

Closed
pdurbin opened this issue Apr 8, 2022 · 0 comments · Fixed by #8592

pdurbin commented Apr 8, 2022

When we merged pull request #8454, which changed a variety of metadata field labels in the citation block (e.g. "Contact" became "Point of Contact"), we learned that it introduced a backward-incompatible change to the Semantic Metadata API.

Specifically, testSemanticMetadataAPIs in DatasetsIT started failing.

The goal is for label changes to not affect the API. That is, pull requests against the citation block or other blocks where you are only changing the human-readable label ("Contact" vs. "Point of Contact") should not result in a breaking change to the API.

@qqmyers has provided a nice write-up of the situation at #8533 (comment) and discussed all this during a recent tech hours session.

I don't want to speak for everyone, but I'm OK with a breaking change to the Semantic Metadata API in order to make it more tolerant of future label changes. One of the options we discussed is switching from the human-readable label (Contact) to the machine-readable name (datasetContact). That way, if "Contact" becomes "Point of Contact" in the future, the API continues to work the same way. So when creating a dataset, we'd change part of the JSON, perhaps like below.

old/current (human-readable)

  "https://dataverse.org/schema/citation/Contact": {
    "https://dataverse.org/schema/citation/datasetContact#E-mail": "finch@mailinator.com",
    "https://dataverse.org/schema/citation/datasetContact#Name": "Finch, Fiona"
  }

new/proposed (machine-readable)

  "https://dataverse.org/schema/citation/datasetContact": {
    "https://dataverse.org/schema/citation/datasetContactEmail": "finch@mailinator.com",
    "https://dataverse.org/schema/citation/datasetContactName": "Finch, Fiona"
  }
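
To illustrate why keying on the machine-readable name would be more stable, here is a minimal Python sketch. The `FieldDef` class and the two `key_by_*` helpers are hypothetical and not part of Dataverse; only the base URI, the field name, and the two labels come from the examples above. The label-based helper is a simplification of the current behaviour, not a description of the actual implementation.

```python
from dataclasses import dataclass

# Base URI taken from the JSON examples above.
CITATION_BASE = "https://dataverse.org/schema/citation/"


@dataclass(frozen=True)
class FieldDef:
    """Hypothetical stand-in for a metadata field definition."""
    name: str   # machine-readable name, e.g. "datasetContact"
    label: str  # human-readable label, e.g. "Contact"


def key_by_label(field: FieldDef) -> str:
    """Simplified old behaviour: the key follows the display label."""
    return CITATION_BASE + field.label


def key_by_name(field: FieldDef) -> str:
    """Proposed behaviour: the key follows the machine-readable name."""
    return CITATION_BASE + field.name


before = FieldDef(name="datasetContact", label="Contact")
after = FieldDef(name="datasetContact", label="Point of Contact")

# The label-based key changes when the label is edited...
print(key_by_label(before) == key_by_label(after))  # False
# ...while the name-based key stays the same.
print(key_by_name(before) == key_by_name(after))    # True
```

With name-based keys, a pull request that only edits a label leaves the JSON a client sends unchanged.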

These machine-readable names come from citation.tsv:

cat citation.tsv | cut -f2 | grep -i contact
datasetContact
datasetContactName
datasetContactAffiliation
datasetContactEmail
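
For anyone who wants to do the same lookup from a script, the following Python sketch does roughly what the `cut | grep` pipeline above does. The sample rows are made up for illustration: only the four contact names come from the output above, and a real citation.tsv has many more columns and rows. The `names_matching` helper is a hypothetical name.

```python
import csv
import io

# Made-up sample rows; only the contact names in the second column come
# from the output shown above. "title" is included as a non-matching row.
SAMPLE_TSV = (
    "\tdatasetContact\n"
    "\tdatasetContactName\n"
    "\tdatasetContactAffiliation\n"
    "\tdatasetContactEmail\n"
    "\ttitle\n"
)


def names_matching(tsv_file, needle):
    """Return second-column values containing needle, case-insensitively."""
    reader = csv.reader(tsv_file, delimiter="\t")
    return [
        row[1]
        for row in reader
        if len(row) > 1 and needle.lower() in row[1].lower()
    ]


contact_names = names_matching(io.StringIO(SAMPLE_TSV), "contact")
print(contact_names)
```

To run it against a real file, pass `open("citation.tsv", newline="", encoding="utf-8")` instead of the `io.StringIO` object.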

These are just ideas. Mostly I'm just trying to capture the problem. I'm not trying to specify the exact solution.

One more thing I'd be remiss not to mention: at some point we should decide that it's time for the Semantic Metadata API to graduate from the Developer Guide to the API Guide. No rush on this. Whenever we're ready.
