Remove unchanged 2018-09-06 vs 2017-10-06 terms #262

Closed
peterdesmet opened this issue Jul 23, 2020 · 8 comments

@peterdesmet
Member

Originally brought up by @baskaufs in #252 (comment):

  1. A number of versions in the normative document are issued on 2018-09-06 and 2017-10-06. The versions issued on 2017-10-06 have examples moved from comments into a separate column. I can't see a difference between the 2018-09-06 and 2017-10-06 versions. The DwC branch of rs.tdwg.org has a hand-edited CSV that moves the examples to the comments. This branch has been used to generate a 2017-10-06 version of the DwC vocabulary as reflected in the draft list of terms document. The draft 2017-10-06 version does not (as of 2020-07-09) include the changes to term labels (discussed in a separate issue) nor include the flags column found in the Normative Document (also discussed in a separate issue). If the 2018-09-06 versions are real, then two versions should be generated. If not, then only the 2017-10-06 version should be generated.
peterdesmet self-assigned this Jul 23, 2020
@peterdesmet
Member Author

From email conversation:

The 191 "2018-09-06" terms were introduced by John in this commit. That commit seems to be a full overhaul of the structure of the term_versions.csv document. But since the overhaul also included minor corrections, I think John (to be safe) opted to just release new versions of all terms, even when those terms did not change.

A quick check shows that at least 160 terms are true duplicates (in terms of "label", "definition", "examples" and "comments"). I'd be fine to investigate further and revert the unchanged terms to their 2017-10-06 status.

@peterdesmet
Member Author

peterdesmet commented Jul 23, 2020

There are 378 rows in the 2017-10-06 and 2018-09-06 versions combined.

  1. 320 rows (160 terms) are full duplicates based on label, definition, comments, examples, organized_in, issued, rdf_type, term_iri, abcd_equivalence, flags. I think for those we can safely remove the 2018 version and reinstate the 2017 as recommended term. @baskaufs @tucotuco agreed?

  2. I will look into the remaining 58 rows. Since the new build script 1 will consider any change (including e.g. adding spaces to an example c288eee) a version change, I guess most of these can be considered true 2018 changes. @tucotuco @baskaufs I have a question regarding flag though: some 2018 terms now have "extension" as a value in flag, while the 2017 version is empty for that column. Are we keeping that column and is this addition a change that requires a version change?
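The duplicate check described above can be sketched as follows. This is only a minimal illustration, not the script that was actually used: the `version` column name, the `<term>-<YYYY-MM-DD>` layout of its values, the subset of compared columns, and the `termA`/`termB`/`termC` sample rows are all assumptions made up for the example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical subset of the columns named in the comment above.
COMPARE_COLUMNS = ["label", "definition", "comments", "examples", "flags"]


def find_unchanged_terms(rows, old_date, new_date, columns=COMPARE_COLUMNS):
    """Return terms whose old and new versions match on every compared column."""
    by_term = defaultdict(dict)
    for row in rows:
        # Assumed layout: "<term>-<YYYY-MM-DD>", so the date is the last 10 characters.
        version = row["version"]
        term, date = version[:-11], version[-10:]
        if date in (old_date, new_date):
            by_term[term][date] = tuple(row[c] for c in columns)
    return sorted(
        term
        for term, versions in by_term.items()
        if old_date in versions
        and new_date in versions
        and versions[old_date] == versions[new_date]
    )


# Invented sample rows: termA is unchanged, termB differs only in spacing
# inside the example, termC gains a flag.
SAMPLE = """version,label,definition,comments,examples,flags
termA-2017-10-06,Term A,Definition A,,a,
termA-2018-09-06,Term A,Definition A,,a,
termB-2017-10-06,Term B,Definition B,,b1|b2,
termB-2018-09-06,Term B,Definition B,,b1 | b2,
termC-2017-10-06,Term C,Definition C,,,
termC-2018-09-06,Term C,Definition C,,,extension
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
unchanged = find_unchanged_terms(rows, "2017-10-06", "2018-09-06")
print(unchanged)  # ['termA']
```

The comparison is an exact string match, so a spacing change inside an example (termB) counts as a difference, in line with how the build script is described above.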

@baskaufs

  1. I think it should be OK to delete the full duplicates. Once the 2017 versions move through the pipeline, we can check the QRG and make sure that it still renders as designed. That should be a good quality check.
  2. I think that any change that includes space changes within quoted items should definitely be considered true changes. Outside of quotes probably doesn't matter (although you are correct that the script would consider them changes).

As far as flag is concerned, I think that whenever the flag column is added, that should be considered a version change, since we are calling it a new term metadata property (tdwgutility:usageScope). I believe that values were added retroactively to all older versions of current versions that have the property. I kind of think that's not right and that it should be added as a property only at the point at which we started using it. I'm not 100% sure about that though, because I think the idea of a term being categorized as part of simple DwC or not predates the metadata term itself and so the older versions had a value whether or not we said so explicitly. As a practical matter, it probably doesn't matter and it would be easier just to start applying it to the 2017 terms and carry it forward from there.

@tucotuco
Member

The concept of the flag indeed dates back to the ratification of the standard. The implementation of the attribute was to facilitate the build script that read term_versions.csv to make the Quick Reference Guide.

  1. Yes, let's delete full duplicates.
  2. To be consistent from the outset, a change in flags should constitute a version change too. It's not a good idea to go changing legacy terms to add the flags, even though it is clear what they should have been if the flags attribute existed.

peterdesmet added a commit that referenced this issue Jul 24, 2020
…licates)

See #262:

- Remove 160 terms from 2018-09-06 that are unchanged duplicates of the 2017-10-06 version
- Reinstate "recommended" status for the 160 2017-10-06 terms 45 + 47 + 36 + 1 + 31

Note that the reinstated recommended terms are NOT YET SORTED to the top
@peterdesmet
Member Author

Here is what changed in the remaining 31 terms in the 2018-09-06 version that are not true duplicates. I think those are all valid changes that merit a version change (and can thus remain). @tucotuco @baskaufs agree?

| term | replaces | what changed |
| --- | --- | --- |
| Event-2018-09-06 | Event-2014-10-23 | adds flag extension |
| footprintSRS-2018-09-06 | footprintSRS-2017-10-06 | change in spaces in example |
| FossilSpecimen-2018-09-06 | FossilSpecimen-2017-10-06 | adds flag extension and example |
| GeologicalContext-2018-09-06 | GeologicalContext-2017-10-06 | adds flag extension |
| HumanObservation-2018-09-06 | HumanObservation-2017-10-06 | adds flag extension |
| Identification-2018-09-06 | Identification-2014-10-23 | adds flag extension |
| LivingSpecimen-2018-09-06 | LivingSpecimen-2017-10-06 | adds flag extension and example |
| MachineObservation-2018-09-06 | MachineObservation-2017-10-06 | adds flag extension and example |
| MaterialSample-2018-09-06 | MaterialSample-2017-10-06 | adds flag extension |
| measurementAccuracy-2018-09-06 | measurementAccuracy-2017-10-06 | adds flag extension |
| measurementDeterminedBy-2018-09-06 | measurementDeterminedBy-2017-10-06 | adds flag extension |
| measurementDeterminedDate-2018-09-06 | measurementDeterminedDate-2017-10-06 | adds flag extension |
| measurementID-2018-09-06 | measurementID-2017-10-06 | adds flag extension |
| measurementMethod-2018-09-06 | measurementMethod-2017-10-06 | adds flag extension |
| MeasurementOrFact-2018-09-06 | MeasurementOrFact-2017-10-06 | adds flag extension |
| measurementRemarks-2018-09-06 | measurementRemarks-2017-10-06 | adds flag extension |
| measurementType-2018-09-06 | measurementType-2017-10-06 | adds flag extension |
| measurementUnit-2018-09-06 | measurementUnit-2017-10-06 | adds flag extension |
| measurementValue-2018-09-06 | measurementValue-2017-10-06 | adds flag extension |
| Occurrence-2018-09-06 | Occurrence-2014-10-23 | adds flag extension |
| Organism-2018-09-06 | Organism-2014-10-23 | adds flag extension |
| PreservedSpecimen-2018-09-06 | PreservedSpecimen-2017-10-06 | adds flag extension and example |
| relatedResourceID-2018-09-06 | relatedResourceID-2017-10-06 | adds flag extension |
| relationshipAccordingTo-2018-09-06 | relationshipAccordingTo-2017-10-06 | adds flag extension |
| relationshipEstablishedDate-2018-09-06 | relationshipEstablishedDate-2017-10-06 | adds flag extension |
| relationshipOfResource-2018-09-06 | relationshipOfResource-2017-10-06 | adds flag extension |
| relationshipRemarks-2018-09-06 | relationshipRemarks-2017-10-06 | adds flag extension |
| resourceID-2018-09-06 | resourceID-2017-10-06 | adds flag extension |
| ResourceRelationship-2018-09-06 | ResourceRelationship-2017-10-06 | adds flag extension |
| resourceRelationshipID-2018-09-06 | resourceRelationshipID-2017-10-06 | adds flag extension |
| Taxon-2018-09-06 | Taxon-2014-10-23 | adds flag extension |
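A "what changed" column like the one above could be derived roughly as follows. This is a hedged sketch rather than the method actually used: `describe_changes` is a hypothetical helper, and the two rows are invented stand-ins for a version and the version it replaces. It only reports whether a difference is spacing-only or a content change; per the discussion above, both still count as a version change.

```python
def describe_changes(old, new):
    """Map each differing column to 'whitespace only' or 'content'."""
    changes = {}
    # Only columns present in both rows are compared; sorted for stable output.
    for column in sorted(old.keys() & new.keys()):
        before, after = old[column], new[column]
        if before == after:
            continue
        # Strip all whitespace to tell spacing edits apart from content edits.
        same_ignoring_space = "".join(before.split()) == "".join(after.split())
        changes[column] = "whitespace only" if same_ignoring_space else "content"
    return changes


# Invented rows: the example gains spaces and the flag is newly filled in.
old_row = {"label": "Term X", "examples": "a|b", "flags": ""}
new_row = {"label": "Term X", "examples": "a | b", "flags": "extension"}

report = describe_changes(old_row, new_row)
print(report)  # {'examples': 'whitespace only', 'flags': 'content'}
```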

@baskaufs

Yes, I agree.

@tucotuco
Member

tucotuco commented Jul 24, 2020 via email

@peterdesmet
Member Author

We'll be able to close this issue if you agree with PR #263
