Bring metadata in rs.tdwg.org up to date with the "Normative Document" #252
Comments
Regarding the technical difference involving However, the TDWG Executive Decision http://rs.tdwg.org/decisions/decision-2014-11-06_17 deprecated the Dublin Core rights term in favor of the Dublin Core license term in 2014. So from the standpoint of Darwin Core, A secondary consequence of this is that |
Regarding the technical difference involving The current description in the Quick Reference Guide (generated from the Normative Document) provides the following guidance for using The other somewhat problematic issue is that Audubon Core recommends the following for The problem here is that historically, the practice for using several of the The rs.tdwg.org repo metadata currently shows I'm not sure how this should be fixed. How often is a value for |
The technical issue involving In this case, It seems clear to me that this should be changed so that The rs.tdwg.org repo metadata currently has We should consider a course of action here from the standpoint of stability and not "breaking" applications. How often is |
Regarding point 4 I agree that it would be better to remove it and adapt the build script (which we might have to do anyway for #251). It would be best to record this as a separate issue. |
If the build script is to be re-done and the term_versions.csv file is to continue to be used for the time being to build the QRG, then If you want to see what's in the http://rs.tdwg.org/dwc/terms/attributes/ namespace now, you can browse to its URI. You'll see that they are not only terms used in Darwin Core. There are now a bunch of other "made up" terms that we use in Audubon Core to indicate term properties like whether they are repeatable or not, plus new TDWG-wide terms like |
As of the 2020-04-09 snapshot of GBIF, the language term is filled in for 216998286 (15.4% of) Occurrence records, and of those, 138804401 (64%) are two-letter language codes and 1245724 (0.6%) are three-letter language codes. The rest are fully spelled out names of languages or garbage that is not supposed to be in that field. There appear to be no IRIs in that field, and I cannot find any authority that gives IRIs for languages, though I don't have access to the actual standards ISO 639-2 or ISO 639-3. The documentation for dcterms:language even allows for a string literal if it is a language tag. In any case, the intention for every Dublin Core borrowed term was that it be the one for the string literal, and only late in the game was it realized that dcterms: was not it. They really all should be dc:, with corresponding dcterms: versions for the DwC IRI section. |
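For anyone who wants to repeat this kind of tally on their own data, here is a minimal sketch of how the values could be bucketed. It is not the query that produced the GBIF numbers above; the function names `classify_language` and `summarize` are made up for illustration, and the check is by shape only (two or three ASCII letters), without validating against the actual ISO 639 code lists. Language tags with subtags such as "en-US" fall into "other" here.

```python
import re
from collections import Counter

# Shape-only patterns: these do NOT check that the code is a real ISO 639 code.
TWO_LETTER = re.compile(r"[A-Za-z]{2}")
THREE_LETTER = re.compile(r"[A-Za-z]{3}")


def classify_language(value: str) -> str:
    """Bucket a raw language value by its shape (hypothetical helper)."""
    v = value.strip()
    if not v:
        return "empty"
    if v.lower().startswith(("http://", "https://")):
        return "iri"
    if TWO_LETTER.fullmatch(v):
        return "two_letter"
    if THREE_LETTER.fullmatch(v):
        return "three_letter"
    return "other"


def summarize(values):
    """Return {bucket: (count, percent of all values)} for an iterable of strings."""
    counts = Counter(classify_language(v) for v in values)
    total = sum(counts.values())
    return {k: (n, round(100 * n / total, 1)) for k, n in counts.items()}
```

A stricter version would compare against the published code tables rather than the pattern alone, since a three-letter string is not necessarily a valid code.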
One of the issues with the various Ultimately, it would be good to have some kind of JSON or JSON-LD file that related those IRIs to the two letter codes so that the string values could be mapped to the IRIs. That hasn't happened in AC because nobody yet has indicated that they cared enough for us to expend the energy to do it. But for now, just having a vocabulary to recommend is probably good enough. |
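To make the JSON mapping idea concrete, here is a rough sketch of what a lookup could look like. No such file exists yet, so everything here is an assumption: the JSON layout, the function names `load_mapping` and `language_to_iri`, and especially the IRIs, which are `example.org` placeholders and not IRIs from any real vocabulary.

```python
import json

# Hypothetical mapping file content; the IRIs are placeholders, not real identifiers.
MAPPING_JSON = """
{
  "en": "http://example.org/language/en",
  "is": "http://example.org/language/is"
}
"""


def load_mapping(text: str) -> dict:
    """Parse the JSON text into a dict keyed by lower-cased two-letter code."""
    return {code.lower(): iri for code, iri in json.loads(text).items()}


def language_to_iri(value: str, mapping: dict):
    """Return the IRI for a two-letter code, or None if the value is not in the mapping."""
    return mapping.get(value.strip().lower())
```

Values that are spelled-out language names would return `None` with this approach and would need separate cleanup before they could be mapped.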
As of the 2020-04-09 snapshot of GBIF, the type term is filled in for 228614731 (16.3% of) Occurrence records, and of those, 118583281 (51.9%) are legal unqualified DCMI term names and 1372635 (0.6%) are valid DCMI IRIs. The rest are labels with various kinds of orthography and various languages or garbage. So, again the better choice based on usage is the original intention, which is dc:type. |
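The same kind of bucketing can be sketched for the type values. This is an illustration only, not the query used for the numbers above; `classify_type` is a made-up name. It assumes that "legal unqualified DCMI term name" means an exact, case-sensitive match against the twelve DCMI Type Vocabulary class names, and that a "valid DCMI IRI" is one of those names in the `http://purl.org/dc/dcmitype/` namespace.

```python
# The twelve classes of the DCMI Type Vocabulary.
DCMI_TYPE_NAMES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}
DCMI_TYPE_NS = "http://purl.org/dc/dcmitype/"


def classify_type(value: str) -> str:
    """Bucket a raw type value as 'term_name', 'iri', or 'other' (hypothetical helper)."""
    v = value.strip()
    if v in DCMI_TYPE_NAMES:
        return "term_name"
    # An IRI counts only if the local name after the namespace is a known class.
    if v.startswith(DCMI_TYPE_NS) and v[len(DCMI_TYPE_NS):] in DCMI_TYPE_NAMES:
        return "iri"
    return "other"
```

Because the match is case-sensitive, a value like "stillimage" lands in "other", which corresponds to the "labels with various kinds of orthography" group.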
I just noticed an open pull request that brings up the same concern as Item 4 above: #172 . It can be deleted if item 4 is resolved here. |
Closed pull request #172 without merge. Let's resolve item 4 here. I think we have a consensus that UseWithIRI should go under tdwgutility management and be removed from term_versions.csv. However, I think we also have a consensus that term_versions.csv will no longer be necessary once the full scripting pathway is functional. |
Agreed on last sentence in John's comment. |
With regard to the comments on the use of language codes, I recently had the example of a DwC file including vernacular names in Icelandic. I recommended to the authors that they tag the language of the whole resource as Icelandic, so that it was clear what language the vernacular names were in. Having said that, everything else in the resource and its metadata are either in English or Latin. |
Moved checkbox 4 to a separate issue at #266 so that this issue could be closed. |
All checkboxes ticked |
As of 2020-07-09, the metadata in the master branch of rs.tdwg.org differs from the Normative Document in the following ways:
1. flags column found in the Normative Document (also discussed in a separate issue). If the 2018-09-06 versions are real, then two versions should be generated. If not, then only the 2017-10-06 version should be generated.
2. http://dublincore.org/usage/terms/history/#language-007, http://dublincore.org/usage/terms/history/#type-006, and http://dublincore.org/usage/terms/history/#rightsT-001. See this replacement table for details. It is not clear whether this difference would have any effect on the generation of the Quick Reference Guide.
3. recommended term having version http://rs.tdwg.org/dwc/terms/version/accordingTo-2009-01-21. That version is missing completely from the Normative Document. That term was a legacy property that had subproperties of the form dwc:xAccordingTo. When semantics were removed from the basic DwC "bag of terms", this term became irrelevant. I don't think it was ever actually used for anything. Nevertheless, I believe that it should be retained in the historical record, but have its status changed from recommended to deprecated.
4. http://rs.tdwg.org/dwc/terms/attributes/UseWithIRI is used in the "Normative Document" to categorize terms that should be grouped together in the UseWithIRI section. However, that class itself is NOT part of Darwin Core (and doesn't appear on the Quick Reference Guide itself). The tdwgutility:UseWithIRI term and all the other ones in that namespace were moved out of Darwin Core so they could be managed more nimbly as needed. Can we remove http://rs.tdwg.org/dwc/terms/attributes/UseWithIRI-2017-10-06 as a row in the "Normative Document" without breaking anything? It doesn't belong there.