TG2-AMENDMENT_TYPESTATUS_STANDARDIZED #286

Open
ArthurChapman opened this issue Feb 9, 2024 · 20 comments
Labels
Amendment, Conformance, CORE, OTHER, Parameterized, Test, TG2, VOCABULARY

Comments

@ArthurChapman
Collaborator

ArthurChapman commented Feb 9, 2024

| TestField | Value |
|---|---|
| GUID | b3471c65-b53e-453b-8282-abfa27bf1805 |
| Label | AMENDMENT_TYPESTATUS_STANDARDIZED |
| Description | Proposes an amendment to the value of dwc:typeStatus using the bdq:sourceAuthority. |
| TestType | Amendment |
| Darwin Core Class | dwc:Occurrence |
| Information Elements ActedUpon | dwc:typeStatus |
| Information Elements Consulted | |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:typeStatus is bdq:Empty; AMENDED the value of the first word in each \| delimited portion of dwc:typeStatus if it can be unambiguously matched to a term in the bdq:sourceAuthority; otherwise NOT_AMENDED. |
| Data Quality Dimension | Conformance |
| Term-Actions | TYPESTATUS_STANDARDIZED |
| Parameter(s) | bdq:sourceAuthority |
| Source Authority | bdq:sourceAuthority default = "GBIF TypeStatus Vocabulary" {[https://api.gbif.org/v1/vocabularies/TypeStatus]} {dwc:typeStatus vocabulary API [https://api.gbif.org/v1/vocabularies/TypeStatus/concepts]} |
| Specification Last Updated | 2024-11-11 |
| Examples | [dwc:typeStatus="Holo.": Response.status=AMENDED, Response.result=dwc:typeStatus="Holotype", Response.comment="dwc:typeStatus found in the bdq:sourceAuthority"] [dwc:typeStatus="x": Response.status=NOT_AMENDED, Response.result="", Response.comment="dwc:typeStatus not found in the bdq:sourceAuthority"] |
| Source | TG2 |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | Valuable for data quality needs related to voucher specimens in natural science collections. Almost all occurrence data will have no value in dwc:typeStatus. For reference, a vocabulary of synonyms for dwc:typeStatus can be found at https://registry.gbif.org/vocabulary/TypeStatus/concepts. |
ArthurChapman added the TG2, Amendment, OTHER, Test, VOCABULARY, NEEDS WORK, Supplementary, Conformance, and Parameterized labels on Feb 9, 2024
@ArthurChapman
Collaborator Author

@chicoreus @tucotuco I'm not sure whether the link I have (https://gbif.github.io/parsers/apidocs/org/gbif/api/vocabulary/TypeStatus.htm) points to an actual API, or whether one exists at all; hence the NEEDS WORK label.

@ymgan
Collaborator

ymgan commented Feb 22, 2024

@CecSve mentioned in #284 that GBIF is working on the typeStatus vocabulary (gbif/vocabulary#87). Flagging this here.

ArthurChapman added the Immature/Incomplete label and removed the Supplementary label on Feb 22, 2024
@ArthurChapman
Collaborator Author

Changed to Immature/Incomplete pending development of Vocabulary by GBIF

@tucotuco
Member

> Changed to Immature/Incomplete pending development of Vocabulary by GBIF

GBIF has a vocabulary; it just isn't accessible via API from the vocabulary server. Implementations don't necessarily need an API to function. In fact, depending on how they are implemented, they would be more efficient, possibly much more so, without API calls. In other words, I do not think that API access to a controlled vocabulary is a requirement for implementation, but having a controlled vocabulary is.
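To illustrate the point about working without API calls, here is a minimal sketch of matching against a vocabulary held in memory. The term list, the synonym spellings, and the names `TYPE_STATUS_SYNONYMS`, `build_lookup`, and `match_term` are all hypothetical; a real implementation would load the full controlled vocabulary from whatever local copy it maintains.

```python
# Hypothetical, hand-written subset of type status terms and synonyms.
# These entries are illustrative only and are not taken from the GBIF vocabulary.
TYPE_STATUS_SYNONYMS = {
    "Holotype": ["holotype", "holo", "holo."],
    "Lectotype": ["lectotype", "lecto", "lecto."],
    "Neotype": ["neotype", "neo", "neo."],
}


def build_lookup(synonyms):
    """Map each lower-cased variant to the set of preferred terms it may denote."""
    lookup = {}
    for preferred, variants in synonyms.items():
        for variant in variants:
            lookup.setdefault(variant.lower(), set()).add(preferred)
    return lookup


LOOKUP = build_lookup(TYPE_STATUS_SYNONYMS)


def match_term(word, lookup=LOOKUP):
    """Return the preferred term for word, or None if it is absent or ambiguous."""
    candidates = lookup.get(word.strip().lower(), set())
    if len(candidates) == 1:
        return next(iter(candidates))
    return None
```

Keeping a set of candidates per variant lets the lookup refuse a variant that maps to more than one preferred term, which is one way to read "unambiguously matched".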

@ArthurChapman
Collaborator Author

I am happy with that @tucotuco. Any comments @chicoreus?

ArthurChapman added the CORE label and removed the Immature/Incomplete label on Mar 25, 2024
@ArthurChapman
Collaborator Author

Changed to CORE and deleted some wording from Notes. Left as "NEEDS WORK" following discussion with @chicoreus on need for MEASURE test. More discussion needed.

@chicoreus
Collaborator

Not sure that this is tractable. The expectation for values in dwc:typeStatus is a pipe delimited list of {type status term of taxon name {publication}}. The definition explicitly includes the taxon name as part of the expected value: "A list (concatenated and separated) of nomenclatural types (type status, typified scientific name, publication) applied to the subject."

One example includes citation information, the other just type status term and taxon name.

For just type status terms and taxon names, we could probably manage with two source authorities, one for the type status term and one for the taxon name, but with publication citations included, that will not be tractable.

We might get away with conforming the first word of each pipe delimited block to a type status term vocabulary.

Examples in Darwin Core are:

holotype of Ctenomys sociabilis. Pearson O. P., and M. I. Christie. 1985. Historia Natural, 5(37):388

holotype of Pinus abies | holotype of Picea abies
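As a rough sketch of what "the first word of each pipe delimited block" means for these two examples, the snippet below splits a value on the pipe character and separates the first word from the rest of each part. The function name `split_type_status` is hypothetical, and treating a single space as the word boundary is an assumption.

```python
def split_type_status(value):
    """Split a dwc:typeStatus value into (first_word, remainder) pairs,
    one pair per non-empty pipe-delimited part."""
    parts = []
    for part in value.split("|"):
        part = part.strip()
        if not part:
            continue  # skip empty segments such as those produced by "a||b"
        first, _, rest = part.partition(" ")
        parts.append((first, rest.strip()))
    return parts


print(split_type_status("holotype of Pinus abies | holotype of Picea abies"))
```

For the second Darwin Core example this yields two pairs, each with `holotype` as the first word; the taxon name and any citation stay in the remainder and are never compared against the vocabulary.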

@tucotuco
Member

We might also make a change term request for dwc:typeStatus and see if that flies.

@ArthurChapman
Collaborator Author

Interesting - perhaps we need to do what @tucotuco suggests. Originally, I thought we were just checking against a list of types of Types, regardless of other data such as the taxon and the publication. We generally look at terms in isolation, but I hadn't realised that the Darwin Core definition included the taxon name and publication. That certainly makes it a lot more difficult, and I wonder whether it is still worth keeping (as CORE at least - possibly as SUPPLEMENTARY). I believe our original thought was just to test whether the type of type was included in a vocabulary - holotype, neotype, lectotype, etc. (i.e. as in https://rs.gbif.org/vocabulary/gbif/type_status_2021-01-18.xml). My suggestion would be to drop this test, as I can't think of another way to word it so that it is consistent with Darwin Core - i.e. taking just the first part of the Darwin Core definition ("A list (concatenated and separated) of nomenclatural types (type status") without the second part. Perhaps the suggestion by @tucotuco or a new Darwin Core term would solve this, but it is too late for that for us.

@ArthurChapman
Collaborator Author

Perhaps do what @tucotuco suggests and in the meantime drop to Incomplete/Immature.

@chicoreus
Collaborator

Alternative is to split into parts by the pipe character and evaluate the first word of each part.

Perhaps something like:

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:typeStatus is EMPTY; AMENDED the value of the first word in each | delimited portion of dwc:typeStatus if it can be unambiguously matched to a term in bdq:sourceAuthority; otherwise NOT_AMENDED
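A minimal sketch of how that wording could be read, not the implementation in FilteredPush/rec_occur_qc. Assumptions: the source authority is represented as a plain dict from lower-cased variant to a single preferred term (ambiguous variants already excluded), with `None` standing in for an unavailable authority; the `Response` class, the `VOCAB` entries, and `amend_type_status` are hypothetical names; and the result is returned as a bare string, with parts rejoined by " | ".

```python
from dataclasses import dataclass

# Hypothetical stand-in for the bdq:sourceAuthority: variant -> preferred term.
VOCAB = {
    "holotype": "Holotype",
    "holo.": "Holotype",
    "lectotype": "Lectotype",
    "neotype": "Neotype",
}


@dataclass
class Response:
    status: str
    result: str = ""
    comment: str = ""


def amend_type_status(value, vocabulary):
    """Amend the first word of each pipe-delimited part of dwc:typeStatus."""
    if vocabulary is None:
        return Response("EXTERNAL_PREREQUISITES_NOT_MET",
                        comment="bdq:sourceAuthority is not available")
    if value is None or not value.strip():
        return Response("INTERNAL_PREREQUISITES_NOT_MET",
                        comment="dwc:typeStatus is EMPTY")
    amended_parts = []
    changed = False
    for part in value.split("|"):
        first, sep, rest = part.strip().partition(" ")
        preferred = vocabulary.get(first.lower())
        if preferred is not None and preferred != first:
            first = preferred
            changed = True
        amended_parts.append(first + sep + rest)
    if not changed:
        return Response("NOT_AMENDED",
                        comment="no first word could be amended from the bdq:sourceAuthority")
    # Assumption: whitespace around pipes is normalized to " | " on output.
    return Response("AMENDED", result=" | ".join(amended_parts),
                    comment="first word(s) matched in the bdq:sourceAuthority")
```

Under these assumptions, `amend_type_status("Holo.", VOCAB)` gives AMENDED with the result `Holotype`, and `amend_type_status("x", VOCAB)` gives NOT_AMENDED with an empty result, in line with the two examples in the specification.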

@tucotuco
Member

Also, there is this open issue, which we can support: tdwg/dwc#28

chicoreus added a commit to FilteredPush/rec_occur_qc that referenced this issue Jul 28, 2024
…CENSE_STANDARDIZED with unit test and default method. Adding implementation for tdwg/bdq#286 without unit tests and adding a note to it and the other dwc:typeStatus test tdwg/bdq#285 that these may need to be reworked or removed.
@ArthurChapman
Collaborator Author

@chicoreus - your suggestion seems reasonable and workable. As discussed under tdwg/dwc#28, a lot of databases hold just the type of Type under typeStatus. I see a good case for us to support the DwC proposal, but in the meantime to use the pipe suggestion of @chicoreus.

chicoreus added a commit to FilteredPush/rec_occur_qc that referenced this issue Aug 2, 2024
@Tasilee
Collaborator

Tasilee commented Aug 3, 2024

With general agreement, I am changing the Expected Response from

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:typeStatus is EMPTY; AMENDED the value of dwc:typeStatus if it can be unambiguously matched to a term in bdq:sourceAuthority; otherwise NOT_AMENDED

to

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:typeStatus is EMPTY; AMENDED the value of the first word in each | delimited portion of dwc:typeStatus if it can be unambiguously matched to a term in bdq:sourceAuthority; otherwise NOT_AMENDED

and updating Specification Last Updated

Tasilee removed the NEEDS WORK label on Aug 3, 2024
@ArthurChapman
Collaborator Author

I wonder if it should be "value of the first word in the first | delimited portion" rather than "value of the first word in each | delimited portion"

@chicoreus
Collaborator

chicoreus commented Aug 4, 2024 via email

@ArthurChapman
Collaborator Author

@chicoreus - how do you see the parsing of this with the pipes (|)?

@chicoreus
Collaborator

chicoreus commented Aug 4, 2024 via email

@Tasilee
Collaborator

Tasilee commented Aug 16, 2024

I needed to add "" to pipe in the Expected Response for general interpretation

chicoreus added a commit to FilteredPush/rec_occur_qc that referenced this issue Nov 11, 2024
…q#286, marking with TODO comments that test needs to be updated to fit the specification.
@chicoreus
Collaborator

Darwin Core does not provide a vocabulary for type status values. Correcting the source authority to the GBIF vocabulary:

bdq:sourceAuthority default = "GBIF TypeStatus Vocabulary" {[https://api.gbif.org/v1/vocabularies/TypeStatus]} {dwc:typeStatus vocabulary API [https://api.gbif.org/v1/vocabularies/TypeStatus]}

chicoreus added a commit to FilteredPush/rec_occur_qc that referenced this issue Nov 11, 2024