TG2-VALIDATION_GENUS_NOTEMPTY #214
All of #206, #207, #208, #213, #214, #215, #217, #218, #219, and #220 need to consider whether an identification is at a rank above that of the term under test, in which case the term is correctly empty. Suggest for all of these:
(1) Add dwc:taxonRank as an information element consulted.
(2) Rewrite the test specifications in the form: COMPLIANT if the value in dwc:taxonRank is of a rank higher than genus or if dwc:genus is not EMPTY; otherwise NOT_COMPLIANT
Without such a change, these tests have limited power to identify data that has quality, and the problem gets worse the lower the rank of the term under test.
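The suggested form can be read as a two-outcome check. A minimal Python sketch follows, assuming a simplified and incomplete rank ordering (`RANKS`), an `is_empty` helper that treats missing or whitespace-only values as EMPTY, and a record held as a plain dict. None of these names come from the test specification; they are illustrative only.

```python
# Illustrative rank ordering, highest rank first. This list is an assumption
# made for the sketch, not a vocabulary defined by the test specification.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def is_empty(value):
    """Treat None and whitespace-only strings as EMPTY (assumed definition)."""
    return value is None or value.strip() == ""


def rank_is_higher_than(rank, reference):
    """True if `rank` is a known rank that sits above `reference`."""
    if is_empty(rank):
        return False
    key = rank.strip().lower()
    return key in RANKS and RANKS.index(key) < RANKS.index(reference)


def genus_notempty_proposed(record):
    """COMPLIANT if taxonRank is above genus or genus has a value."""
    if rank_is_higher_than(record.get("dwc:taxonRank"), "genus"):
        return "COMPLIANT"
    if not is_empty(record.get("dwc:genus")):
        return "COMPLIANT"
    return "NOT_COMPLIANT"
```

Under this reading, a record with dwc:taxonRank of "family" and an empty dwc:genus is COMPLIANT, while a record with dwc:taxonRank of "species" and an empty dwc:genus is NOT_COMPLIANT.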
Removed #216 from the list; dwc:kingdom doesn't need examination of another term.
Noting that #265 is a similar, but more complex, problem.
Changing the ER to "COMPLIANT if the value in dwc:taxonRank is of a rank higher than genus or if dwc:genus is not EMPTY; otherwise NOT_COMPLIANT" is similar to #256, in that if dwc:taxonRank is 'higher than genus', then dwc:genus is effectively ignored as the Information Element Acted Upon, and you would be implying dwc:genus is not EMPTY regardless. If this test is considered useful in some context (use case), then I would suggest maybe (not sure this is right taxonomically): POTENTIAL_ISSUE if dwc:genus is EMPTY and dwc:taxonRank is lower than "family"; otherwise NOT_ISSUE?
@Tasilee - I would keep it Supplementary with your top wording. An ISSUE test would be a separate test and I don't think worth considering at this time.
@chicoreus is the wording given by @Tasilee for an ISSUE worth making a test for this?
@ArthurChapman given that there are darwin core terms for ranks lower than family and higher than genus, and taxonomic ranks that fall between the two, I think that, as phrased, @Tasilee's issue would be difficult to phrase and implement. Current phrasing looks good.
Made DO NOT IMPLEMENT, as this test doesn't imply an aspect of Data Quality: it is redundant when compared with dwc:scientificName. It would probably be better to test dwc:genericName rather than dwc:genus, i.e. VALIDATION_GENERICNAME_NOTEMPTY.
DO NOT IMPLEMENT, as this test is an artifact of our thinking prior to the formulation of dwc:genericName. Its development came from dwc:genus being widely used (incorrectly) as a parse of the generic portion of the scientific name, and thus it was being considered part of the suite of tests related to scientific name parsing. With the clearer separation between dwc:genus as part of the classification and dwc:genericName as a parse of the scientific name, this thinking is no longer relevant. So, the current classification at the generic level of an occurrence has very little utility for assessing data quality of the occurrence data. It may have some value in very narrow cases for evaluation of taxonomic data sets, but even then entries may be of higher taxa and be expected not to have a classification at the level of genus. So, without clear explication of a use case and the potential pitfalls of implementation, we are recommending this as DO NOT IMPLEMENT.
Our treating this as DO NOT IMPLEMENT was based on the incorrect belief that it stood in isolation. It is one of a family of supplemental tests that examine emptiness of higher classification terms. With the current inclusion of dwc:taxonRank as an information element, it is actually a good representative for that entire family. It should probably be considered supplemental, and the set of supplemental tests listed above brought into conformance with it.
OK, but is dwc:genus a special case (as @ArthurChapman suggested in the Zoom 16th April 2024)?
On Mon, 15 Apr 2024 17:34:39 -0700 Lee Belbin wrote:
> OK, but is dwc:genus a special case (as @ArthurChapman suggested in
> the Zoom 16th April 2024)?
It was, but it isn't anymore with the expression of dwc:genericName. This test is parallel to the other higher classification not empty supplementary tests. They can all (except the test for Kingdom) simply be evaluated by considering the rank and the presence or absence of a value. If dwc:taxonRank expresses a rank higher than the term under test, then the term is expected to be empty. This is how we had reformulated this test, but instead of extending the reformulation to the parallel set of tests, we thought it was a test in isolation, a product of its special case history.
We would be better off making this supplementary and reformulating all the other parallel higher classification term not empty tests to parallel the phrasing of the expected response in this one.
@Tasilee and I have been looking at this test and associated tests #206, #207, #208, which are all "FOUND" tests, and #213, #215, #216, #217, #218, #219 and #220, which are all "NOT EMPTY" tests. We think that this test should be kept simple like #213, etc., and be a simple NOTEMPTY/EMPTY test, and that we shouldn't make these tests more complicated by adding in dwc:taxonRank as a consulted Element. That would need altering the Expected Response to include EXTERNAL_PREREQUISITES_NOTMET and INTERNAL_PREREQUISITES_NOTMET etc., and adding Source Authorities, etc. It would also mean having to alter all #213-#220 tests. They should remain SUPPLEMENTARY. When thinking about how people may use these types of tests if they wish to implement them, most will just want to know if the field is EMPTY or not. If we make the tests much more complicated, then people probably won't implement them. If we wanted to go more complicated, I think we would need to keep the "NOTEMPTY" tests but add a new set of "FOUND" tests, something I think would be an unnecessary use of our time at this stage. With respect to this test, I think it should be altered to be the same as #213, etc., keeping it simple and SUPPLEMENTARY.
On Sun, 12 May 2024 16:18:35 -0700 Arthur Chapman wrote:
> @Tasilee and I have been looking at this test and associated tests
> #206, #207, #208 which are all "FOUND" tests and #213, #215, #216,
> #217, #218, #219 and #220 which are all "NOT EMPTY" tests. We think
> that this test should be kept simple like #213, etc. and be simple
> NOTEMPTY/EMPTY tests and that we shouldn't make them more complicated
> by adding in dwc:taxonRank as a consulted Element
I disagree. If treated as simple Empty/Not Empty tests, these tests will assert that any data where the identification is above the rank being examined will lack quality.
These tests need to intelligently check whether a value should be present before asserting that an empty state lacks quality. This means examining taxonRank.
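To illustrate the objection: under a plain Empty/Not Empty reading, a record correctly identified only to family is reported as lacking quality. A small Python sketch, with a hypothetical record and helper names that are not part of any specification:

```python
def is_empty(value):
    """Treat None and whitespace-only strings as EMPTY (assumed definition)."""
    return value is None or value.strip() == ""


def simple_genus_notempty(record):
    """Plain not-empty check that never consults dwc:taxonRank."""
    return "NOT_COMPLIANT" if is_empty(record.get("dwc:genus")) else "COMPLIANT"


# Hypothetical occurrence identified only to family, so dwc:genus is
# legitimately empty.
family_level = {
    "dwc:scientificName": "Fagaceae",
    "dwc:taxonRank": "family",
    "dwc:family": "Fagaceae",
    "dwc:genus": "",
}

print(simple_genus_notempty(family_level))  # NOT_COMPLIANT
```

The simple check flags the record even though nothing is missing from it, which is the case a rank-aware test is meant to avoid.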
Okay @chicoreus, following your logic, does the following satisfy your requirements? (Of course we would need to add something in sourceAuthority.)
EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:taxonRank is EMPTY or is at a higher rank than Genus; COMPLIANT if dwc:genus is not EMPTY; otherwise NOT_COMPLIANT
The current wording, "COMPLIANT if the value in dwc:taxonRank is higher than genus or if dwc:genus is not EMPTY; otherwise NOT_COMPLIANT", is not logical: it says that if taxonRank is Family and Genus is EMPTY, the test is COMPLIANT although Genus is EMPTY, which makes no logical sense for a NOTEMPTY test. The way I have suggested above makes logical sense.
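A rough Python sketch of this alternative ordering may help when comparing it with the other wordings. It assumes an illustrative rank list and an `is_empty` helper, and it leaves out the EXTERNAL_PREREQUISITES_NOT_MET clause because no source authority is specified in the thread. All names here are hypothetical.

```python
# Illustrative rank ordering, highest rank first (an assumption for the sketch).
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def is_empty(value):
    """Treat None and whitespace-only strings as EMPTY (assumed definition)."""
    return value is None or value.strip() == ""


def genus_notempty_prereq(record):
    """Prerequisite-first reading: rank is checked before dwc:genus."""
    rank = record.get("dwc:taxonRank")
    if is_empty(rank):
        return "INTERNAL_PREREQUISITES_NOT_MET"
    key = rank.strip().lower()
    if key in RANKS and RANKS.index(key) < RANKS.index("genus"):
        return "INTERNAL_PREREQUISITES_NOT_MET"
    # Ranks that cannot be interpreted are not addressed by this wording,
    # so they fall through to the plain not-empty check.
    if not is_empty(record.get("dwc:genus")):
        return "COMPLIANT"
    return "NOT_COMPLIANT"
```

Under this reading, a family-level record yields INTERNAL_PREREQUISITES_NOT_MET rather than COMPLIANT, and a record with a genus but no dwc:taxonRank also yields INTERNAL_PREREQUISITES_NOT_MET.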
@chicoreus ??
@ArthurChapman I think the family of NOT_EMPTY tests for higher taxon rank terms (#213, #215, #216, #217, #218, #219 and #220, and this one) should follow the same pattern:
COMPLIANT if the value in dwc:taxonRank is higher than genus or if dwc:genus is not EMPTY; INTERNAL_PREREQUISITES_NOT_MET if dwc:genus is EMPTY, dwc:taxonRank is NOT_EMPTY, and dwc:taxonRank contains a value that is not interpretable as a taxon rank; otherwise NOT_COMPLIANT.
This asserts that the data have quality if dwc:genus contains a value, or if dwc:genus correctly does not contain a value; it handles the case where dwc:genus does not contain a value and it isn't possible to tell whether it should; and it marks data where dwc:genus incorrectly lacks a value as not having quality.
I don't think a reference to a source authority is needed, as taxonRank can be assessed without reference to a source authority for the purposes of this test. If this isn't the case, then:
COMPLIANT if the value in dwc:taxonRank is higher than genus or if dwc:genus is not EMPTY; INTERNAL_PREREQUISITES_NOT_MET if dwc:genus is EMPTY, dwc:taxonRank is NOT_EMPTY, and dwc:taxonRank contains a value that is not interpretable as a taxon rank; EXTERNAL_PREREQUISITES_NOT_MET if dwc:genus does not contain a value, dwc:taxonRank contains a value and the sourceAuthority is needed and not available to interpret whether dwc:taxonRank has a rank higher than genus; otherwise NOT_COMPLIANT.
Key point is that data can have quality, and be COMPLIANT, even if dwc:genus does not contain a value, in those cases when dwc:genus should not contain a value. This isn't a simple family of tests for emptiness.
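Because the same pattern is meant to apply across the whole family of tests, it can be sketched once and parameterised by the term under test. The Python below is a hedged illustration of the first (no source authority) wording. It assumes an illustrative rank list, that the local name of the term under test matches a rank name in that list, and a hypothetical function name `term_notempty`.

```python
# Illustrative rank ordering, highest rank first (an assumption for the sketch).
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def is_empty(value):
    """Treat None and whitespace-only strings as EMPTY (assumed definition)."""
    return value is None or value.strip() == ""


def term_notempty(record, term):
    """Rank-aware not-empty check for a classification term such as "genus"."""
    if not is_empty(record.get("dwc:" + term)):
        return "COMPLIANT"  # the term has a value
    rank = record.get("dwc:taxonRank")
    if is_empty(rank):
        return "NOT_COMPLIANT"  # nothing says the term should be empty
    key = rank.strip().lower()
    if key not in RANKS:
        return "INTERNAL_PREREQUISITES_NOT_MET"  # rank cannot be interpreted
    if RANKS.index(key) < RANKS.index(term):
        return "COMPLIANT"  # identification is above the term, so it is correctly empty
    return "NOT_COMPLIANT"
```

For example, `term_notempty({"dwc:taxonRank": "order"}, "family")` treats the empty dwc:family as correct, while the same call with a dwc:taxonRank of "genus" does not.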
Thanks @chicoreus. I agree about Source Authority, but I'm inclined to align with the structure we have been using and simplify it: INTERNAL_PREREQUISITES_NOT_MET if dwc:genus is EMPTY or dwc:taxonRank contains a value that is not interpretable as a taxon rank; COMPLIANT if the value in dwc:taxonRank is higher than genus or if dwc:genus is not EMPTY; otherwise NOT_COMPLIANT. Is that ok?
Changed Expected Response from "COMPLIANT if the value in dwc:taxonRank is higher than genus or if dwc:genus is not EMPTY; otherwise NOT_COMPLIANT" to "INTERNAL_PREREQUISITES_NOT_MET if dwc:genus is EMPTY or dwc:taxonRank contains a value that is not interpretable as a taxon rank; COMPLIANT if the value in dwc:taxonRank is higher than genus or if dwc:genus is not EMPTY; otherwise NOT_COMPLIANT."
On Thu, 30 May 2024 15:25:49 -0700 Lee Belbin wrote:
> Thanks @chicoreus. I agree about Source Authority but I'm inclined to
> align with the structure we have been using and simplifying it -
> INTERNAL_PREREQUISITES_NOT_MET if dwc:genus is EMPTY or dwc:taxonRank
> contains a value that is not interpretable as a taxon rank; COMPLIANT
> if the value in dwc:taxonRank is higher than genus or if dwc:genus is
> not EMPTY; otherwise NOT_COMPLIANT.
> Is that ok?
That doesn't quite work. The first clause will incorrectly return INTERNAL_PREREQUISITES_NOT_MET when the taxonRank contains an uninterpretable value, even if the genus contains a value.
In that order of operations, it needs an "and" in the first clause, not an "or":
INTERNAL_PREREQUISITES_NOT_MET if dwc:genus is EMPTY and dwc:taxonRank contains a value that is not interpretable as a taxon rank; COMPLIANT if the value in dwc:taxonRank is higher than genus or if dwc:genus is not EMPTY; otherwise NOT_COMPLIANT.
The second clause will return COMPLIANT if dwc:taxonRank contains a value higher than genus, regardless of the state of the genus. This is probably formally correct, but likely confusing.
A clearer statement is probably:
INTERNAL_PREREQUISITES_NOT_MET if dwc:genus is EMPTY and dwc:taxonRank contains a value that is not interpretable as a taxon rank; COMPLIANT if dwc:genus is not EMPTY, or dwc:genus is EMPTY and the value in dwc:taxonRank is higher than genus; otherwise NOT_COMPLIANT.
This phrasing more clearly expresses the intent.
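The difference between the "or" and "and" forms of the first clause can be checked directly. This Python sketch isolates just that clause. The rank set and helper names are assumptions made for illustration.

```python
# Illustrative set of interpretable ranks (an assumption for the sketch).
KNOWN_RANKS = {"kingdom", "phylum", "class", "order", "family", "genus", "species"}


def is_empty(value):
    """Treat None and whitespace-only strings as EMPTY (assumed definition)."""
    return value is None or value.strip() == ""


def uninterpretable(rank):
    """True if taxonRank holds a value that is not a recognised rank."""
    return not is_empty(rank) and rank.strip().lower() not in KNOWN_RANKS


def prereq_not_met_or(genus, rank):
    """First clause as worded with "or"."""
    return is_empty(genus) or uninterpretable(rank)


def prereq_not_met_and(genus, rank):
    """First clause as worded with "and"."""
    return is_empty(genus) and uninterpretable(rank)


# Genus present, rank uninterpretable: "or" blocks a record that has a genus.
print(prereq_not_met_or("Quercus", "not a rank"))   # True
print(prereq_not_met_and("Quercus", "not a rank"))  # False

# Genus empty, rank is family: "or" blocks before the COMPLIANT clause is reached.
print(prereq_not_met_or("", "family"))   # True
print(prereq_not_met_and("", "family"))  # False
```

With "or", both records stop at INTERNAL_PREREQUISITES_NOT_MET and never reach the COMPLIANT clause. With "and", only a record that lacks a genus and has an uninterpretable rank stops there.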
That seems to work @chicoreus
OK, thanks @chicoreus. Changed Expected Response from "INTERNAL_PREREQUISITES_NOT_MET if dwc:genus is EMPTY or dwc:taxonRank contains a value that is not interpretable as a taxon rank; COMPLIANT if the value in dwc:taxonRank is higher than genus or if dwc:genus is not EMPTY; otherwise NOT_COMPLIANT." to "INTERNAL_PREREQUISITES_NOT_MET if dwc:genus is EMPTY and dwc:taxonRank contains a value that is not interpretable as a taxon rank; COMPLIANT if dwc:genus is not EMPTY, or dwc:genus is EMPTY and the value in dwc:taxonRank is higher than genus; otherwise NOT_COMPLIANT."
…tdwg/bdq#215 tdwg/bdq#217 and tdwg/bdq#218 not empty tests for higher ranks below kingdom, including utility method to evaluate ordering of pairs of rank values and unit tests.