DwC fields not being indexed #391
Comments
The raw fields get indexed: https://biocache-ws.ala.org.au/ws/occurrences/search?q=data_resource_uid%3Adr342&facets=raw_georeferenced_by,raw_georeference_protocol,raw_georeferenced_date,raw_georeference_sources&pageSize=0
Looking at the Cassandra table, georeferencedBy_p is not being updated from georeferencedBy. However, georeferencedDate_p is.
@charvolant user came back and said
@nickdos wrote: "User flagged that some DwC fields do not appear in a download file but the fields can be seen on an individual record page."
From the 2018 paper (https://doi.org/10.3897/zookeys.751.24791): "identifiedBy: ...The original identifiedBy_raw data item appears on the ALA webpage as “Identified by” for the record but is missing from the standard (recommended) download."
These two were subsequently fixed, but was no automated check put in place to ensure that the downloaded fields match the databased fields, or at least agree on empty vs non-empty? Was it left to users to spot instead?
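As an illustration of the "empty vs non-empty" check asked about above, here is a minimal sketch that reports which columns of a downloaded CSV contain no values at all. The function name `empty_columns` and the sample data are hypothetical; nothing here is part of biocache, and a real check would also need to compare against the databased values.

```python
import csv
import io


def empty_columns(csv_text):
    """Return the names of columns that have no non-blank value in any row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = reader.fieldnames or []
    seen_value = {name: False for name in fields}
    for row in reader:
        for name in fields:
            # Treat missing cells and whitespace-only cells as empty.
            if (row.get(name) or "").strip():
                seen_value[name] = True
    return [name for name in fields if not seen_value[name]]


# Hypothetical download extract: identifiedBy is blank on every row.
sample = "identifiedBy,samplingProtocol\n,trap\n,net\n"
print(empty_columns(sample))
```

Note that a file with a header but no rows reports every column as empty, so a check like this is only meaningful on a non-empty download.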
Additional fields to add if applicable:
These are related to iNaturalist and the community identification of a sighting. Neither of these is currently exported in any download, making it impossible to determine the community's confidence in a record's ID in any downloaded set of iNat data. Issue raised in helpdesk ticket 84773, as I couldn't advise the user to specifically use those fields in a download to gauge accuracy of records.
AtlasOfLivingAustralia/biocache-service#317 is still an issue, even though it was closed at one point due to confusion about the nature of the bug. The processed sampling protocol field is not consistently populated from the raw values, so downloads are missing values in the "samplingProtocol" column.
Not yet appearing in prod SOLR. Keeping in QA.
Facets now have values.
See support ticket https://support.ehelp.edu.au/a/tickets/81984.
User flagged that some DwC fields do not appear in a download file but the fields can be seen on an individual record page.
georeferencedBy
georeferenceProtocol
georeferenceSources
type (dc:type see ws example)
samplingProtocol (see "Regression: samplingProtocol field is not populated in Full Darwin Core downloads" biocache-service#317 - marked as closed and fixed so maybe a regression bug)
scientificNameAuthorship (https://support.ehelp.edu.au/a/tickets/82364 - not even listed in https://biocache-ws.ala.org.au/ws/index/fields)
EDIT: Outstanding tasks moved to #394
See https://biocache-ws.ala.org.au/ws/occurrences/search?q=data_resource_uid%3Adr342&facets=georeferenced_by,georeference_protocol,georeferenced_date,georeference_sources&pageSize=0
Only georeferenced_date shows values, and this is also the only column populated for CSV downloads. All the georef* fields are marked as being indexed and stored - https://biocache.ala.org.au/fields?filter=georef*
Investigate why these fields are not being added to the SOLR index.
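The facet check above can be sketched in code. The helper names `facet_query_url` and `empty_facets` are hypothetical, and the response shape (a `facetResults` list whose entries carry `fieldName` and `fieldResult`) is an assumption about the search endpoint's JSON, so verify it against a real response before relying on this.

```python
from urllib.parse import urlencode

BASE_URL = "https://biocache-ws.ala.org.au/ws/occurrences/search"


def facet_query_url(query, fields):
    """Build a facet-only search URL (pageSize=0 returns no occurrence rows)."""
    params = {"q": query, "facets": ",".join(fields), "pageSize": 0}
    return f"{BASE_URL}?{urlencode(params)}"


def empty_facets(response, fields):
    """Return the requested fields whose facet has no values in the response.

    Assumed shape: {"facetResults": [{"fieldName": str, "fieldResult": list}]}
    """
    populated = {
        facet.get("fieldName")
        for facet in response.get("facetResults", [])
        if facet.get("fieldResult")
    }
    return [name for name in fields if name not in populated]


fields = [
    "georeferenced_by",
    "georeference_protocol",
    "georeferenced_date",
    "georeference_sources",
]
print(facet_query_url("data_resource_uid:dr342", fields))

# Hand-written stand-in for a parsed JSON response, not real data.
sample = {"facetResults": [{"fieldName": "georeferenced_date", "fieldResult": [{"count": 1}]}]}
print(empty_facets(sample, fields))
```

Fetching the URL and parsing the JSON is left out on purpose, so the sketch runs without network access.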