Default specific locality in specimen search results #862
Comments
We are trying to work through this same issue with serial sampling of the same individuals through time and across space (i.e. serial blood sampling of Mexican wolves at the various reintroduction program sites). Not only do you just get a single event in search results, but you cannot download or map the other events either. |
So in our case at MSB, we need to be able to search on, map and download On Fri, Apr 8, 2016 at 1:45 PM, jldunnum notifications@github.com wrote:
|
Picking a "favored" event is possible - it's (computationally) expensive but happens asynchronously, so whatever. (It's not random - events with coordinates should float to the top, all else being equal, etc. - but it probably looks that way to most users for most specimens!) The "see all locality" issue is #755. The short version is that "locality data" is a bunch (~100) of columns for every specimen-event, and a specimen can have any number of events. That doesn't fit in anything tabular (results table, download), and having data in maps/queries (things that can deal with variable cardinality) which can't be seen in the table would be extremely confusing. |
Maybe we could have a way to mark records that contain multiple events so at least people will know when they see it in the search results and can go deeper if they wish. |
Yes, that's the core intent of #755 - and if the "marker" contains the data (eg, as JSON - and I have no idea if that's practical until I play with it) then having that available should make it somewhat simpler to go deeper - just unwind into the variable-cardinality format of your choice, or flatten it out into DWC Occurrences (which we already create and could make available), or use the clicky-viewer (if we can figure out how to build one), or whatever. Or maybe nobody (or nobody without access to the writeSQL tool) would make use of the JSON and a simple "this thing has 48 localities see specimen detail" flag is enough??
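To make the two options in this comment concrete, here is a minimal Python sketch of a JSON "marker" that carries every event for a specimen, plus the simpler count-only flag. The field names, the example GUID, and both helper functions are hypothetical illustrations, not the Arctos schema or any existing Arctos code.

```python
import json

# Hypothetical JSON "marker" for one specimen. Field names and values are
# illustrative assumptions, not the real Arctos columns.
marker = json.dumps({
    "guid": "EXAMPLE:Mamm:1",
    "events": [
        {"type": "accepted place of collection",
         "spec_locality": "Site A", "began_date": "2010-05-01"},
        {"type": "place of manufacture",
         "spec_locality": "Site B", "began_date": "2009-01-15"},
    ],
})


def event_count_flag(marker_json):
    """Return the simple 'this thing has N localities' message."""
    record = json.loads(marker_json)
    return f"this record has {len(record['events'])} localities; see specimen detail"


def flatten_events(marker_json):
    """Unwind the marker into one flat row per event (variable cardinality)."""
    record = json.loads(marker_json)
    return [{"guid": record["guid"], **event} for event in record["events"]]


for row in flatten_events(marker):
    print(row["guid"], row["type"], row["spec_locality"])
print(event_count_flag(marker))
```

The results table would still show one row per specimen; the marker column only signals, and optionally carries, the additional events.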
It turns out the "simple" way is REALLY expensive - a small batch update (500 records) went from ~2 seconds to ~7 minutes, which will be disruptive even as an asynchronous process. I'll keep looking.... |
I may have a workable solution to selectively picking the one specimen event that appears in specimenresults + downloads. Priority currently is:
in all cases excluding "unaccepted place of collection." Other requests? |
By date - earliest and most recent.
|
I was referring to machine behavior - given http://arctos.database.museum/guid/MSB:Mamm:193683, which one of the 5 events is "prioritized" to fit into http://arctos.database.museum/SpecimenResults.cfm?guid=MSB:Mamm:193683? (Current answer: The one with the coordinates, http://arctos.database.museum/guid/MSB:Mamm:193683?seid=593167.) I don't understand the above comments. |
Those priorities work for me and seem logical. Thank you for working on this. It will make a huge difference for our users.
Angela J. Linn Explore our collections: http://www.uaf.edu/museum/collections/ethno/search-collections/ |
Could use date as the next level of hierarchy within those categories. Earliest event gets priority. |
https://github.com/ArctosDB/DDL/blob/master/functions/getPrioritySpecimenEvent.sql is now experimentally running at prod - it's a bit slower than the previous revision, but the ~15K specimens with a place of manufacture updated in ~10 minutes or so, which seems workable. Adding more logic to the ordering, as long as it doesn't use data outside of specimen_event, collecting_event, and locality, should (!) have a minimal impact on performance, and adjusting the function is simple as long as the input and output parameters don't change. The function is now finding the earliest event (based on began_date) within the winning category. http://arctos.database.museum/guid/MSB:Mamm:224771 has a bunch of equivalent events (accepted place of collection, no coordinates) and so....
... the earliest is returned, which hopefully won't offend anyone. Including State would require one more join (to geography), and if there's no Arctos-wide agreement on which state is most important (seems unlikely) then an additional 3 jumps the other way to get at Collection. There are 1488 unique States in Arctos at the moment, which might be enough to have a noticeable impact on the post-query processing as well (especially if collection is a multiplier). So possible, yes, but likely fairly expensive. ("Expense" can be measured in how long it takes an update to appear in the interfaces and is difficult to quantify, but my wild guess is that adding state would be noticeable/disruptive.) |
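The selection logic described in this thread (exclude "unaccepted place of collection", prefer events with coordinates, then take the earliest began_date within the winning category) can be sketched in Python as below. This is not the getPrioritySpecimenEvent function itself: the `Event` class, `pick_priority_event`, and especially the `TYPE_RANK` ordering of event types are assumptions made for illustration, since the thread's actual priority list is not reproduced here.

```python
from dataclasses import dataclass

# ASSUMPTION: this ranking of event types (lower wins) is hypothetical.
# Unlisted types sort after all listed ones.
TYPE_RANK = {
    "place of manufacture": 0,
    "accepted place of collection": 1,
}
# Stated in the thread: this type is excluded in all cases.
EXCLUDED_TYPES = {"unaccepted place of collection"}


@dataclass
class Event:
    event_id: int
    event_type: str
    began_date: str        # full ISO 8601 date, so string order is date order
    has_coordinates: bool


def pick_priority_event(events):
    """Return the single event to show in results, or None if none qualify."""
    candidates = [e for e in events if e.event_type not in EXCLUDED_TYPES]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda e: (
            TYPE_RANK.get(e.event_type, len(TYPE_RANK)),  # assumed type order
            not e.has_coordinates,                        # coordinates first
            e.began_date,                                 # earliest wins ties
        ),
    )


events = [
    Event(1, "unaccepted place of collection", "1990-01-01", True),
    Event(2, "accepted place of collection", "2005-06-01", False),
    Event(3, "accepted place of collection", "2001-03-10", False),
]
chosen = pick_priority_event(events)
print(chosen.event_id if chosen else None)
```

Because the whole ordering lives in one sort key, adding another tiebreaker means adding one tuple element, which mirrors the point that the function is easy to adjust as long as its inputs and outputs stay the same.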
I am re-opening this because the solution isn't working for me. See the issue referenced above. We need to be able to tell people that more than one event exists in the search results/download. |
The UAM:EH specimen records typically have one to three specimen events (e.g., place of manufacture, place of use, place of collection), sometimes with three different localities. It seems that the specific locality displayed in the search results is randomly selected from those events. I request that the default specific locality displayed be the locality associated with the "place of manufacture". Likewise, the georeferenced place of manufacture should be what shows up on the map following a search. Finally, this same information should be the locality information displayed at the top of the specimen record.