Organism ID vs Entity ID vs Agent #3765
Comments
To minimize confusion, changed preferred name to "Bernice Pan troglodytes" to distinguish from any other Bernices out there. |
I changed Bernice's agent type to other agent. |
Here is a project for Bernice. What can it do that an agent can't?
What can an Agent do that a project can't?
What other ways should we evaluate this idea? |
I had to remove "first name" as other agents cannot have a first name. |
@dustymc has always said we should just catalog organisms. We had some discussion about organisms during the TDWG MaterialSample Task Group meeting yesterday. In order to create an organism ID GUID, the Field Museum creates a new "specimen-less" catalog record. Given this and all of our discussions, here is what I propose, because I think we can do slightly better for our community.
The reason I propose this special collection for organisms is that it will help prevent the creation of duplicate identifiers in Arctos for any given organism. It also allows us to share the burden of keeping up with them and will not impose additional fees for collections doing the work to connect things. I have settled on "Entity" because cultural collections may have use for this as well - to bring together various parts of a set and so on. Over time, I think we will find opportunities to seed catalog records from Arctos:Entity and to add data to Arctos:Entity from other catalog records, but we can work those out as they become apparent and useful. We will likely need some rules about when an Arctos:Entity record should be created, but I think they can be fairly simple. What about this is nuts? I'm sure something is! |
This is definitely an interesting idea and corresponds with previous
suggestions for a synthetic "organism view" that would combine multiple
records from the same individual via the UI. This makes more sense than
that approach. How would we integrate this with other platforms and GBIF?
…On Thu, Nov 18, 2021, 8:02 AM Teresa Mayfield-Meyer wrote:
1. We create a community managed collection. (Arctos:Entity)
2. Arctos:Entity will use the Teach collection code
3. Arctos:Entity will be managed by those with manage_code_table, and requests for new entities will be handled the same way code table requests are
4. Records in Arctos:Entity will ALWAYS be part-less
5. GUIDs from Arctos:Entity can be used as values in other identifier = Organism ID
6. We create a few new event types: birth and death so that we can record these events for organisms when they are known
|
By sending the url for the Arctos:Entity record as the Organism ID for the catalog record item. This is exactly what Field Museum does. We would NOT transmit anything in the Arctos:Entity collection to GBIF (yet). Eventually, we may want to send that information in its own kind of Darwin Core Archive, but for now, it would remain with us (although everyone could see it through the Organism ID link). |
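The idea in the comment above can be sketched in a few lines of Python. This is only an illustration: `occurrence_row` is a hypothetical helper, the record pairing is borrowed from later in this thread, and using the GUID URL as `occurrenceID` is an assumption. `occurrenceID` and `organismID` are Darwin Core term names; how Arctos actually assembles its GBIF export is not described here.

```python
# Assumption: the GUID URL prefix that appears elsewhere in this thread.
ARCTOS_GUID_PREFIX = "https://arctos.database.museum/guid/"


def occurrence_row(catalog_guid: str, entity_guid: str) -> dict:
    """Build a minimal occurrence row (hypothetical helper, not Arctos code).

    Only the two identifier fields are shown; any other exported fields
    are outside the scope of this sketch.
    """
    return {
        "occurrenceID": ARCTOS_GUID_PREFIX + catalog_guid,
        # The entity record's URL is what gets sent as the Organism ID.
        "organismID": ARCTOS_GUID_PREFIX + entity_guid,
    }


row = occurrence_row("MVZ:Bird:160626", "Arctos:Entity:33")
print(row["organismID"])
```

The entity record itself is never part of the export in this sketch; it is only reachable through the `organismID` link, which matches the "not transmitted to GBIF (yet)" stance above.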
No, the data themselves are and have been doing that. (I suppose it's too late to switch my title from "data janitor" to "Speaker for the Data"?)
I have been saying bigger is better - something bigger than Arctos would be better, but if we must do this then this seems as good as it gets.
This seems entirely unnecessary, I'm not sure what the goals of such restrictions would be, but it's also "details" that can be easily adjusted at any time using familiar tools.
I don't get that either (but it's also just more details). Why not accumulate parts? "This critter has blood samples in these 48 places that we know of, none of them have useful public data" seems incredibly useful, condition ("we have no idea") and disposition ("not here") handle the details.
and
Attributes - events are interactions with humans, Attributes are - well, Attributes. (I was wondering how we don't have this, turns out we do but we call it "numeric age" and the collecting event after "verbatim preservation date." Might be a good opportunity for some reconciliation.)
I think that's an impossible (albeit worthy) goal. The printer stuck, 12 XYZ pages got printed, now there are 273 things that say "XYZ123" on them out there. Some sort of filter seems very useful, but starting out with the idea that you can do something that you won't actually be able to do will be frustrating. Duplicates WILL be created, and this data object is capable of doing something about it: https://handbook.arctosdb.org/documentation/catalog.html#recataloging-specimens
Not if doing so requires manage_codetables. This would just be a collection, you can grant access to anyone who can demonstrate that they understand how to use it.
I'll again advocate for openness/inclusiveness. If the bar is very high then this just won't get used. Definitely rules and guidelines are needed, but I think they should generally encourage new "entities" when there's some question - "this seems like something that might have a chunk cataloged elsewhere" would be a HUGE benefit for some future researcher who might be willing to dig around in the collection and maybe make everyone's data better. Anyway, all details, I still don't see a better approach or more appropriate data object for this.
Such as?
AFAIK they don't accept Organisms but they do Occurrence-->OrganismID - we'd just give them what they can handle. |
I have submitted a prospective collection request so we can discuss there. |
I don't know where "there" is - should I? |
It will become an issue in the New Collections repo, which is where we make decisions about incoming stuff... |
AWG at 9 Dec 2021 meeting agreed to try it out. |
Please review the project - https://github.com/ArctosDB/new-collections/projects/59 |
I don't know if this is the right place to post this, but I just finished uploading MVZObs:Bird records that go with MVZ:Bird records (= same organisms = same entity) - with reciprocal relationships 'same individual as'. Here is the saved search. This will be a good test case once the new 'entity' collection is set up. |
Will you be adding additional records for the ones that died,
e.g. MVZ:Bird:193195?
For that one, you have one record with an encounter event, another record
cataloged in your Obs catalog, and a final record for the carcass?
…On Tue, Dec 21, 2021 at 6:35 PM Carla Cicero wrote:
Here is the saved search
<https://arctos.database.museum/saved/DougBell_GoldenEagles>.
|
The Arctos Entity Collection is live! Before we start writing up procedures and such, I'd like to create a couple of entities so that we can review and talk about how it should work for the community. I was going to test with Kianga and then one pair of Carla's bird/observation records. Is everyone OK with me doing that? |
Yes
|
Awesome Teresa, and go for it. Thanks! Mariel - with regard to your question, we only will be getting the blood samples, no carcasses (even if the bird died). So the data in Arctos are what we have, and I don't expect more for those records. |
something about clone/entity accessions - allow pick, pop up a warning, ??
add:
instead: on results that includes entities, add a big verbose explanation somewhere (or link to docs or something) would be cool: find entities, then find all records that use found entityIDs as OrganismID new approach
@ccicero can you send me a bulkloader file of your "entity components" so I can play with the tool at test? Or @campmlc if your elephants are in test (or you have the bulk file) that could work too. |
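The "find entities, then find all records that use found entityIDs as OrganismID" note above describes a two-step lookup. A minimal sketch follows, assuming records are plain dictionaries; the field names `guid` and `organism_id` and the `Example:Coll:1` record are invented for illustration and say nothing about the real Arctos schema or how the requery tool is built.

```python
GUID_PREFIX = "https://arctos.database.museum/guid/"
ENTITY_COLLECTION = "Arctos:Entity:"  # GUID prefix of entity records


def components_of_found_entities(results, all_records):
    """Step 1: collect the entity URLs present in the results.
    Step 2: return every record whose Organism ID is one of those URLs."""
    entity_urls = {
        GUID_PREFIX + rec["guid"]
        for rec in results
        if rec["guid"].startswith(ENTITY_COLLECTION)
    }
    return [
        rec for rec in all_records
        if rec.get("organism_id") in entity_urls
    ]


# Hypothetical in-memory records; field names are made up for illustration.
records = [
    {"guid": "Arctos:Entity:33", "organism_id": None},
    {"guid": "MVZ:Bird:160626", "organism_id": GUID_PREFIX + "Arctos:Entity:33"},
    {"guid": "Example:Coll:1", "organism_id": None},
]

found = components_of_found_entities(records[:1], records)
print([rec["guid"] for rec in found])
```

Matching on the full URL rather than the bare triplet mirrors the point made elsewhere in the thread that the Organism ID is entered as a URL.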
Carla’s Questions today
|
We could also register our entity domain on http://bioguid.org |
FYI - I just fixed the one bluebird record so it's now in Arctos:Entity accession 4. I'll add the total # records once I figure that out.
Update: instead of check box on Search page, we decided to include them in search but have something on the results page that says something like "Your results include entity records [with link to documentation for what an entity is]. Check box to remove from results set." [where with one click you can remove all entity records]
This would be AWESOME! I added the bird band number to Arctos:Entity:33 and will wait for Dusty's magic tool before creating new entity records for the remaining bluebirds. |
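The results-page behavior described in the update above (a notice when entity records are present, plus a one-click way to remove them) could look roughly like this. Both function names are hypothetical, and the sketch assumes results are just a list of GUID strings; it is not the Arctos implementation.

```python
ENTITY_COLLECTION = "Arctos:Entity:"  # GUID prefix of entity records


def entity_notice(guids):
    """Return the notice text if any result is an entity record, else None."""
    if any(g.startswith(ENTITY_COLLECTION) for g in guids):
        return "Your results include entity records."
    return None


def drop_entities(guids):
    """The 'one click' action: keep only non-entity records."""
    return [g for g in guids if not g.startswith(ENTITY_COLLECTION)]


results = ["Arctos:Entity:33", "MVZ:Bird:160626"]
print(entity_notice(results))
print(drop_entities(results))
```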
Excepting social problems like #4200, Arctos identifiers are generally born actionable and linked via "how the internet works" - what could an additional registry DO for us? @ccicero see #3765 (comment) - some sort of data in test would be very useful (maybe necessary) for this, do you happen to have the record bulkloader hanging around? |
People are terrible at using https://arctos.database.museum/guid/Arctos:Entity:13 BUT if they had Arctos:Entity:13 and plugged it into BioGUID (and we had registered our domain there), they could find that they are missing the https://arctos.database.museum/guid/ part of an actionable thing. BUT it does bring up a good point, because ALL collections registered there from Arctos would use https://arctos.database.museum/guid/ as their "dereference service prefix" and if someone only has MSB Mamm 5000, they will never get where they need to be. Just for grins I put MSB:Mamm:5000 into the BioGUID search, and I got the GBIF occurrence record. If you Google MSB:Mamm:5000 you get some GBIF and iDigBio stuff. Can we make it so that searching MSB:Mamm:5000 gets the Arctos record into the search results? |
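What a "dereference service prefix" buys you can be shown with a small sketch: given a bare triplet, prepend the prefix to get an actionable URL. `to_actionable` is a hypothetical helper, and treating a space-separated string like "MSB Mamm 5000" as a colon-separated triplet is a guess made only for this example; nothing here describes how BioGUID or Arctos actually resolve identifiers.

```python
import re

GUID_PREFIX = "https://arctos.database.museum/guid/"


def to_actionable(identifier: str) -> str:
    """Turn an institution/collection/number triplet into a full GUID URL."""
    identifier = identifier.strip()
    if identifier.startswith(GUID_PREFIX):
        return identifier  # already actionable
    # Assumption: colons and whitespace are interchangeable separators.
    parts = re.split(r"[:\s]+", identifier)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a three-part identifier: {identifier!r}")
    return GUID_PREFIX + ":".join(parts)


print(to_actionable("Arctos:Entity:13"))
print(to_actionable("MSB Mamm 5000"))
```

As the replies below point out, a bare string can be created and reused by anyone, so a helper like this can only guess at intent; the full URL remains the real identifier.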
Because we don't and that leaks, see #4200 (and loan instructions and etc etc etc.) This is a social problem and those seldom have satisfactory technical solutions. We're currently demanding cruddy data, and that's what we get.
That's probably on the "computers might actually figure it out" end of the spectrum, but it's still a string - anyone can create or use it for any reason, any number of times, and then you can never be truly sure they think it's what you think it might be. We have good identifiers, we just refuse to use them.
Given enough resources maybe we could get Google to do better, but the real answer will always revolve around Google doing whatever Google wants to do. So, maybe bioguid could be useful, but only if we refuse to embrace real IDs (and who is in that group? Hopefully nobody!). ANYWAY - there's magic in next release, should go out in an hour or so, go break it at test before making a giant mess in production please! I poked around at some elephants and a mouse and can't break it. |
Done-ish, but the current documentation isn't great so no link. |
Where is the best place for this? I'll add to my to-do list but want to make sure I get it in the right place. |
That's old magic. (And no, a bunch of these are the same data, if someone wants to import them for some reason they may consider that to be a feature. Current guidance is "don't use that button at all" but users can make whatever messes they want....) New multirecord magic is here: (And FYI
I suppose it could be anywhere, but it's annoying and I'm not convinced it's necessary at all so it's up mostly-sorta out of the way.
|
@dustymc I won't be able to work more on this until after my vacation (gone 4/7-4/16). What do you need from me? I have the bulkloader from the eagle data, but haven't done it yet for the bluebirds. Do you need me to send you the eagle bulkloader? |
@ccicero I found some elephants to play with, I think the tool is happy, have fun! |
Agree
…On Tue, Apr 5, 2022 at 8:49 AM Teresa Mayfield-Meyer wrote:
Your results include entity records
Can we move that closer to the results? I don't think people will notice
it up at the very top - I think most tend to scroll past the search
parameters right down to the results.
|
Add to entity magic form:
add docs: be clear when the tool works, when something else is necessary, suggest what that might be
disallow nature of ID pick, just use relationship |
@ccicero I snuck results/manage/entity magic/requery into prod. |
@dustymc cool, thanks! I'll start playing around. Meanwhile, here is an entity record for an owl where we have the skeleton and AMNH has the skin: https://arctos.database.museum/guid/Arctos:Entity:33 with component https://arctos.database.museum/guid/MVZ:Bird:160626. I re-purposed that entity # from a bluebird because I want all the bluebirds to be consecutive. All good (I think but please check) except for the map. There are no coordinates in the component record, but the map is showing a point off the west coast of Africa. The locality is in Thailand. ???? |
A georeference is two clicks away - click this: then save (or I can do that for your collection, or some subset, or whatever). |
This mapping occurs when the entity has no event. See Entity 1
…On Mon, Apr 25, 2022, 3:40 PM Carla Cicero wrote:
Meanwhile, here is an entity record for an owl where we have the skeleton
and AMNH has the skin:
https://arctos.database.museum/guid/Arctos:Entity:33
with component
https://arctos.database.museum/guid/MVZ:Bird:160626
|
This is true only when the entity is a component of itself, which should probably never be the case (but whatever, I can't and won't try to stop such things, maybe there are good reasons to build self-referencing Entities).
Maybe, but I've come around to the idea that a (0,0) point is more informative than a big empty map. If we're going to change that then maybe we need to entirely rethink the prominent map (which I like!). |
Agree
…On Mon, Apr 25, 2022, 4:25 PM Carla Cicero wrote:
@campmlc ok, but that shouldn't be the case.
@dustymc I can magic click the coordinates, and was going to do that but wanted to show you first. Seems like there should be no map if no coordinates?
|
Mapping: A topic for discussion at our next organism meeting on May 9th. I do like the big prominent map, but if there are no coordinates, then why have a map at all? We don't for records without coordinates. |
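The two positions in the mapping discussion differ in how a record without coordinates is treated. The sketch below shows only the "no coordinates, no map" option, and is not how Arctos renders its maps; `map_point` is a hypothetical helper. The point it illustrates is that missing coordinates and a genuine (0, 0) location have to be told apart explicitly, otherwise an empty value ends up plotted off the west coast of Africa.

```python
from typing import Optional, Tuple


def map_point(lat: Optional[float], lon: Optional[float]) -> Optional[Tuple[float, float]]:
    """Return the point to plot, or None when there is nothing to map.

    None means "skip the map entirely". Comparing against None (rather
    than testing truthiness) keeps a real 0.0, 0.0 location plottable.
    """
    if lat is None or lon is None:
        return None
    return (lat, lon)


print(map_point(None, None))
print(map_point(0.0, 0.0))
```

The alternative raised earlier in the thread, keeping a (0,0) point because it is more informative than a big empty map, would amount to returning a default tuple instead of `None`.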
I just went through the process of creating an entity for the first time, which was excruciating. I have no idea if I did anything correctly. https://arctos.database.museum/entity.cfm?action=edit&entity_id=4
The process is opaque, and even if documentation were present this is an incredibly complex operation.
After I created the entity, I added the entity IDs (as url - this is not clear) to each record as an organism ID.
I could then click the entity and see all the records, whew.
Then, I experimented with the same thing by creating an agent for the same animal, "Bernice". She was a chimp at the ABQ zoo.
I added all her IDs as akas to her agent profile.
I added her as an agent to each record (as "subject").
Much easier, much clearer, no need for reciprocally connecting anything.
And voila, all the specimen records and identifiers show up in a single, clear dashboard on her agent page.
I strongly suggest, yet again, that we incorporate the agent model for use with organisms. It works and delivers precisely what we need.