Organism ID #1966
Comments
I have passed this by John Wieczorek and here is our discussion:
My response:
John responds:
From me:
|
To be clear - I don't propose there be two IDs, but to MOVE those other IDs that are truly Organism IDs to the new table. |
In general, I think having some sort of "individual ID" would be very useful. It's not at all clear to me why it would be in a separate table; that invites more denormalization (doing the same thing multiple ways), inevitably leading to even bigger messes. If the scope of this is Arctos, we could exploit relationships to assemble "individuals" and/or individualID without adding any overhead - there's much more discussion on that in #1545 - and see below. I believe that this is implicitly a proposal to recatalog http://arctos.database.museum/guid/MSB:Mamm:292063 as 5 specimens. At least for some use cases that goes against the "catalog the item of scientific interest" mantra; eventually two of the samples from the same wolf will be compared in a publication. I'm not sure that's more evil than the current situation, where 5 samples collected at different times under different conditions are likely seen as equivalent to 5 tubes from the same liver of another specimen, but it should be acknowledged. I think any consistent documented approach is an improvement. "Occurrences" are occasionally recorded in different collections, both in and out of Arctos, so cataloging Occurrences rather than individuals would make Arctos data more comparable with the rest of the world. I'm not sure how much weight that should carry, but again it is a consideration that should be addressed. All of that said, I don't think Arctos can or should dictate how material is cataloged. I think the most we can do is to provide documentation/guidance. This should extend beyond Arctos. A sample of http://arctos.database.museum/guid/MSB:Mamm:292063 stored in another system and shared with GBIF would ideally bear the same "individual ID" as the record(s) in Arctos. If it did, it would be trivial to assemble the individual in GBIF or similar systems. 
The "danger" is in assigning the identifiers, and I don't believe there is any technical solution to that - it's a social problem that needs a social solution. It took seconds to find https://arctos.database.museum/guid/MSB:Mamm:317312 and https://arctos.database.museum/guid/MSB:Mamm:324187 which share a NEON ID and probably are not the same organism. I have never encountered a "number series" that didn't have similar issues, and if that exists the NEON ID cannot do what you want. I think this would be best implemented as GUIDs, and for social reasons those should probably not be minted by Arctos. Drawing those from an independent source would let Curators determine what is or is not an Individual on a case-by-case basis independent of any problems with identifiers assigned by other organizations, and at least maintains some possibility that other collections holding material from the same individuals would buy in and assign those IDs to their specimens. Two candidates are UUIDs, which would not be resolvable or actionable, or ARKs which could be resolvable and could point to some shared view (eg, GBIF, which in turn could point to the various bits and pieces of the individual in various systems/collections). I think that also could be implemented only as guidance; I don't think Arctos can or should prevent someone from using "1" as an IndividualID, but we can help them understand the implications of doing so. |
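For anyone wondering what "drawing GUIDs from an independent source" might look like at its simplest, here is a rough sketch using UUIDs. The function name is made up for illustration and nothing here is Arctos code. As noted above, a UUID minted this way is globally unique but not resolvable or actionable; an ARK would need a minting service, which this does not cover.

```python
import uuid


def mint_individual_id() -> str:
    """Return a random (version 4) UUID formatted as a URN string.

    Hypothetical helper: it only illustrates minting an identifier that
    is independent of any collection or database.
    """
    return uuid.uuid4().urn


new_id = mint_individual_id()
print(new_id)  # urn:uuid:<32 hex digits in 5 groups>, different on every run
```

Any collection holding material from the same individual could store the same string, whichever system it uses.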
How would this not be denormalization? organismID = Mexican Wolf Studbook Number 1216 These are all the same organism, but now we have three IDs for it. If we have: ORGANISM_ID where:
At least we eliminate the problem of the many ways "Mexican Wolf Studbook Number" might be spelled.
I agree with this statement - but no one is stepping up to the plate for biological specimens (at least no one I am aware of). While the solution above does not fix the problems of the world, it would be a start for Arctos collections and maybe we could use that to press the issue with the community. I looked up ARKs and I'm not clear on how that works - if it is a solution, then let's explore, but I need an example because it seems very fuzzy to me and doesn't solve the social problem as far as I can tell. |
Yep - and the cataloging of separate events with one catalog number results in events and parts that are not properly associated with their accessions, their collectors and preparators, nor their attributes. (The event links are OK, but easily broken or incorrectly made). |
Should OrganismIDs be a DOI? |
I'm still not following. You want another table that's the same structure and does the same thing as OtherIDs?? And yes those data are denormalized - that's a lot easier to deal with than denormalized structure, and one of many reasons a GUID of some sort would be a useful value. There is no technical solution to social problems. We can make it enticing to assign unifying IDs, but that's about it. ARKs are functionally much like DOIs, but they're free (and don't come with the buy-in, which I suspect means they also don't come with the persistence). https://n2t.net/ark:/87299/x6d50k1v If I had a couple million dollars and nothing better to do, everything in Arctos would have a DOI. DOIs would be great "individualIDs" but I don't think I can supply them. And that would lead back into the whole "controlled by Arctos" thing, which I don't think has any chance of being adopted by anyone outside of Arctos. I can provide tools, but the folks who own these specimens should also own the unifying identifiers. |
EXCEPT - those IDs would be passed to GBIF and other aggregators as "Organism_ID". I have also considered just using a check box in the Other_ID table "this is an organism ID".... |
Thanks - I might actually get it now! It's Arctos-centric and not very pretty, but at least it's not denormalization: http://arctos.database.museum/SpecimenResults.cfm?oidtype=Mexican%20wolf%20studbook%20number&oidnum=none is a perfectly valid value for other_id_type=OrgID (whatever we call it). That could be generated by a "this is an orgid" button. I could even abstract it to a saved search or ARK, but that gets us back to the "Arctos-centric" thing. And again, if the scope of this is just "works for Arctos" then I think we'd be better off doing something with relationships. (@tucotuco pointed out that an ID works from a spreadsheet where a relationship may not, so "something" might be generating a URL that finds ID=value as above - IDK, that's details, I'm totally open to ideas). |
" That could be generated by a "this is an orgid" button" - you mean in the
code table, correct?
Also, we would not want to see the "messy" http://arctos.database.museum/SpecimenResults.cfm?oidtype=Mexican%20wolf%20studbook%20number&oidnum=none in the display. We'd want to see "Organism ID: Mexican Wolf Studbook Number: 1216". Possible?
|
No, in the interface. http://arctos.database.museum/SpecimenResults.cfm?oidtype=Mexican%20wolf%20studbook%20number&oidnum=none is a GUID - and an actionable one at that. There's only one of them on the planet and it's easy to tell what it does. (It's not very pretty and may or may not be very persistent, but that's details.) Mexican Wolf Studbook Number: 1216 is a string. Anyone can use it for any purpose anywhere; it doesn't natively do anything, and trying to do anything with it comes with a big pile of indefensible assumptions. Edit for completeness: https://n2t.net/ark:/87299/x68g8hqw currently does the same thing as http://arctos.database.museum/SpecimenResults.cfm?oidtype=Mexican%20wolf%20studbook%20number&oidnum=none. It's prettier and likely more persistent. If I find another Occurrence of "none" I could re-point the ARK to somewhere mutually agreeable (eg, GBIF) in order to build a more complete picture of the Organism. It's a MUCH better solution than the URL, but also likely to take more investment than clicking a button. 2nd edit: I'm throwing ARKs around only because they're not-Arctos and super easy to create. They're not the only possible GUID, just a convenient and functional example. |
...not to mention that the indefensible assumptions would be distinct for
every different id type, ergo not scalable.
|
I don't get how what you propose is different from: IDType = text “Mexican Wolf Studbook Number” Description = definition of the IDType Studbook number assigned by the Mexican Wolf Recovery Program base URL = http://arctos.database.museum/SpecimenResults.cfm?oidtype=Mexican%20wolf%20studbook%20number&oidnum= |
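To make the "ID type plus base URL" idea concrete, here is a minimal sketch of how such a link could be assembled. The `oidtype` and `oidnum` parameter names are copied from the URLs quoted in this thread; the function name is made up, and whether the search page accepts a value like 1216 in `oidnum` is an assumption, not something verified here.

```python
from urllib.parse import quote, urlencode

# Search page taken from the URLs quoted in this thread.
SEARCH_BASE = "http://arctos.database.museum/SpecimenResults.cfm"


def other_id_search_url(id_type: str, id_value: str) -> str:
    """Build a link that searches for records sharing one ID type and value.

    Hypothetical helper; quote_via=quote encodes spaces as %20, matching
    the style of the URLs in the thread.
    """
    query = urlencode({"oidtype": id_type, "oidnum": id_value}, quote_via=quote)
    return f"{SEARCH_BASE}?{query}"


url = other_id_search_url("Mexican wolf studbook number", "1216")
print(url)
# http://arctos.database.museum/SpecimenResults.cfm?oidtype=Mexican%20wolf%20studbook%20number&oidnum=1216
```

The display text ("Mexican Wolf Studbook Number: 1216") and the link target stay separate, so the string can be shown to users while the URL does the work.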
I had been thinking there would be only one allowed organismID. Maybe that is silly. Maybe it is fine to have as many as you like. That way you could include your own AND those of other collections (in or out of Arctos). That way you could also potentially go directly to GBIF to get the set of Occurrences for all matching organismIDs. |
HMMMM..I hadn't considered that.
BUT when searching AT GBIF, how would they be related - so that some person who was unaware the two organism IDs were the same organism could make the connection? |
We were discussing earlier how we could link specimens at MSB and AMNH and
Colección Boliviana de Fauna that are all part of the same animal. All
share the same field number, they are all the same organism, but how would
we relate them in GBIF if AMNH assigns one and MSB assigns a different one?
Ideally, we'd use the shared field number as the core ID, or we'd pay for a
DOI.
|
It is very different outside the world of Arctos. The organismID would have to be constructed from this, and what would you do to create the organismIDs of the ten collections that have parts of the same plant? Create ten new ID types and base URLs (just to cover that one organism - multiply by all the collections that share any parts of any Organisms in Arctos)? |
It eliminates data stored in arbitrary places.
Yea, I suspect reality will find a way to stomp all over that, but it would be nice....
Arctos can link to anything with a URL, and provides a mechanism for incoming links.
Everybody starts at "1." If you want links, you need actionable GUIDs. If you want discoverable, you need shared actionable GUIDs. You might get at "shared" by tracking down the other 40 samples in GBIF and adding their IDs to Arctos, although "here's a nice neutral persistent actionable identifier, would you mind using it so we can talk to each other?" would greatly simplify things. |
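If records are allowed to carry more than one organism ID (their own plus those assigned by other collections), the "same organism" sets can in principle be assembled by following shared IDs transitively. A rough sketch with union-find, using entirely made-up record labels and ID strings; no aggregator is claimed to work this way.

```python
from collections import defaultdict


def group_by_shared_ids(records: dict) -> list:
    """Group record keys linked, directly or transitively, by any shared ID.

    `records` maps a record label to the set of organism IDs it carries.
    """
    parent = {key: key for key in records}

    def find(key):
        # Walk to the root of this record's group, shortening the path as we go.
        while parent[key] != key:
            parent[key] = parent[parent[key]]
            key = parent[key]
        return key

    first_holder = {}  # organism ID -> first record seen carrying it
    for key, ids in records.items():
        for organism_id in ids:
            if organism_id in first_holder:
                parent[find(key)] = find(first_holder[organism_id])
            else:
                first_holder[organism_id] = key

    groups = defaultdict(set)
    for key in records:
        groups[find(key)].add(key)
    return sorted(groups.values(), key=lambda group: sorted(group))


# Hypothetical data: three collections hold parts of one animal, but no
# single ID appears on all three records.
records = {
    "A-1": {"id-x"},
    "B-7": {"id-x", "id-y"},
    "C-3": {"id-y"},
    "A-2": {"id-z"},
}
groups = group_by_shared_ids(records)
print(groups)
```

A-1 and C-3 share no ID directly, yet they land in the same group because B-7 carries both. This only works if the IDs are actually unique; two unrelated animals that both wear "1" would be merged, which is the social problem described above.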
I think that is what I am getting at in tdwg/dwc-qa#131 (comment) |
Something akin to IGSNs, but for Organisms instead of for samples. |
I don't understand - you would only need one ID type. From any record in Arctos, I can click the link from the Mexican Wolf Studbook Number (no matter what number it is) and I'll get the specimen results page that show all of the wolves that share the same number. If UTEP or UMNH or any other Arctos collection had a wolf specimen and put the studbook number in the "Mexican Wolf Studbook Number" other ID, then it would show up in the search too, because the link is an actionable guid like Dusty described. It would be a social issue to decide upon an "ID Type" for the situation that you describe, but we should only need one. The challenge - as I pointed out in the very beginning is assigning the individual organism ID numbers, so that all collections with parts of the same plant would use "Individual Plant ID" = 1, etc. I guess I am missing something (which doesn't surprise me...) The wolves are easy because they are all here and they have a (somewhat) logical identifier. Everything else will be messy until we have a unique BOI (Biological Organism Identifier). |
In all of these situations, there is a shared organism number already that
links specimens. Examples currently in use within Arctos and between Arctos
and outside collections (AMNH, USNM) are Mexican Wolf Studbook Number, NK
number, AF number, Robert L. Rausch collector number, NEON individual ID.
These are used to find and create relationships. The problem with
relationships is that relationships are pairwise - we need a way to
reciprocally link a network, and organism ID would allow us to do that -
like the url link to the above IDs allows us to do that now within Arctos.
Can we mint DOIs or IGSNs?
|
re: "organisms, mint compliant ID": Don't half-bake this! - I want those for events, localities, agents, .... too. Seriously, Arctos is built to plug in to something like that. If we have a local identifier for something it's only because nobody else would do it for us.
re: "relationships are pairwise": Not really - there's always an implied second THING out there, but we don't have to be able to find it. "{whatever relationship of} ABC:XYZ:1234" is fine even if ABC:XYZ isn't online, "{whatever relationship of} NK 1" is fine even if 40 specimens (that we can find) wear "NK 1", etc.
re: "reciprocally": I don't think a lack of reciprocity will ever be Arctos' fault. I know many of your examples are not capable of acting as unique identifiers, and I suspect that's true of all of them.
re: "Can we mint DOIs": Yes, in limited quantities - there are "get a DOI" links scattered all over the place.
re: "IGSNs": Beats me - if they have a service and are willing to provide access we should be able to. We could also mint ARKs in unlimited quantities if there's a reason to do so. |
But we WANT to find it! 40 fish with "same lot as" requires 39 relationships on all 40 records and then I have no easy way to see them all in one place (or I just don't know how to do it). In the same way - 20 events of blood samples from Mexican wolf studbook number 1216 requires 19 relationships on 20 records (and a relationship needs to be added to ALL of them every time a new set of samples comes in!). It is a lot of work.... |
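The arithmetic behind that complaint, as a quick sketch. It assumes every record must carry a relationship to every other record in the set; the function names are made up for illustration.

```python
def reciprocal_relationship_rows(n_records: int) -> int:
    """Relationship entries needed when each record points at every other one."""
    return n_records * (n_records - 1)


def shared_id_rows(n_records: int) -> int:
    """Identifier entries needed when each record just carries one shared ID."""
    return n_records


def rows_added_for_new_record(n_existing: int) -> int:
    """Entries to add when one more record joins a fully linked set:
    n_existing on the new record plus one on each existing record."""
    return 2 * n_existing


for n in (20, 40):
    print(n, reciprocal_relationship_rows(n), shared_id_rows(n))
# 20 380 20
# 40 1560 40
print(rows_added_for_new_record(20))  # 40
```

With a shared ID, adding a 21st set of samples is one new entry instead of 40.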
We have litters of pups that are siblings of each other, offspring of two
parents, and parents of other litters. Each of these individual organisms
in turn may be handled multiple times over their lifetime resulting in
multiple catalog numbers of different accessions of parts, potentially at
different institutions. We need organism IDs to deal with the latter, and
relationships that can deal with the former.
|
That's an interface problem.
That MAY be an interface problem too - eg, MAYBE I could just magic in reciprocals instead of the email. Not much problem technically, but there are social implications.
That does occasionally happen, but more normal is a coyote, a beaver, 3 mice (all because the printer stuck), and all of their parasites (for reasons that don't make much sense to me).
There's an Issue somewhere about making inferences from relationships - also just a display problem.
Yea, there's some overlap that I don't think we can avoid. I think we need both anywhere we can - orgID is useless unless all of the bits are accessible, and relationships can't be used to find all the bits in places like GBIF. I'm not real happy with that, but I think it's reality. |
OK, I'll be there at noon. I'm not doing something right. Here is a record with two events: I created an observational record for the second event: and selected for both an 'Organism ID' identifier (manually entered the URL which I'm sure is not correct, but I didn't see a base URL in the code table). When I click on the Organism ID link, I get "Entity not found! Please let us know what happened." |
You didn't create one. https://handbook.arctosdb.org/documentation/entity.html I did this for you: nope not there so and now you have the bare minimum. The next step would (ideally - this is now functional) be to add the components. Then clicking "pull" and accepting whatever it says would add some discoverability. |
@Jegelewicz was amazing and added the office hours to the calendar. |
Thanks! I'm up for anything. I'll probably be more useful with some warning, I think we can/should prioritize if someone wants to schedule a topic, otherwise just see what happens? |
How about: |
From meeting:
Changes
Unresolved:
It's less-dynamic for now, not sure we have the CPU to pull everything in anyway. Looking forward, this needs to (theoretically) work for hundreds (zoo critters have a rough life) if not thousands (GPS collar, maybe) of components, which probably demands separate search results and 'details' views.
Needs further discussion:
Entities are but one option for Organism ID, and therefore the code is "Entity-centric." Organism ID can be exported from Entities to catalog records, but Entity ID cannot be exported/created from catalog records. I suggest that this is sufficient; Entities are "super objects" that only need exist when there's something additional to say. If the only goal is a common identifier for Organism ID, there are many options which do not involve Entities (bird banding lab numbers, for example). Entities are "better" identifiers, and making sure that they are in fact "better" requires a small amount of focus.
Yea But Anyway:
Consider something in SpecimenResults-->Manage-->Add All Records to {pick an entity}
Needs Clarification
re: "bird banding lab numbers, for example" above: There is confusion around this point; it needs to be clarified somewhere. A number may/should be used in multiple types, because those types convey different information and have different functionality. For example, to use a BBL number as an Organism ID, the following should be entered (assuming BBL was an OtherID Type in Arctos):
The BBL number supports "find records with a BBL number" (and perhaps value, but free-text fields aren't very good at that), and potentially (should BBL come online) can serve as a link to external resources or additional data. The Organism ID serves as an Organism ID; it's an identifier that spans multiple Occurrences and links them together as one THING. In this case that link is dependent on users being consistent (eg, not using
Entities (of type Organism) serve the same purpose; they're linking identifiers. They differ in two significant ways:
tl;dr: Any string can serve as Organism ID, but some can DO THINGS that others cannot.
Bulk Tools:
MSB's biopark data is recent and decent, but should have enough problems to be interesting. Try to make and "componentize" Entities from it, with a view towards developing bulk tools. (This may address any gaps left by the entity-centric approach described above.)
Reports:
Possibilities:
Rather than Export, we could write to the ID loader with status=autoload
"Reports" above has the same implications; we could save a few minutes by automating, which might then require much more than a few minutes to fix the giant mess which could result from a relatively minor error. @campmlc @Jegelewicz @ccicero what'd I miss/mangle? |
There's some new stuff in test, https://handbook.arctosdb.org/documentation/entity.html#the-process-v2 documents creating http://test.arctos.database.museum/entity/2 Questions:
|
Not sure if this will be helpful, but here are several references for BBL bands: https://www.usgs.gov/centers/eesc/science/about-federal-bird-bands?qt-science_center_objects=0#qt-science_center_objects BBL bands always have two sets of numbers XXXX-XXXX or XXXX-XXXXX. The first string relates to the size of the band, and the second string is in sequence numerically assigned to individual banders. I can't find a reference for the numeric codes for the different sizes, but I'm sure it exists somewhere. I could dig deeper if you need me to. |
Thanks. Nothing can really change how unresolvable strings work, but entities could serve as a place to gather identifiers - the Entity itself can hold all the variations that might be found in GBIF-n-such (BBL:XXXX-XXXX; BBL XXXX-XXXX; XXXX-XXXX, XXXXXXXX, etc., etc.) and that has some possibility of leading users to those records if they find the Arctos record. |
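As a rough illustration of gathering those variations, here is a sketch that folds the spellings listed above into one canonical form. It assumes only the two formats mentioned in this thread (a four-digit prefix followed by four or five digits); real band numbers may not all fit that, the function name is made up, and the digits in the example are invented.

```python
import re

# Optional "BBL" label (with or without a colon), a 4-digit prefix, an
# optional hyphen, then a 4- or 5-digit sequence. Assumed from the formats
# described in this thread, not from any BBL specification.
_BAND = re.compile(r"^(?:BBL\s*:?\s*)?(\d{4})-?(\d{4,5})$", re.IGNORECASE)


def normalize_band(raw: str):
    """Return 'XXXX-XXXX' or 'XXXX-XXXXX', or None if the string does not fit."""
    match = _BAND.match(raw.strip())
    if match is None:
        return None
    prefix, sequence = match.groups()
    return f"{prefix}-{sequence}"


variants = ["BBL:1234-5678", "BBL 1234-5678", "1234-5678", "12345678"]
print({normalize_band(v) for v in variants})  # {'1234-5678'}
```

This does not make the string resolvable; it only helps match records that already wear some spelling of the same number.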
Identification (taxon) |
Latest is in production, I rebuilt the two Entities I could, old data is in arctos-assets. |
Sorry I haven't worked on this - I've been busy cleaning ichnotaxa and part names..... |
I think we've all had our distractions lately! |
Yes, I can't wait to try these out. Any way to do a mass entity bulkload
for ABQ Biopark?
|
One problem with the "multiple events for a cataloged organism" model. This one, where NONE of the parts are associated with any one of the 12(!) events. I can tell you that at GBIF and iDigBio, each of the 12 occurrences includes all 28 parts, which is pretty misleading. Here is one of the GBIF occurrences: https://www.gbif.org/occurrence/1300283344 Also, ALL media are associated with ALL occurrences at GBIF, again misleading. This is sort of true at iDigBio as the "associated media" field links up with a search of media by the catalog number (at least I think that is what is happening), although this link has 9 results and there are 10 images at GBIF. How does this stuff look at GGBN? Interestingly enough, I was unable to find any Canis lupus baileyi at all through their search page! @campmlc you may want to follow up on why this is so. I did find Canis lupus baileyi x Canis familiaris. I notice that GGBN results include this:
Well if all of the samples for this "specimen" get narrowed down to just one vial of blood in search results, then people would be missing out on the "over time" component of the sampling. Not to mention the fact that there may be more than one kind of sample (hair, blood, serum). HOWEVER, there's this so what exactly is a "specimen"? If I were someone looking in on this, it just looks like a big pile of things and I don't have the time or inclination to sort it out amongst the 4 different resources (Arctos, GBIF, GGBN, iDigBio). The information for one cataloged item should really not look so incredibly different in all of these resources. Some of that is on the resources, but some of it is on us. Sorry for this, but I am looking into issues related to MaterialSample and as I was researching, I fell into this rabbit hole. I wanted to document it so when the time is right I can return to it. |
And extra infuriating is this. It looks like GGBN takes all of our individual "occurrences" and mashes them together. WHY do we have to split everything up for them? I don't understand how they couldn't take the data at GBIF and parse it into the separate "samples". https://www.gbif.org/occurrence/1229671489 And why in blazes are there only three samples when the "preparations" clearly show 6? AND the individual samples don't even show what they are, just "tissue". |
Great observations - we need a designated discussion on this. |
I'd really like you guys to look at some of your stuff in all the various portals and think about what is happening! |
Issue Documentation is http://handbook.arctosdb.org/how_to/How-to-Use-Issues-in-Arctos.html
Is your feature request related to a problem? Please describe.
We have been working with organisms for which we have multiple occurrences, specifically Mexican Wolves in the Mexican Wolf recovery program. Throughout their lives, samples of blood are taken from these animals and deposited in the genomic resources collection at MSB. Traditionally, each set of samples (all from the same day) has been given a single catalog number. This results in multiple cataloged items for a single organism, which we can link to each other using the “same individual as” relationship.
These relationships are nice, but they don't allow us to see ALL events for an individual in one place, and they require the addition of a new relationship for ALL related cataloged items every time a new collection of blood is made. Each cataloged item includes the other ID “Mexican Wolf Studbook Number” and we have modified the Other ID url so that clicking this other ID allows us to find all of the samples from any given animal.
This method works, but there is one issue we need to address.
When our data leaves Arctos and is ingested by aggregators such as GBIF and iDigBio, there is no easy way for anyone using the data there to make the connection that the various cataloged items are all from the same animal. Although the Mexican Wolf Studbook numbers are included in the list of related IDs, the connection just isn’t as tight as we would like it to be.
Describe the solution you'd like
Our proposed solution is to make use of the Darwin Core field “Organism ID”. We envision this as a separate and distinct other ID – one which provides a link to all related specimens (the results of that link would look just like the search result you see when you search one of the Mexican Wolf Studbook numbers):
This identifier would be passed to aggregators in the “Organism ID” field – allowing those using the data there to make the appropriate connection between the related cataloged items. Currently it appears that we are just passing the catalog item to that field
which is what led to the solution we have been attempting to make work in #1545. This has created problems with data entry and maintenance on our end. This new solution will allow us to keep events matched with parts and parts matched with accessions. It will simplify data entry and end the need for the links between events and parts.
We envision a new code table: CTCOLL_ORGANISM_ID set up very much like CTCOLL_OTHER_ID_TYPE where:
When the Organism ID is used, there would be no need for all of the “same organism as” relationships, but they could be used if a collection so desired. Every cataloged item that included an Organism ID would instead appear like this:
With the text “Mexican Wolf Studbook Number: 1216” being a link taking you to the search results:
We would hope that this link could also be what appears at the aggregators in their “Organism ID” field:
Describe alternatives you've considered
The major challenge we see with this method is how to assign unique Organism IDs for things where there isn’t an obvious one. The Mexican Wolves (and eventually the Red Wolves that are expected to come in from Arkansas) and NEON recaptures are examples of when we would be using this method. These all have obvious unique identifiers (studbook numbers and NEON sample ID numbers). However, when the skin and skeleton of an animal are at DMNS and the tissues for that same animal are at MSB, there is no obvious organism ID type and we would need to come up with one. We are open to suggestions for how best to accomplish this.
What have we missed?
Additional context
See above
Priority
I would like to have this resolved by date: soonish