taxonomy relationships #735
Comments
Possible improvement: move ALL relationships to classifications, something like what we get from GlobalNames. Example: http://arctos.database.museum/name/Arhopalus%20cervinus#ITIS taxon name: Arhopalus cervinus

This is "correct" from a data standpoint. Our current data (http://arctos.database.museum/name/Echidna%20russellii) roughly assert that "Echidna (all uses) is a bad spelling of Bitis (vipers)," which isn't correct; Echidna remains a "good" name for eels and a "bad" synonym for some other stuff (pointy mammals, moths).

We would lose the precision available under http://arctos.database.museum/info/ctDocumentation.cfm?table=CTTAXON_TERM, BUT we know many of those data are garbage anyway (see email "backwards synonyms" @DerekSikes 1 Apr 2016), and many of them intentionally avoid precision (e.g., DLM uses "synonym of" to mean "sameish thing" with no ICxN intentions). I see little evidence that we're capable of usefully maintaining those data, and see no way of determining what's trustworthy.

It is currently very easy to delete classifications; it should probably be more difficult (require confirmation or something similar) to delete a "synonym-bearing" classification. It's not exactly clear how we'll avoid synonym-bearing classifications in things like updating FLAT; all code dealing with "the collection's classification" would need to be reviewed. All "any taxa" queries would need to be rewritten, but performance should improve (we'd need to tune only one thing, albeit one very large thing).

Question (possibly for GlobalNames): under e.g. http://arctos.database.museum/name/Echidna%20russellii#CatalogueofLife (and many other examples), the query was for a "species" (binomial) and various sources return a monomial (genus). What exactly is the assertion? |
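To make the proposal concrete, here is a minimal sketch of what "relationships live on classifications" could look like. The `Classification` structure, the source labels ("Source A", "Source B"), and the informal `group` field are all hypothetical and are not Arctos structures; the Echidna values only mirror the example in the comment above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Classification:
    """One source's treatment of a name (hypothetical structure)."""
    source: str                        # hypothetical source label
    taxon_name: str
    group: str                         # informal group, for illustration only
    synonym_of: Optional[str] = None   # relationship scoped to this classification


# Illustrative rows only; not real Arctos data.
CLASSIFICATIONS = [
    Classification("Source A", "Echidna", "eels"),
    Classification("Source B", "Echidna", "vipers", synonym_of="Bitis"),
]


def synonyms_for(name, classifications):
    """Return (source, group, accepted name) for each synonym-bearing classification."""
    return [
        (c.source, c.group, c.synonym_of)
        for c in classifications
        if c.taxon_name == name and c.synonym_of is not None
    ]


def accepted_uses(name, classifications):
    """Return (source, group) for classifications that treat the name as accepted."""
    return [
        (c.source, c.group)
        for c in classifications
        if c.taxon_name == name and c.synonym_of is None
    ]
```

Under this shape, the synonymy is asserted only within the classification that carries it, so the name itself is not marked "bad" across all of its uses.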
Just curious whether we've considered assigning a number to each use of a taxon name, the same way we do to a locality. Could those specific numbers then be used in each classification (and in search, etc.)? Would that keep them straight and link the correct ones? Numbers would be unique. Names aren't, and adding the author doesn't seem to be a huge improvement overall.
I find this happens frequently. If the species isn't found in these sources, the result is returned only to the genus level. Also, if the species is invalid, WoRMS returns the valid species. Not sure if this will happen in WoRMS (via Arctos) or not. |
We do - names have taxon_name_id and classifications have classification_id.
...and just like localities, the ID isn't stable - they get replaced rather than updated when it's convenient, etc. Localities have 'locality_name' which IS stable - easy enough to add that to something like classifications, but (like locality_name) that would affect how the data may be managed. |
Taxon IDs sound like a promising approach for dealing with the issue of linking a name to an authority, date, and classification.
|
taxon_name_id uniquely identifies NAMES. Names also uniquely identify names - we have a unique index. Classification_id uniquely identifies classifications. We replace those every time we clone-edit-delete instead of editing, or use the classification bulkloader. |
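A small sketch of why clone-edit-delete breaks identifier stability. The two tables below are a simplified stand-in, not the Arctos schema: only `taxon_name_id` and `classification_id` come from this thread, while `scientific_name`, `source`, `family`, the integer key type, and the placeholder values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE taxon_name (
    taxon_name_id   INTEGER PRIMARY KEY,
    scientific_name TEXT NOT NULL UNIQUE      -- the name itself is also unique
);
CREATE TABLE classification (
    classification_id INTEGER PRIMARY KEY AUTOINCREMENT,
    taxon_name_id     INTEGER NOT NULL REFERENCES taxon_name(taxon_name_id),
    source            TEXT NOT NULL,          -- hypothetical column
    family            TEXT                    -- hypothetical column
);
""")


def clone_edit_delete(conn, classification_id, **changes):
    """Copy a classification with edits, delete the original, return the new ID."""
    row = conn.execute(
        "SELECT taxon_name_id, source, family FROM classification "
        "WHERE classification_id = ?",
        (classification_id,),
    ).fetchone()
    values = dict(zip(("taxon_name_id", "source", "family"), row))
    values.update(changes)
    cur = conn.execute(
        "INSERT INTO classification (taxon_name_id, source, family) "
        "VALUES (:taxon_name_id, :source, :family)",
        values,
    )
    conn.execute(
        "DELETE FROM classification WHERE classification_id = ?",
        (classification_id,),
    )
    return cur.lastrowid


# Placeholder data: one name, one classification, then one "edit".
conn.execute("INSERT INTO taxon_name (scientific_name) VALUES (?)", ("Echidna russellii",))
old_id = conn.execute(
    "INSERT INTO classification (taxon_name_id, source, family) "
    "VALUES (1, 'Example Source', 'Placeholder A')"
).lastrowid
new_id = clone_edit_delete(conn, old_id, family="Placeholder B")
```

After the "edit", `taxon_name_id` is unchanged but `new_id` differs from `old_id`, so anything that stored the old `classification_id` now points at nothing.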
Is it possible to have a stable classification (e.g. classification+taxon name) "name" or ID?
|
Sure - we just don't allow them to change. "Don't allow certain data to change" seems like a critical component of managing taxon concepts anyway. I don't think that's any sort of deal-breaker, but it's absolutely a big change in how we view and manage classification data.

We currently treat taxon names as "data" - e.g., you can't change them once they're used. Classifications are treated like "metadata" - you can delete them or replace them (to make family consistent, or because it's easier than editing, or because someone left some junk behind, or whatever).

Moving to taxon concepts - even if the "concept" is just name+name-author+year - would elevate classifications to "data" - they'd become things you pick (presumably for reasons) rather than things you inherit (e.g., from collection preferences). Allowing you to pick specific "concepts" and allowing those concepts to arbitrarily change would be pointless, so we'd have to lock some things down. Keeping an identifier stable in that context should not be a problem. |
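One way to picture "lock some things down" is an immutable record whose identifier follows from its locked fields. This is only a sketch under assumptions: `TaxonConcept` is a hypothetical type, deriving the ID with `uuid.uuid5` is one possible choice rather than anything Arctos does, and "Aus bus", "Author", and 1900 are placeholder values.

```python
import uuid
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class TaxonConcept:
    """A 'concept' reduced to name + name-author + year (hypothetical type)."""
    name: str
    author: str
    year: int

    @property
    def concept_id(self) -> str:
        # Deterministic ID derived from the locked fields; the namespace
        # (NAMESPACE_URL) is an arbitrary choice for this sketch.
        key = f"{self.name}|{self.author}|{self.year}"
        return str(uuid.uuid5(uuid.NAMESPACE_URL, key))


def is_locked(concept: TaxonConcept) -> bool:
    """Return True if attempting to change the concept is rejected."""
    try:
        concept.year = concept.year + 1   # frozen dataclasses refuse assignment
    except FrozenInstanceError:
        return True
    return False


concept = TaxonConcept("Aus bus", "Author", 1900)   # placeholder values
```

Because the fields cannot change after creation, the same name+author+year always yields the same `concept_id`, and a different year is simply a different concept with a different ID.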
That sounds like a very promising approach to solving some of our issues
with choosing particular name+classification combos for a given collection
or specimen, and dealing with homonyms?
|
Agree. I have been wondering how the current model of "name as data and classification as metadata" came about. It seems like we are creating a lot of our own problems with the two layers of identification. What would we need to do to transition to such a model? And what am I missing about the current model that makes it more useful/appropriate? |
normalization
In that model (as I see it), normalization is even more critical. The only significant structural change would be identification_taxonomy.taxon_name_id becoming identification_taxonomy.classification_id. (That sort of modularity is another benefit of normalization.) That should just leave the usability issues to deal with. |
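The structural change described above can be sketched with a toy schema. Only `identification_taxonomy`, `taxon_name_id`, and `classification_id` are names from this thread; every other column, the integer keys, and the sample rows are assumptions, and the real tables are certainly more involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE taxon_name (
    taxon_name_id   INTEGER PRIMARY KEY,
    scientific_name TEXT NOT NULL UNIQUE
);
CREATE TABLE classification (
    classification_id INTEGER PRIMARY KEY,
    taxon_name_id     INTEGER NOT NULL REFERENCES taxon_name(taxon_name_id),
    source            TEXT NOT NULL               -- hypothetical column
);
-- The identification points at a classification instead of a bare name.
CREATE TABLE identification_taxonomy (
    identification_id INTEGER NOT NULL,
    classification_id INTEGER NOT NULL REFERENCES classification(classification_id)
);
INSERT INTO taxon_name VALUES (1, 'Echidna');
INSERT INTO classification VALUES (1, 1, 'Source A'), (2, 1, 'Source B');
INSERT INTO identification_taxonomy VALUES (100, 2);
""")


def name_and_source(conn, identification_id):
    """Resolve an identification to its name via the chosen classification."""
    return conn.execute(
        "SELECT n.scientific_name, c.source "
        "FROM identification_taxonomy it "
        "JOIN classification c ON c.classification_id = it.classification_id "
        "JOIN taxon_name n ON n.taxon_name_id = c.taxon_name_id "
        "WHERE it.identification_id = ?",
        (identification_id,),
    ).fetchone()


def try_delete_classification(conn, classification_id):
    """Attempt a delete; return False if an identification still uses the row."""
    try:
        conn.execute(
            "DELETE FROM classification WHERE classification_id = ?",
            (classification_id,),
        )
    except sqlite3.IntegrityError:
        conn.rollback()
        return False
    conn.commit()
    return True
```

The name is still reachable through a join, so normalization is preserved, and the foreign key makes a classification that is in use behave like "data": it cannot be deleted out from under an identification.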
Closing to consolidate issues; see #1136 |
Original issue reported on code.google.com by
dust...@gmail.com
on 13 Jul 2015 at 8:54