Taxa without classification - follow-up to #1761 and #1936 #2098
We could definitely use some consolidation/cleanup. #1761 looks like it can probably be closed. We've got the easy stuff, I don't think anyone will (or should) go look up kingdom for 140K names. That's out of date anyway - #1641 (comment) made another 290,861 'bare' names.... If #1936 was not a one-time thing, then it needs to be a new issue. And FWIW I'd probably oppose simply replicating data across Source borders - we did what we could for existing specimens, someone intending to catalog stuff in one source can bring data over from another (or request that be done), and continually flinging garbage around - #2074 (comment) - without some solid reason to do so just doesn't seem like a good idea to me. is actively looking for the gaps that matter - those with specimens. What exactly are you wanting me to do with the attached? And please resend as CSV - I'm not anxious to find new ways Excel can mangle data....
Specifically, could you refresh the taxon names on the CSV? They have aphiaIDs and on the few that I checked, they just needed to be refreshed. They were on a list of taxa without classification. There may be a lot more. I only checked 1000 (the WoRMS limit for a match) and found these. Is there any way to check for WoRMS (via Arctos) taxa that need refreshing? Also in the Dashboard, issue #1894, how would I adjust the SQL to get all the WoRMS (via Arctos) names that lack a higher classification regardless of whether or not DMNS:Inv is using them? That might turn up more that need refreshing.
I'm still lost. A bunch of the aphiaids don't exist - there's nothing to refresh. Where did these data come from??
I'm not sure what this means. There were obviously a few problems with the initial import, or maybe some stuff changed in WoRMS before we were monitoring. Now changes in WoRMS should just appear in Arctos.
That's not really possible - if they're "WoRMS (via Arctos)" then they have a classification. If you mean some specific missing rank or something I can get that, but I'm not sure if that would be useful??
These came out of a file from months ago of taxa without a kingdom. I took 999 of them and ran them against the WoRMS match and got matches on 670 or so of them - not all perfect matches but something. So today I started to look at them in WoRMS (via Arctos). For example:

Example #1 - Row 634 - Acalia erythraea

Acalia erythraea | 368961 | exact | 378679 | 9718 | alternate representation | Acalia erythraea | Linckia (Acalia) erythraea | Animalia | Echinodermata | Asteroidea | Valvatida | Ophidiasteridae | Acalia | | erythraea |

In WoRMS (via Arctos), this is what I find for Acalia erythraea. So there's no classification right now, but if I refresh I get this.

Example #2 - Row 647 - Craticula submolesta

Craticula submolesta | 617377 | exact | 661379 | 44002 | alternate representation | Craticula submolesta | Navicula submolesta | Chromista | Ochrophyta | Bacillariophyceae | Naviculales | Stauroneidaceae | Craticula | | submolesta |

This one doesn't have a WoRMS (via Arctos) or an Arctos entry but it does have a World Register of Marine Species entry, so I'm not sure why we didn't get it.

Example #3 - Row 487 - Satiellina jamairiensis

Satiellina jamairiensis | 795271 | exact | 795271 | 0 | accepted | Satiellina jamairiensis | Satiellina jamairiensis | Animalia | Arthropoda | Ostracoda | Palaeocopida | Satiellina | | jamairiensis |

As I found it

After refreshing

Does that help? Rather than manually going through these 600+ taxa, I'm wondering if you can refresh them.
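For anyone triaging rows like the three examples in this comment, here is a minimal Python sketch that splits one pipe-delimited match row into fields and flags rows whose matched aphiaID differs from the valid one. The column meanings are an assumption inferred from those examples (name, matched aphiaID, match type, valid aphiaID, a fifth value that in all three examples equals the difference between the two IDs, status, matched name, accepted name, then the ranks); the real headers of the attached spreadsheet are not shown in this thread. `MatchRow`, `parse_match_row`, and `points_elsewhere` are hypothetical names, and nothing here talks to WoRMS or Arctos.

```python
from dataclasses import dataclass


@dataclass
class MatchRow:
    # Field names are assumptions inferred from the example rows, not real headers.
    name: str
    aphia_id: int
    match_type: str
    valid_aphia_id: int
    status: str
    accepted_name: str
    ranks: list[str]


def parse_match_row(line: str) -> MatchRow:
    """Split one pipe-delimited match row into a MatchRow."""
    cells = [cell.strip() for cell in line.split("|")]
    # The example rows end with a delimiter, which leaves one empty trailing cell.
    if cells and cells[-1] == "":
        cells = cells[:-1]
    if len(cells) < 8:
        raise ValueError(f"expected at least 8 cells, got {len(cells)}")
    return MatchRow(
        name=cells[0],
        aphia_id=int(cells[1]),
        match_type=cells[2],
        valid_aphia_id=int(cells[3]),
        # cells[4] is skipped: its meaning is not stated in the thread.
        status=cells[5],
        accepted_name=cells[7],
        ranks=cells[8:],
    )


def points_elsewhere(row: MatchRow) -> bool:
    """True when the matched aphiaID is not the valid (accepted) aphiaID."""
    return row.aphia_id != row.valid_aphia_id


if __name__ == "__main__":
    example = (
        "Acalia erythraea | 368961 | exact | 378679 | 9718 | "
        "alternate representation | Acalia erythraea | "
        "Linckia (Acalia) erythraea | Animalia | Echinodermata | Asteroidea | "
        "Valvatida | Ophidiasteridae | Acalia | | erythraea |"
    )
    row = parse_match_row(example)
    print(row.name, row.status, points_elsewhere(row))
```

The rank cells are kept as a plain list rather than named fields because the third example has one fewer cell than the other two, so the position of each rank cannot be relied on.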
Can you tell me which ones don't exist? I've only checked about five so far. It would seem that they should all get refreshed with your regular updates but somehow they aren't and perhaps others that show as "no kingdom" have a similar problem. I'm not using any of these; just trying to keep WoRMS (via Arctos) complete and tidy.
Dusty, While we're working on the WoRMS Source, is the subgenus the reason this entry wasn't uploaded to WoRMS (via Arctos)? I just added it and refreshed via the aphiaID but it wasn't there before. https://arctos.database.museum/name/Schistoloma%20alta#WoRMSviaArctos It's invalid so I won't use it but wanted to link it to the valid species. |
Yes, I remember that we can't have it both with and without the subgenus as they are the same species. I've deleted the subgenus row and the subgenus from the species name Schistoloma alta, but won't a refresh put it back in again since it doesn't exist in WoRMS without the subgenus? And it didn't download at all (with or without the subgenus) which I think means that everything in WoRMS that exists only with a subgenus isn't in WoRMS (via Arctos). Correct? This particular one is invalid but without it, I didn't have the usual link to the preferred name. In general, what should I do? Do any of these approaches mess up the taxon record and the taxon name search functionality?
That's not necessary - I don't (much) care what's in CLASSIFICATIONS, I just want clean names. I refreshed, it does in fact bring the subgenus back in. You can just remove the aphiaID (=remove the link to WoRMS) if you don't want that - otherwise someone changing something on WoRMS will cause a refresh in Arctos. We didn't automagically find names with subgenera - there's no namestring to share, that's one of the many problems with 'traditional' taxonomy. Once the aphiaID is in Arctos there's a link and it will pull from it no matter what's on the other end - you could update https://arctos.database.museum/name/Schistoloma%20alta#WoRMSviaArctos to use http://www.marinespecies.org/aphia.php?p=taxdetails&id=255011 if you want (but please don't!). You can't mess up the "any taxon" search - it hits the name itself (why I need clean names), and the WoRMS classification (now) contains Schistoloma (Schistoloma) alta so that'll find specimens as well. You can definitely make your data (the 'family' and etc. search fields) inaccessible by providing inconsistent data, but those are more or less inconsistent by definition, and this one doesn't seem like something you'd use in an accepted ID anyway.
Back to the original purpose of this issue. There are still taxa in WoRMS (via Arctos) without a classification. It would seem that one advantage of using that Source would be that everything would have a classification unless it's a (non-WoRMS) taxon that I manually added without a classification. How would I modify the SQL in Issue #1894 to get every taxon name in WoRMS (via Arctos) that is lacking higher classification, or can you run that list?

I've cleaned up everything that DMNS:Inv is using, but I have another list (I think culled from the list of all taxa missing higher classification) and a lot of them still need to be refreshed. I don't understand the problem since I think your system refreshes continuously.

Today, I already refreshed Lanistes olivaceus var. ambiguus, Incertipoma virile, Incertipoma subglobosum. I did have to refresh more than once to get the accurate entry. At first, it put in the accepted name rather than the taxon name but after three refreshes it shows the unaccepted name plus the preferred name. Does it matter that they aren't accepted names?

Also, some of them don't have a WoRMS (via Arctos) entry even though there is a World Register of Marine Species entry. I added an entry for Pseudomalaxis roddai and Eutudorops. Both were unaccepted. Does that matter? By adding them, I now have the link to the preferred taxa too.

After refresh

As examples, here are five entries that I left with the following problems:

(unaccepted) Bellerophina minuta | aphiaID 747778 | needs refresh | accepted as 584856
"Taxa in WoRMS (via Arctos) without a classification" isn't possible - the 'WoRMS (via Arctos)' bit IS a classification. I can help with this, but I don't know what you mean.
No, I refresh when WoRMS tells me they've changed something. Maybe we should refresh everything and see what that does.
Is there any possibility your browser is caching (or haunted, or ...) and is causing these problems? That does not resemble anything I've seen from WoRMS, I can't imagine how it could be possible from the Arctos side, but it sounds a LOT like Chrome hanging on to data that it just figures is close enough to what you're looking for.
Absolutely never to me - names are names, 'accepted' is just another bit of metadata.
Yep, and
should still get data entry people where they need to be if we add 500 more relationships (so the actual links become muddied).
I clicked the first couple - it just worked, like it always does.
If it somehow fell through the initial import cracks, it's only going to find its way to Arctos if it changes in WoRMS. I think we also have access to their periodic dump, which we might use to find some of this. Some prioritization would be very helpful to me. I think these are all theoretical problems at this point. Solving them could turn into a full-time job. Unless directed otherwise, I'm probably going to wander back towards locality-land as soon as I can....
Yes, I think we need to refresh everything and see if that adds the higher classification to these names. I switched from Chrome to Firefox and went to the next one on my list: Conus minimus var. condoriana and found the name without any higher classification in WoRMS (via Arctos). After refresh, it's all there. Wouldn't this taxon name have been included in the list of names without a kingdom? Also, if we (or any user) were to enter a new specimen and use this taxon and Source, again, there would be no family, kingdom, etc. attached to the name. So, yes, can you refresh all the WoRMS (via Arctos) taxa (without overwhelming the rest of the Arctos users) and let's see if that resolves this issue. It's Medium priority. Right now, I've cleaned up every name we're using, but I have no idea what taxon name we'll need to use tomorrow.
Can we close this?
Unfortunately, there still seem to be WoRMS (via Arctos) taxa that need to be refreshed. I opened the next mollusca taxa on the list this morning to see if the problem had been resolved.

After refresh

So somehow we're getting taxon names in WoRMS (via Arctos) and the aphiaID but we're not getting the classification. Here are the next names on my list if you want to see the problem. I have not refreshed them. They are all invalid but Dusty said that's just a bit of metadata and doesn't impact the refresh process. Also, all of them (and the others that I refreshed today) have an Arctos Relationship entry that shows the valid (preferred) term and no Arctos entry, if that matters.

Paludestrina olssoni

And it's not just mollusca. See, for example: Styllaria borealis and Diatoma fasciatum.

As you'll note from the initial entries, all I did was check 1000 taxon names from our "no higher classification" list to be sure none of them had WoRMS entries and when I found WoRMS aphiaIDs I started to open them and found the unrefreshed entries. So there seems to be a bug somewhere and, while it hasn't impacted me yet, I'm sure we want to squash it.
I don't think this is a bug, I think it's just a weird legacy of the initial import. If something changes at WoRMS it'll auto-refresh, and anyone can manually refresh at any time. I'm still up for refreshing everything, either once or on some schedule, but that needs more attention than I feel like I can give it at the moment, especially if it's not breaking anything important.
A refresh can certainly wait as it hasn't impacted us so far. But it may be contributing to the count of taxon names without classifications, so we should be aware of it.
Issue #1761 hasn't been updated recently and #1936 has been closed, so it may be time to start a new and possibly smaller issue.
In #1761 I noted that I had run an arbitrary 999 of the taxon names without classification against WoRMS and found 672 of them had an aphiaID. Today I looked at four of them and found I only needed to refresh the aphiaID to get the complete classification in WoRMS (via Arctos). None of these had an Arctos classification though they did have an Arctos Relationship classification. So, as requested in #1936 we'd like to clone the WoRMS (via Arctos) into an Arctos classification.
Dusty, can you refresh everything in WoRMS (via Arctos) that needs it rather than me manually doing these 672 taxon names? Same for creating the Arctos classification.
Maybe this will make some dent in the taxa without classifications. Thanks.
(https://github.com/ArctosDB/arctos/files/3228219/WoRMS.match.of.999.arbitrary.taxa.without.kingdom.xlsx)