component loaders (and data entry) #2974

dustymc · 2020-07-29T18:10:26Z

The "data entry extras" functionality isn't as good as it could be: loading large batches of various components (identifiers, parts, identifications, etc.) causes timeouts followed by problems and confusion, the "claim" process for managing data entered via "data entry extras" causes problems, etc. Let's fix it.

Very tentative suggestions, which may or may not hold up to reality:

Attach all bulkloaders to a scheduled task (like the specimen bulkloader)
Replace the "claim" functionality with an ability to change status in "not-yours" records (which would necessarily come with access to records created by users with whom you share collections)
Rebuild the loaders to resolve UUIDs without first fetching guid_prefix (UUIDs are unlikely to be replicated, so the second identifier isn't necessary)

A normal load would then be

load
validate (optional)
set status to something (probably "autoload")
check back later, find either
- nothing, because it loaded and cleaned up after itself
- errors, because you didn't validate
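
The load-then-check-back cycle above can be sketched roughly as follows. This is only an illustration in Python (Arctos itself is CFML); `StagedRecord`, `run_loader`, and the `load_one` callback are hypothetical names, not part of Arctos. It assumes that a record with status "autoload" is attempted on the next scheduled run, that a successful load removes the staged row, and that a failure writes the error into the status.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StagedRecord:
    """One row in a loader's staging table (hypothetical shape)."""
    key: int
    payload: dict
    status: Optional[str] = None  # None until the user sets a status


def run_loader(staging: List[StagedRecord],
               load_one: Callable[[dict], None]) -> List[StagedRecord]:
    """Process rows marked 'autoload'; return the rows that remain afterwards."""
    remaining = []
    for rec in staging:
        if rec.status != "autoload":
            remaining.append(rec)      # not released for loading; leave untouched
            continue
        try:
            load_one(rec.payload)      # validation happens as part of the load
        except ValueError as err:
            rec.status = f"error: {err}"
            remaining.append(rec)      # stays behind for the user to find and fix
        # on success the row is simply dropped (it "cleaned up after itself")
    return remaining
```

Checking back later then amounts to looking at whatever `run_loader` returned: an empty list means everything loaded, and any row with an "error: ..." status needs attention.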

"Approving" records loaded by you or your students/techs/associates, via any process including data entry extras, would be

set status
check back later

I think that would be a significant simplification in both the code and the user experience. "Manage your..." might come with a "pick users" option (a slight increase in complexity), but most of the rest of the complexity (claim, find guid, etc.) that's been introduced for various reasons could be removed.

This has some urgency: I'd like to use #727 as a proof of concept, so I'm adding scary labels and will interpret a lack of immediate objections as enthusiastic approval.


Jegelewicz · 2020-07-29T19:47:20Z

I'm up for trying this method. Anything we can do to simplify and make the process consistent across tools would be nice.

dustymc · 2020-07-30T20:21:34Z

The basics of this are running in test with bulkload identifications. I think it's worked out even better than anticipated, but timely feedback would be appreciated.

Replace the "claim" functionality with an ability to change status in "not-yours" records

The form is limited to manage_collection in order to safely (I hope!) accommodate this, and there's a new "shares collection" function which DOES NOT exclude users with locked accounts (so you can load things created by former techs & etc.).
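
As a rough illustration of the "shares collection" idea (not the actual Arctos function), the sketch below returns every other user who has at least one collection in common with the given user, and deliberately does not filter on the locked flag. The `Account` shape and the `shares_collection` name are assumptions made for this example.

```python
from typing import Dict, List, Set, TypedDict


class Account(TypedDict):
    collections: Set[str]
    locked: bool


def shares_collection(user: str, accounts: Dict[str, Account]) -> List[str]:
    """Return other users with at least one collection in common with `user`.

    Locked accounts are deliberately NOT filtered out, so records created by
    former techs remain reachable.
    """
    mine = accounts[user]["collections"]
    return sorted(
        name for name, acct in accounts.items()
        if name != user and mine & acct["collections"]
    )
```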

Rebuild the loaders to resolve UUIDs without first fetching guid_prefix

This is implemented and tested; it needs to be propagated to all other loaders.
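
A minimal sketch of resolving a record from a UUID alone, assuming a lookup from UUID to record across all collections. The `index` mapping and the `resolve_uuid` name are hypothetical, and this is Python for illustration, not the loader's CFML. The point is that a UUID either matches exactly one record or the row is rejected, so no collection prefix has to be supplied first.

```python
import uuid
from typing import Dict, List


def resolve_uuid(identifier: str, index: Dict[str, List[str]]) -> str:
    """Resolve a UUID to exactly one record, with no collection prefix needed.

    `index` maps canonical (lowercase, hyphenated) UUID strings to record
    identifiers; its shape is an assumption made for this sketch.
    """
    try:
        canonical = str(uuid.UUID(identifier.strip()))
    except ValueError:
        raise ValueError(f"not a valid UUID: {identifier!r}") from None
    matches = index.get(canonical, [])
    if len(matches) != 1:
        raise LookupError(f"expected 1 record for {canonical}, found {len(matches)}")
    return matches[0]
```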

"validate" is part of the load process; there's no pre-validation. (Having this as a separate step has been a source of confusion for some time, this process facilitates a much simpler go/nogo approach.)

Todo, pending nobody finding a reason to go in a different direction:

add component_loader and component_loader_notification to scheduler
unschedule autoload_extras and dataentry_extras_notification
test with Bulkloading identifications error #2936
rebuild all component-loaders to use this system
change "extras" notifications; need to report everything in loaders (it's all the same now) rather than attempting to pick out pieces. Use the new 'even if locked' function for this.
figure out if we can/should merge "unloaders" into the same system (probably and probably?)

Jegelewicz · 2020-07-30T20:24:32Z

Now I gotta dig up some stuff to load....

Jegelewicz · 2020-07-30T20:24:45Z

/remind me to work on this tomorrow

reminders · 2020-07-30T20:24:49Z

@Jegelewicz set a reminder for Jul 31st 2020

campmlc · 2020-07-30T20:49:06Z

Cool. I will try too.


reminders · 2020-07-31T09:06:24Z

👋 @Jegelewicz, work on this

dustymc · 2020-07-31T17:22:41Z

Another major point for this model: it makes replication easy, and there's now a testable locality loader. I'll stop until I get some feedback; I don't want to replicate any problems.

The loader scripts aren't scheduled; you can just open http://test.arctos.database.museum/ScheduledTasks/component_loader.cfm to process from the two new loaders.

Jegelewicz · 2020-08-03T14:17:09Z

When I follow that link - I get a white screen.

Jegelewicz · 2020-08-03T14:18:11Z

Let's go to Vegas!

dustymc · 2020-08-03T14:27:50Z

white screen.

Yea it's not very interactive - check back with the data, should be different. https://github.com/ArctosDB/internal/issues/65

Vegas

Sorry, I broke it!

Jegelewicz · 2020-08-03T17:19:17Z

OK, one more observation: when stuff won't load, it would help to get the error along with the CSV when you download to fix stuff.

So I was able to load 10 localities - none had coordinates - I'll see if I can find a couple that do to try.

Also, http://test.arctos.database.museum/ScheduledTasks/component_loader.cfm (to process from the two new loaders) needs to have some kind of interactivity: once you go there, you don't get out, and we need people to understand that they have accomplished something. That is, assuming this will be true in production.

dustymc · 2020-08-03T17:26:57Z

get the error

wilco

interactivity

That's just test - it'll be on the scheduler in production, loading (or errors) will just happen (including for any number of records).

Jegelewicz · 2020-08-03T20:51:05Z

Clarification: when I load a file directly to the tool and stuff passes all the triggers, does it just load, or will it always show up in the "manage" page first? Don't know why I can't decide what happens....

dustymc · 2020-08-03T21:17:30Z

You can load with status, and if you load with it as "autoload" then Arctos will take care of the rest (or make errors). If you follow the instructions and load from a fresh template then you'd need to set status (which gives you an opportunity to notice that you've just loaded 4582 duplicates...). How that's implemented and documented is a little waffly at the moment, but the potential for "stuff just happens" exists.

dustymc · 2020-08-04T15:19:01Z

This is in prod; need to integrate e.g. #2967 (comment) and rebuild all component-loaders under this umbrella.

Dropping priority.

dustymc · 2020-08-06T16:09:53Z

Need to check throttle; currently set for 10 records per run. It can be upped significantly but needs to be monitored as things are added.
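
To make the throttle concrete, here is a small Python sketch (not the scheduler's actual code) in which each scheduled run handles at most `limit` records and leaves the rest for the next run. `run_once` and its parameters are hypothetical names, and the sketch ignores records that fail and stay in the queue.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def run_once(queue: List[T], process: Callable[[T], None], limit: int = 10) -> List[T]:
    """Process at most `limit` items from the front of the queue; return the rest."""
    batch, rest = queue[:limit], queue[limit:]
    for item in batch:
        process(item)
    return rest
```

With a limit of 10, a batch of 25 records needs three runs to clear, which is why raising the limit matters as more loaders share the same schedule.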

ewommack · 2020-11-23T20:46:09Z

Hey Arctos - Reminder to try and test this by next Thursday!

dustymc · 2020-12-11T18:09:59Z

See #3300 - make sure status (which can be errors) is urlencoded when necessary
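
The encoding concern can be illustrated with Python's standard library (the Arctos forms are CFML, so this only shows the principle). The `manage_url` helper and the example base URL are made up for the sketch. A status that carries an error message may contain spaces, `&`, `=`, or `#`, all of which change the meaning of a query string unless they are encoded.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def manage_url(base: str, username: str, status: str) -> str:
    """Build a link whose query string safely carries an arbitrary status."""
    return f"{base}?{urlencode({'username': username, 'status': status})}"


status = "error: part not found & barcode=#123"
url = manage_url("https://example.org/loader", "someuser", status)
# Decoding the query string gives back the original status unchanged.
round_trip = parse_qs(urlsplit(url).query)["status"][0]
```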

dustymc · 2021-01-11T16:57:20Z

This has served its purpose, there's a template, it's awesome, closing.

campmlc · 2021-01-29T18:56:29Z

@gradyjt

dustymc · 2021-02-16T17:15:00Z

The next two tasks on my list (#2556, #2442) rely on this template. I can't seem to reconcile #3413 and the related AWG discussion. Do we love this or hate it? Can I keep building these things or do we need more discussion? Do I need to change something going forward? Do I need to change something with the ~dozen loaders I've already built under this model?

Jegelewicz · 2021-02-16T20:50:53Z

I think the tool is fine. It is just the related "documentation" that needs update, but others should weigh in.

campmlc · 2021-02-16T20:59:15Z

I think we could just add to the documentation, and I also suggested that rather than change the text on all the different component loaders, we change the main bulkloader to use the same terminology, e.g. mark to autoload vs mark to load etc - with a few sentences explanation on that form only.


dustymc · 2021-02-16T21:15:48Z

main bulkloader

If you mean the catalog record bulkloader, these are fundamentally different tools. The catalog record bulkloader is an independent tool - things in it load or error, that's it. "Component loaders" can have dependencies - things can hang around with 'autoload: ....' for weeks, then be processed after related data becomes available. That is, component loaders have three exits:

load worked (so delete)
load didn't work because data are a mess (so a person needs to become involved)
load didn't work because dependent data are MIA (so try again later, no humans required)
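
The three exits could be modelled along these lines. This is a hedged Python sketch, not how the loaders are actually written: `MissingDependency`, `Outcome`, and `attempt` are invented for illustration, and it assumes a load step can tell "the data are wrong" apart from "the related data are not there yet".

```python
from enum import Enum
from typing import Callable


class MissingDependency(Exception):
    """Related data the record depends on has not been loaded yet."""


class Outcome(Enum):
    LOADED = "delete the staged row"
    NEEDS_HUMAN = "leave the error for a person to fix"
    RETRY_LATER = "keep the autoload status and try again on a later run"


def attempt(load_one: Callable[[dict], None], payload: dict) -> Outcome:
    """Try one record and classify the result into one of the three exits."""
    try:
        load_one(payload)
    except MissingDependency:
        return Outcome.RETRY_LATER   # no humans required
    except ValueError:
        return Outcome.NEEDS_HUMAN   # data are a mess
    return Outcome.LOADED
```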

I would be in favor of changing the actionable value of loaded to "autoload" rather than NULL for the catalog record bulkloader, but that should be addressed in a new issue.

Jegelewicz · 2021-02-23T22:16:41Z

Maybe we need to think about some way to let people jump to a specific set of data in tools like https://arctos.database.museum/tools/BulkloadOtherId.cfm

Currently there is an extra-long list of errors in there and if my username was after this person alphabetically, I'd have to scroll forever to get to my stuff.

This is just one page of it

Maybe just a table at the top that lists the usernames and lets you jump to a specific user's stuff?

dustymc · 2021-02-23T22:32:43Z

See https://github.com/ArctosDB/data-migration/issues/450#issuecomment-784555912 - verbose errors cause a 400 from Lucee; need to POST or truncate errors or something.

Untested workaround: filter only on username, change status to something shorter.

@Jegelewicz

Jegelewicz · 2021-02-23T22:36:37Z

Thanks, that worked for my stuff...

dustymc · 2021-02-25T16:42:06Z

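v1.1: CSV download should include this to strip unnecessary columns:

<!--- start from every column in the user's query --->
<cfset flds=mine.columnlist>
<!--- remove columns the download does not need, if they are present --->
<cfif listfindnocase(flds,'key')>
	<cfset flds=listdeleteat(flds,listfindnocase(flds,'key'))>
</cfif>
<cfif listfindnocase(flds,'last_ts')>
	<cfset flds=listdeleteat(flds,listfindnocase(flds,'last_ts'))>
</cfif>
....
<!--- build the CSV from the remaining columns only --->
<cfset csv = util.QueryToCSV2(Query=mine,Fields=flds)>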

dustymc · 2021-02-25T17:02:30Z

Moved unfulfilled requests to #3463, closing (again).

dustymc added the Priority-Critical (Arctos is broken) Critical because it is breaking functionality. label Jul 29, 2020

dustymc added this to the Next Task milestone Jul 29, 2020

dustymc self-assigned this Jul 29, 2020

reminders bot added the reminder label Jul 30, 2020

reminders bot removed the reminder label Jul 31, 2020

dustymc mentioned this issue Aug 1, 2020

Locality and Locality Attribute Loader/Unloader #2967

Closed

dustymc added Priority-Normal (Not urgent) Normal because this needs to get done but not immediately. and removed Priority-Critical (Arctos is broken) Critical because it is breaking functionality. labels Aug 4, 2020

dustymc modified the milestones: Next Task, Active Development Aug 4, 2020

Jegelewicz added the NeedsDocumentation When the issue is resolved in Arctos repository, this should be moved to the Documentation-wiki repo label Aug 13, 2020

Jegelewicz self-assigned this Aug 13, 2020

dustymc mentioned this issue Aug 14, 2020

Can we get GUID as a column for the bulkload identifiers template #2292

Closed

This was referenced Nov 17, 2020

Make the Taxon Name Validator Tool Better #3101

Closed

One sided Other ID relationships #3233

Closed

This was referenced Nov 30, 2020

Cloning record to different collection? #3238

Closed

data service request: Download the container types from a list of barcodes #3261

Closed

This was referenced Dec 11, 2020

media bulkloader #3298

Closed

Bulkload Locality 400 error when deleting #3300

Closed

dustymc closed this as completed Jan 11, 2021

dustymc mentioned this issue Jan 22, 2021

Bulkload/Remove/Append specimen remarks #2556

Closed

dustymc mentioned this issue Feb 6, 2021

Bulkload part attributes fails without errors #3413

Closed

dustymc added the Component Loader Things involved in Round Five of the component loader discussions label Feb 10, 2021

dustymc reopened this Feb 16, 2021

dustymc mentioned this issue Feb 16, 2021

Citation Bulkloader template and documentation #2442

Closed

Jegelewicz mentioned this issue Feb 16, 2021

Change the actionable value of loaded to "autoload" rather than NULL for the catalog record bulkloader #3436

Closed

dustymc mentioned this issue Feb 25, 2021

Component Loaders: v1.1 #3463

Closed

dustymc closed this as completed Feb 25, 2021

dustymc mentioned this issue Jul 28, 2021

Documentation needed - component loaders ArctosDB/documentation-wiki#225

Open

dustymc mentioned this issue Aug 1, 2022

remove the old data entry screen #4257

Closed
