Quality control workflows #2936

Merged
saracarl merged 43 commits into development from 2129-quality-control
Mar 17, 2022

Conversation

benwbrum
Owner

No description provided.

@benwbrum benwbrum linked an issue Jan 13, 2022 that may be closed by this pull request
@coveralls

coveralls commented Jan 13, 2022

Coverage Status

Coverage decreased (-0.4%) to 79.542% when pulling c31a107 on 2129-quality-control into 219c85e on development.

@saracarl
Collaborator

saracarl commented Mar 3, 2022

Failing tests when deleting a collection due to:
NameError at /collection/delete uninitialized constant Collection::QualitySamplings

@saracarl
Collaborator

saracarl commented Mar 3, 2022

Testing:

  • Now when you start at a sampling page and click the "sample" button you get a No route matches {:action=>"display_page", :collection_id=>"indiana-wwi-service-record-cards", :controller=>"transcribe", :page_id=>"229601", :quality_sampling_id=>"11", :user_slug=>"indianaarchives"}, missing required keys: [:work_id] around line 34 of the reviewer_breadcrumbs partial
  • The quality sample created was all 140 pages needing review.
    • This happens when the last_editor_user_id has not been set for the collection. Run rake fromthepage:update_last_editor[530,530] (where 530 is replaced with the ID of the collection you are working with), delete all quality samplings, then try again.
  • The review dashboard shows "0 works to review" but "135 pages to review" -- how is that possible?
    • It turns out that the review dashboard works off of work.work_statistics while the rest of the logic looks for works with pages needing review. It's possible that these were off because of the sample data--we certainly did not update work statistics after manually setting pages as needing review in the console, so they would not match. Regardless, I made the review dashboard use the same logic as the other -- "works to review" should find the works with pages needing review.
  • When you get to the end of a review set, we take you back to the collection and show an error "Couldn't find Page without an ID" (in red as an alert across the top of the page).
    • fixed
    • I'm going to trust you, because I don't have a small enough review set to test this right now.
  • "Start Transcribing" button on a collection with pages to transcribe leads to:
    ActionController::UrlGenerationError at /indianaarchives/jeffersonville-land-office-receipts/jeffersonville-land-office-book-9-receipts-12447-to-12996/transcribe/32182134 No route matches {:action=>"display_page", :collection_id=>"jeffersonville-land-office-receipts", :controller=>"transcribe", :page_id=>"32182134", :user_slug=>"indianaarchives", :work_id=>"jeffersonville-land-office-book-9-receipts-12447-to-12996"}, missing required keys: [:quality_sampling_id] -- display_page view, line 3
    • fixed
  • Quality Sampling breadcrumbs are now gone.
  • "Quality Sampling" breadcrumb in the review flow doesn't take me back to the QS; instead goes to the collection. (Also, please check that there is a message -- there isn't one on my system, but I don't trust i18n status on my computer.)
    • I've changed some of the logic here; please see if that addresses this.
  • I was hoping that 5 pages reviewed would be enough for me to have data in the QS dashboard. It isn't. What is the threshold there?
    • I've substantially changed the flow of sampling, so that the order in which you sample pages is the same as the order in which they appear in the listing. I think this should address this as well.
  • If I start a sampling, review anything in the sample (which in this case is everything that needed review in the collection), and go back to the review tab, I see "start sampling" again. I'm not sure what I should see, but not that. Maybe "everything has been reviewed"?
    • I've rewritten the action to check if a sampling exists. If it does, it takes you to that quality sampling, so you should never see the Start Sampling button more than once per collection.
  • The review dashboard for the WWI service cards took a very long time to load, mostly here: ↳ app/models/collection.rb:88:in `never_reviewed_users'. The dashboard then showed "zero pages to review". Maybe we should check for no pages to review before we "do stuff"? We should also improve the performance here.
  • Hide the "review" tab if a collection is inactive.
  • There's no way to get back to the main review dashboard once you're in the sampling list.
    • I fixed these breadcrumbs (I hope)
  • We should double-check the behavior when you are in the review flow, correct a page, and "save" instead of "approve" (or uncheck in Indiana's case). It moved on, but I'm not sure what the state of that page was.
    • I just verified that it does not advance to the next page until the current page is approved
  • If you only do ~5 pages, every result is "high" or "very high" -- desired or not?
  • For this text: "Quality sampling allows you to spot-check contributions, gathering data about quality as you review each contribution." can we add "At this point you have X sample pages left to review. This number may grow as more pages are transcribed."
  • Remove the "help text" on this page http://localhost:3000/indianaarchives/indiana-wwi-service-record-cards/quality_samplings
  • "Start Sampling" should change to "Continue Sampling" if you've already started. (related to no 7) Similarly, I seem to create a new QS every time I "Start sampling"
    • Fixed by change to index action above.
  • Improve the Reviewer dashboard layout (waiting on Nick, but in the meantime...)
  • If you "save" a page in the review flow (after making a change), it takes you to the next page, but that page is probably not approved. Actually, further examination seems to show that it is, so either we remained on the page to move it to completion or it was reviewed. Will check on the next review. This is actually OK.
  • The values displayed in the user listing and work listings don't make any sense under the new, whole-collection logic. We should discuss this, but I'm thinking something like the following:
    • for users:
      • total_page_count should be all pages the user has worked on (or just been last editor on?) -- last editor on. Needs to match.
      • approval_delta should be the total/avg for the pages the user has been last_editor on. (note that we may need to calculate that average completely differently than we do now) -- I think that pages have approval deltas, but users should have a "quality score". It's easier to understand.
      • corrected_page_count should be the count of pages that had an approval_delta > 0 -- yes, but we're missing the "good page count" concept. Not sure where that goes. Implied with "reviewed page count", but we're getting rid of that.
      • replace reviewed page count with count of pages (still) needing review. -- good.
    • for works:
      • total page count should be the total pages in the work -- yes
      • approval delta should be the total/average for the pages in the work that have one and are in a completed state (since reviewed pages can be opened again) -- again, let's call this "quality score"
      • corrected_page_count should be pages in completed state with approval delta > 0 -- again, we should also represent "good page count"
      • replace reviewed_page_count with pages needing review.
  • The "approve all" button on the transcriber page would only be visible to project owners
  • Pseudonymize usernames for everyone but project owners
  • make lists sortable by quality and quantity
  • alert in 24 hour owner report for "prolific new user" (isn't this obvious?)
  • new transcribers should be sorted by quantity & recency
  • fix breadcrumbs
  • make the table values correct & test them
  • migration script rake task should retroactively calculate approval deltas
  • investigate silenced bug in list of works (bad record)
  • recalculate sample set over time
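
To make the proposed user and work listing values concrete, here is a minimal plain-Ruby sketch of the counts discussed above. It is not the code in this PR: the `Page` struct, the `:completed` and `:needs_review` status symbols, and the `good_page_count` and `average_approval_delta` names are all hypothetical stand-ins. It also assumes the "completed state" filter applies to users as well as works, which the list above only states for works.

```ruby
# Hypothetical stand-in for the Page model; it carries only the fields
# discussed in the list above. Status symbols are assumptions.
Page = Struct.new(:work, :last_editor, :status, :approval_delta, keyword_init: true)

# Summarize one group of pages (all pages for a user, or all pages in a work).
def summarize(pages)
  # Only completed pages that actually have an approval delta are scored,
  # since reviewed pages can be reopened.
  scored = pages.select { |p| p.status == :completed && !p.approval_delta.nil? }
  corrected = scored.count { |p| p.approval_delta > 0 }

  {
    total_page_count: pages.size,
    needs_review_count: pages.count { |p| p.status == :needs_review },
    corrected_page_count: corrected,
    # "Good" pages: scored pages approved without any correction.
    good_page_count: scored.size - corrected,
    # nil when nothing has been scored yet, to avoid dividing by zero.
    average_approval_delta: scored.empty? ? nil : scored.sum(&:approval_delta) / scored.size.to_f
  }
end

# Users are credited with the pages they were last editor on.
def stats_by_user(pages)
  pages.group_by(&:last_editor).transform_values { |group| summarize(group) }
end

# Works are summarized over all of their pages.
def stats_by_work(pages)
  pages.group_by(&:work).transform_values { |group| summarize(group) }
end
```

How the average delta should be turned into a user-facing "quality score" is still open; this sketch only exposes the raw average so that the listing values can be checked against hand-counted pages.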

@saracarl saracarl merged commit f90685c into development Mar 17, 2022
sylvieed pushed a commit that referenced this pull request Jul 14, 2022
Development

Successfully merging this pull request may close these issues.

Quality Control
4 participants