TDL/7493 improve bag generator failure handling part 2 #8773
…Gnerator_failure_handling
I don't see anything weird in this pull request so I'm sending it to QA.
However, I'm not exactly sure how to test it so @qqmyers you might need to explain a bit more.
As with other issues where resources are not released, aside from regression testing the all-good case, the only way to see the problem, and to see that the PR improves things, is to configure an error. In this case, one way to make the file-retrieval HTTP calls (made when creating the archival bag) fail is to remove the physical files from a dataset you archive. That is how we discovered the issue at TDL and verified that this fix helps: some old test datasets did not have physical files. Any other way of causing the file retrieval API calls to fail would also work (e.g. using a proxy of some sort). As I noted in the test instructions, though, I'm not sure how much QA effort on the failure cases makes sense; deleting some test files might be simple enough.
What this PR does / why we need it: It looks like a merge with develop in the original #8609 PR, done after other PRs were merged, undid the changes that were supposed to be in that PR. This PR reapplies those changes and adds a bug fix for the checksum validation that is done when using the local archiver (the paths used to find the files in the validator were not updated when the RDA work added a top-level dir within the bag structure).
Which issue(s) this PR closes:
Closes
Special notes for your reviewer: Argh - I messed up in breaking the 3A work into PRs. Nominally the main changes here were reviewed before. The only addition is passing the bagName to the BagValidationJob class since the bag now has paths like doi-10-5072-fk2abcdef/data/filepathname and the manifest only includes the data/filepathname part.
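To illustrate the path mismatch described above, here is a minimal sketch of prefixing a manifest entry with the bag name to get its location inside the bag. The class and method names (BagPathSketch, entryPathInBag) are hypothetical and are not taken from the BagValidationJob code; only the example paths come from the description above.

```java
public class BagPathSketch {

    /**
     * Hypothetical helper: the manifest lists "data/filepathname", while the
     * file sits under the bag's top-level directory, so the bag name is
     * prepended to locate it.
     */
    static String entryPathInBag(String bagName, String manifestPath) {
        return bagName + "/" + manifestPath;
    }

    public static void main(String[] args) {
        // Prints doi-10-5072-fk2abcdef/data/filepathname
        System.out.println(entryPathInBag("doi-10-5072-fk2abcdef", "data/filepathname"));
    }
}
```

Without the bag name, a lookup for data/filepathname would miss the file, which is consistent with the validation problem this PR describes.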
Suggestions on how to test this: From the #8603 issue: the fix is primarily about avoiding hung connections in cases where physical files are missing. Not sure how far to go in trying to test that, but minimally this should be regression tested (i.e. in the normal case, archival bags are still produced with the local archiver, etc.). Similarly, in the normal case there shouldn't be any warning/severe messages from the BagValidationJob class when using the local archiver.
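For readers unfamiliar with the hung-connection failure mode, one general way to keep a file-retrieval call from waiting forever is to set explicit timeouts. The sketch below uses the standard java.net.http client (Java 11+). It is an illustration of the general technique only, not necessarily what this PR does; the class name, method names, URL, and timeout values are all assumptions.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

public class TimeoutSketch {

    /** Client that gives up if a connection cannot be established in time. */
    static HttpClient newClient(Duration connectTimeout) {
        return HttpClient.newBuilder()
                .connectTimeout(connectTimeout)
                .build();
    }

    /** GET request that fails with HttpTimeoutException if no response arrives in time. */
    static HttpRequest fileRequest(String url, Duration responseTimeout) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(responseTimeout)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder URL and durations, chosen only for illustration.
        HttpClient client = newClient(Duration.ofSeconds(10));
        HttpRequest request = fileRequest("http://localhost:8080/example/file/42", Duration.ofSeconds(30));
        System.out.println(client.connectTimeout() + " " + request.timeout());
    }
}
```

With timeouts in place, a missing physical file would surface as an error or a timeout rather than as a connection that never returns, which is the kind of failure the test suggestions above try to provoke.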
Does this PR introduce a user interface change? If mockups are available, please link/include them here: no
Is there a release notes update needed for this change?: part of #8611
Additional documentation: