GDCC/Globus and Big Data Support #8891
Update from IQSS develop
Conflicts (keep isValidIdentifier): src/main/java/edu/harvard/iq/dataverse/dataaccess/S3AccessIO.java
doc suggestions IQSS#8891
I just merged this PR and put a bunch of screenshots over in the issue that has to do with UI for Globus: #7626 (comment). I wasn't able to test writing from Dataverse to any endpoint other than my laptop. I've applied for access through Harvard and would be happy to test this if the server is still up. I put "Update: whatever" in the lists above. In short, I feel like Globus support is ready to ship as experimental. Thanks @JayanthyChengan @lubitchv and @qqmyers!
What this PR does / why we need it: This PR builds on #7325 and earlier work by Scholars Portal/Borealis to add support in Dataverse for Globus-based data transfer to/from a Dataverse-managed S3 store. It is intended as a minimum viable capability that is expected to continue evolving over time.
This PR allows Globus to be used with specific S3 store(s). Use of Globus requires the store to be 'public', which turns off support for restriction and embargo in that store. ('Public' indicates that the store is not capable of enforcing Dataverse's per-file access controls, which is the case for Globus, where access control is per folder and Dataverse stores all files for a dataset in one folder.)
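To make the constraint above concrete, here is a minimal sketch of the rule "Globus-enabled implies public, and public implies no restriction or embargo". The class name `StoreSketch`, its fields, and its method are hypothetical and exist only for illustration; they are not the classes or configuration options this PR actually adds.

```java
// Hypothetical sketch: these names are illustrative and are NOT Dataverse's actual classes.
final class StoreSketch {
    final String id;
    final boolean isPublic;      // assumed flag: store cannot enforce per-file access controls
    final boolean globusEnabled; // assumed flag: store is usable for Globus transfers

    StoreSketch(String id, boolean isPublic, boolean globusEnabled) {
        // Globus controls access per folder, and all files of a dataset share one folder,
        // so a Globus-enabled store cannot honor per-file restrictions and must be public.
        if (globusEnabled && !isPublic) {
            throw new IllegalArgumentException("A Globus-enabled store must be public: " + id);
        }
        this.id = id;
        this.isPublic = isPublic;
        this.globusEnabled = globusEnabled;
    }

    // Restriction and embargo are only meaningful when per-file access can be enforced.
    boolean supportsRestrictionAndEmbargo() {
        return !isPublic;
    }

    public static void main(String[] args) {
        StoreSketch globusStore = new StoreSketch("globus-s3", true, true);
        StoreSketch regularStore = new StoreSketch("s3", false, false);
        System.out.println(globusStore.id + " supports restriction/embargo: "
                + globusStore.supportsRestrictionAndEmbargo());
        System.out.println(regularStore.id + " supports restriction/embargo: "
                + regularStore.supportsRestrictionAndEmbargo());
    }
}
```

Rejecting the invalid combination in the constructor is one possible design choice for the sketch; the actual PR may instead enforce this through configuration or documentation.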
Which issue(s) this PR closes:
Special notes for your reviewer: This PR includes PR #7325 as a practical matter (they were deployed/tested together in one branch) and so that the Globus effort can inherit common code cleanup. As with other PRs, differencing against the branch for that PR would show what is unique here.
W.r.t. closing issues: I'm guessing the ones above can actually close, but there are other open Globus issues. I think most are obsoleted by this update, but for those, and even the ones here, some human review before auto-closing might be in order.
Suggestions on how to test this: There is significant setup involved in supporting Globus transfer: this PR for Dataverse, the updated Dataverse Globus app from Scholars Portal/Borealis, and configuration of a Globus S3 connector and various Globus accounts are all required. As part of the development effort, there is an EC2 installation of Dataverse and Globus available. This, combined with a local install of the Angular 9 Dataverse Globus app, allows testing.
Testing could/should cover the basic up/download via Globus as well as regression testing w.r.t. other stores not being affected and with normal up/download to the Globus store also being allowed.
Does this PR introduce a user interface change? If mockups are available, please link/include them here: This adds a Globus upload button on the upload pane (when using a Globus-enabled store) and puts a Globus-transfer item in the file download menu (again when using a Globus-enabled store). The upload/download widgets themselves are the separate Dataverse Globus app.
Is there a release notes update needed for this change?: yes
Additional documentation: There's a demo, talk, Data Commons documentation on setup, etc. that I'll start linking here.