Simple support for remote web stores #7324

Closed
qqmyers opened this issue Oct 13, 2020 · 0 comments · Fixed by #7325
qqmyers commented Oct 13, 2020

Based on use cases from Odum's TRSA (see #5213) and recent work to support remote uploads, I'm suggesting a mechanism extending the current StorageIO mechanism that would allow Dataverse management of a file at a remote URL 'as though' it were in S3. Recognizing that there are many potential remote stores that might fit this model, I've created a design document and a proof-of-concept implementation (#7325) to share with the community.

The basic concept is to treat the file URL as read-only and to retrieve its bytes or provide a download redirect in the same way that the S3 store manages file access. However, since the remote web store is assumed ~read-only in this design, any/all derived files (thumbnails, ingested versions, provenance files, etc.) are managed by an underlying S3 or File store.
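As a rough illustration of the split described above (this is not the code in #7325, and it does not use the real StorageIO API), the following Java sketch routes main-file access to the remote URL and sends derived files to an underlying store. The class name `RemoteOverlaySketch`, its methods, and the `baseStoreLocation` string format are all hypothetical.

```java
import java.net.URI;

// Hypothetical sketch: a read-only remote file with derived files kept elsewhere.
final class RemoteOverlaySketch {
    private final URI remoteFile;           // the remote URL, treated as read-only
    private final String baseStoreLocation; // assumed location of this file's slot in an S3/File store

    RemoteOverlaySketch(URI remoteFile, String baseStoreLocation) {
        this.remoteFile = remoteFile;
        this.baseStoreLocation = baseStoreLocation;
    }

    // Reads and download redirects for the main file go to the remote URL.
    URI mainFileLocation() {
        return remoteFile;
    }

    // Derived files (thumbnails, ingested versions, ...) live in the base store,
    // addressed here by appending a tag to the base location (an assumed convention).
    String auxLocation(String auxTag) {
        return baseStoreLocation + "." + auxTag;
    }

    // The remote store is assumed read-only, so writes to the main file are rejected.
    void writeMainFile(byte[] bytes) {
        throw new UnsupportedOperationException("remote file is read-only");
    }
}
```

With this shape, a caller asking for a thumbnail never touches the remote server; only requests for the original bytes do.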

This mechanism works for public URLs. I've also suggested and implemented a URL presigning mechanism, roughly analogous to what S3 stores use to provide secure download URLs. A remote store could use it to verify that Dataverse was the source of a request (which would only be made if the user is allowed access per Dataverse's configured controls) and to allow access only when Dataverse has 'pre-approved' that request. (The code includes Java code to sign and validate these requests; validation implemented as a module/filter for common web servers could simplify what needs to be done at the remote store.)
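To make the presigning idea concrete, here is a minimal Java sketch of one common way to do it: an expiry time plus an HMAC-SHA256 over the URL, computed with a secret shared between Dataverse and the remote store. It is an assumption-laden illustration, not the implementation in #7325; the class name `UrlSignerSketch` and the `until` and `token` query parameter names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of signing a URL and validating it at the remote store.
final class UrlSignerSketch {
    private UrlSignerSketch() {}

    // Appends an expiry time and an HMAC of the URL-with-expiry.
    static String sign(String baseUrl, long expiresAtEpochSeconds, String secret) {
        String sep = baseUrl.contains("?") ? "&" : "?";
        String unsigned = baseUrl + sep + "until=" + expiresAtEpochSeconds;
        return unsigned + "&token=" + hmacHex(unsigned, secret);
    }

    // Accepts the URL only if it has not expired and the token matches.
    static boolean isValid(String signedUrl, long nowEpochSeconds, String secret) {
        int tokenAt = signedUrl.lastIndexOf("&token=");
        if (tokenAt < 0) return false;
        String unsigned = signedUrl.substring(0, tokenAt);
        String token = signedUrl.substring(tokenAt + "&token=".length());

        int untilAt = Math.max(unsigned.lastIndexOf("?until="), unsigned.lastIndexOf("&until="));
        if (untilAt < 0) return false;
        long until;
        try {
            until = Long.parseLong(unsigned.substring(untilAt + "?until=".length()));
        } catch (NumberFormatException e) {
            return false;
        }
        if (nowEpochSeconds > until) return false;

        // Constant-time comparison of the expected and presented tokens.
        byte[] expected = hmacHex(unsigned, secret).getBytes(StandardCharsets.US_ASCII);
        byte[] actual = token.getBytes(StandardCharsets.US_ASCII);
        return MessageDigest.isEqual(expected, actual);
    }

    private static String hmacHex(String message, String secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] raw = mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(raw.length * 2);
            for (byte b : raw) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable or key invalid", e);
        }
    }
}
```

In this sketch Dataverse would call `sign` only after its own access checks pass, and the remote store (or a web server filter in front of it) would call `isValid` with the same secret before serving the file. Because the expiry is covered by the HMAC, changing `until` invalidates the token.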

I've created this issue, design doc, and draft PR to encourage discussion and get feedback. Does this mechanism, or extensions of it, support use cases from other community members? Are there alternative designs that could be made general and fit well into Dataverse's architecture? Are there concerns about the mechanism proposed?

pdurbin added a commit to GlobalDataverseCommunityConsortium/dataverse that referenced this issue Aug 5, 2022
qqmyers pushed a commit to GlobalDataverseCommunityConsortium/dataverse that referenced this issue Aug 5, 2022
pdurbin added a commit to GlobalDataverseCommunityConsortium/dataverse that referenced this issue Aug 8, 2022
pdurbin added a commit to GlobalDataverseCommunityConsortium/dataverse that referenced this issue Aug 8, 2022
pdurbin added a commit to GlobalDataverseCommunityConsortium/dataverse that referenced this issue Aug 9, 2022
pdurbin added a commit to GlobalDataverseCommunityConsortium/dataverse that referenced this issue Aug 10, 2022