-
Notifications
You must be signed in to change notification settings - Fork 493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Client-side multifile zip download #9245
base: develop
Are you sure you want to change the base?
Client-side multifile zip download #9245
Conversation
this is wonderful! does it default to original file format, or does it send surrogate copies? |
Right now it is adding |
@qqmyers @jdmar3 corrects me: |
@qqmyers Would it be possible to present an option to add EDIT: @donsizemore beat me to the punch! |
The Access Dataset menu at the top of the page allows getting either original or archival format. Currently I have not changed those buttons to use the client-size zipping but that's a useful addition once we know that it works well for most browsers and sizable datasets. Both forms are also available at the individual file level, so it is mostly a limitation of the bulk 'Download' button for selected files, regardless of whether the existing server-side zipping or this client-side method is used. I don't want to change that as part of this PR for client-side zipping, but I think both client-size and the existing server-side algorithm could handle both cases if the user interface work is done to allow it. FWIW: I think the API call to download all files allows you to specify either form as well. W.r.t. archiving, I would argue that the Bag exports are better than the zip available from the front end (the Bag has fixity info, all the metadata for the dataset, etc.) and since it is privileged, it doesn't run the risk of files being excluded if you don't have permissions (if I recall the zip options in the UI include a manifest that lists files that weren't included due to permissions or size limits). There has also been discussion of the archival Bag exports w.r.t. whether including the ingested formats would be better than the original, but there are issues that have slowed that work, e.g. the fact that Dataverse isn't storing the fixity info for the ingested versions. It would definitely be useful to have some discussion/review of the Bags to decide requirements and priorities. W.r.t. to testing - thanks! The draft PR should work as is so if we can get a test server(s) set up somewhere, it could be tested with different browsers, larger data, etc. I think DataverseNO was going to try to fire one up, Don could probably do that at Odum as well. Assuming that looks promising, I can look into updating the download all buttons - that shouldn't involve any new risks - if it works for the one format, it will work for the other. |
This is not ready for Review/QA (hence draft). Testing has show the local browser uses significant amounts of memory with large files and can fail with an out-of-memory error. I'm still investigating how to handle this. Perhaps it should not be on the board yet? |
We're excited about it. Let's let Jim size it. |
Sizing:
|
What this PR does / why we need it: A possible addition to/replacement for zipping on the server. In this PR, the multi-file download button invokes JavaScript that will download files individually (using direct download if enabled) and create a zip locally, using file names/directoryPaths from the specific datasetVersion being downloaded.
Current issues/limitations:
Which issue(s) this PR closes:
Closes #5864
Special notes for your reviewer:
To enable this, I needed to know the datasetVersion in the download code which required trying to fix #5864 - the multifile button doesn't set the datasetVersion in the guestbook by default. If the rest gets delayed, it may be worth pulling out this one line fix (it's a separate commit).
Suggestions on how to test this:
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?:
Additional documentation: