API: As a data owner, I want to "uningest" a tabular data file so that files that should not have been ingested are saved appropriately #3766
Comments
For now, switching this up to cover a new administrative/curation API endpoint instead of a user-facing feature.
@landreev - in backlog grooming this week, you mentioned you'd share a script in this issue.
@landreev - we're taking this on this sprint, can you drop in that script? Thanks!
Missed the message above - sorry. The script takes the (database) id of the datafile; the script:
|
To clarify: whoever ends up working on this, you don't need to literally reimplement this script in Java 1:1! It's provided for reference, as a list of everything that needs to be done. But it is an admin script written specifically for our production system. It assumes that files live on S3; you should not make any such assumption, and can simply use the StorageIO system to replace the tabular file with the stored original. Similarly, you don't need to write the code that generates the file extension from the stored original mime type; there is already a method in FileUtil that does that. And so on.
Issues/ questions so far:
|
@sekmiller Are you sure it was necessary to do that .merge() on the filemetadata in order to get the new extension to stick?
It wasn't working at all for me without it.
We occasionally get requests from users to revert their ingested tabular files to their original state. (Example: RT 247789 - https://help.hmdc.harvard.edu/Ticket/Display.html?id=247789) Some data were never meant to be tabular. This is particularly common with Excel spreadsheets. For example, authors may use a spreadsheet to list their bibliographical references; having something like that automatically converted to tabular format, with users invited to "explore" it with TwoRavens, is not really what they want.
(There is an open ticket, #2199, to allow users to skip tabular ingest on a new file; this issue deals with a file that's already ingested.)
The process is fairly straightforward. These uningest requests are currently handled by running a command line script. Several things need to happen: delete the datatable object and its child objects (datavariables, summarystatistics, etc.); recalculate the version UNF; replace the file with the saved original; remove any derivative files; and restore the original size, mime type and file name. All these steps simply need to be reimplemented inside the app.
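To make the order of the steps above concrete, here is a minimal in-memory sketch. Everything in it is hypothetical and for illustration only: `UningestSketch`, `DataFileRecord`, the `storage` map and the `.orig` key suffix are stand-ins, not the actual entity classes, StorageIO or FileUtil APIs. It also assumes the original extension is already known, and that the saved original copy can be dropped once it has replaced the main file.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the uningest steps; none of these types are the app's real classes.
public class UningestSketch {

    // Assumed suffix under which the saved original is stored next to the main file.
    public static final String ORIGINAL_SUFFIX = ".orig";

    // Simplified stand-in for a datafile and the metadata the steps touch.
    public static class DataFileRecord {
        public String storageKey;           // key of the main file in the storage map
        public String fileName;             // label shown in the file metadata
        public String contentType;          // current mime type
        public long fileSize;               // current size in bytes
        public boolean hasDataTable;        // true while the datatable and its children exist
        public boolean versionUnfStale;     // set when the version UNF must be recalculated
        public String originalContentType;  // mime type saved at ingest time
        public String originalExtension;    // assumed known; the real app derives it from the mime type
    }

    public static void uningest(DataFileRecord f, Map<String, byte[]> storage) {
        if (!f.hasDataTable) {
            throw new IllegalStateException("File is not ingested as tabular data");
        }
        byte[] original = storage.get(f.storageKey + ORIGINAL_SUFFIX);
        if (original == null) {
            throw new IllegalStateException("No saved original found");
        }
        // 1. Delete the datatable object and its child objects (modelled as a flag here).
        f.hasDataTable = false;
        // 2. Flag the version UNF for recalculation.
        f.versionUnfStale = true;
        // 3. Replace the tabular file with the saved original.
        storage.put(f.storageKey, original);
        // 4. Remove derivative files: every auxiliary key of the form "<storageKey>.<something>".
        //    This also drops the ".orig" copy, which is an assumption of this sketch.
        storage.keySet().removeIf(k -> k.startsWith(f.storageKey + "."));
        // 5. Restore the original size, mime type and file name.
        f.fileSize = original.length;
        f.contentType = f.originalContentType;
        f.fileName = replaceExtension(f.fileName, f.originalExtension);
    }

    // Swap whatever extension the name has (e.g. "tab") for the original one.
    public static String replaceExtension(String name, String extension) {
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return base + "." + extension;
    }

    public static void main(String[] args) {
        Map<String, byte[]> storage = new HashMap<>();
        storage.put("abc", new byte[] {1, 2, 3});               // ingested tab-delimited file
        storage.put("abc.orig", new byte[] {9, 9, 9, 9, 9});    // saved original
        storage.put("abc.RData", new byte[] {7});               // a derivative

        DataFileRecord f = new DataFileRecord();
        f.storageKey = "abc";
        f.fileName = "references.tab";
        f.contentType = "text/tab-separated-values";
        f.fileSize = 3;
        f.hasDataTable = true;
        f.originalContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
        f.originalExtension = "xlsx";

        uningest(f, storage);
        System.out.println(f.fileName + " " + f.fileSize + " " + storage.keySet());
    }
}
```

In the real application, steps 1 and 2 would be database operations and steps 3 and 4 would go through the storage layer rather than a map; the sketch only shows what has to change and in what order.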
A little bit of thought will need to go into figuring out how to add this option to the UI. (Is it an extra button shown to users with the edit permission on the dataset? Should we use the existing checkboxes, with an extra option in the pulldown menu under "edit files"?)