File Upload: Allow users to skip ingest as tabular data. #2199

Closed
kcondon opened this issue May 26, 2015 · 26 comments
Labels
Feature: File Upload & Handling Type: Suggestion an idea User Role: Depositor Creates datasets, uploads data, etc. UX & UI: Design This issue needs input on the design of the UI and from the product owner

Comments

@kcondon
Contributor

kcondon commented May 26, 2015

In v3.6 we had an upload as "other" option that allowed problematic files, those failing ingest or too large to ingest, to upload as the original format without additional processing.

We might want to call it something else rather than "other" but this is a very useful feature for Support and users too.

@scolapasta scolapasta added this to the Candidates for 4.0.2 milestone Jun 1, 2015
@scolapasta scolapasta modified the milestones: Candidates for 4.0.2, In Review Jul 2, 2015
@mheppler mheppler added the UX & UI: Design This issue needs input on the design of the UI and from the product owner label Jan 27, 2016
@scolapasta scolapasta removed this from the Not Assigned to a Release milestone Jan 28, 2016
@pdurbin
Member

pdurbin commented Sep 30, 2016

I just wanted to point out that the File Import Batch job in support of rsync, which is being worked on in #3353, currently bypasses all the extra processing.

@pdurbin
Member

pdurbin commented Jan 10, 2017

@raprasad @landreev @djbrooke @scolapasta is this something we should consider adding to the "native add" functionality being developed in the 2290-file-replace branch for #2290 and #1612? There's a "jsonData" object we could add "extraProcessing=false" to.

@djbrooke
Contributor

No, let's keep this separate and handle it as part of the SBGrid work.

@bmckinney bmckinney mentioned this issue Jan 10, 2017
11 tasks
@pdurbin
Member

pdurbin commented Jan 10, 2017

@djbrooke ok, I linked this issue from pull request #3497.

@kcondon
Contributor Author

kcondon commented Feb 7, 2017

I don't think this should be part of #3497. This is not what I requested.

@kcondon
Contributor Author

kcondon commented Feb 7, 2017

Plus, I'm not sure I like piling on other tickets to pull requests in general. The pull request is about the thing requested, not about other things it might fix.

@djbrooke
Contributor

djbrooke commented Feb 7, 2017

To clarify, I meant handle this separate issue as part of the work for the SBGrid grant, not as part of #3353. This is not part of #3353 and should be maintained as its own issue. My bad, I should have been more specific. :)

I think it's fine in general to connect issues to PRs if the PR happens to solve that issue or eliminates the need for that issue - this should be an exception. We don't want to pile on issues to PRs and increase batch size as a matter of course.

@kcondon
Contributor Author

kcondon commented Feb 7, 2017

OK, but issues attached to PRs were originally requirements that had to be met in order to close that PR. The two most recent examples are more like things we hope get addressed along the way. I just think it makes QA more cumbersome and time consuming.

@djbrooke
Contributor

djbrooke commented Feb 7, 2017

Let's talk it through in the retrospective!

@pdurbin
Member

pdurbin commented Mar 17, 2017

We talked about this on Wednesday and I said I would stop using the "connects to" syntax to associate pull requests with issues in https://waffle.io/IQSS/dataverse that have no chance of going through QA because the pull request only affords a partial fix or workaround and not an exact solution. In pull requests I'll continue to at least reference related issues so that the team can see the connection.

@landreev landreev changed the title File Upload: Provide a "No additional processing" option for problematic files. File Upload: Allow users to skip ingest as tabular data. Apr 11, 2017
@djbrooke djbrooke changed the title File Upload: Allow users to skip ingest as tabular data. File Upload: Allow users to skip ingest as tabular data. Apr 19, 2017
@djbrooke djbrooke added ready and removed ready labels Apr 19, 2017
@pdurbin pdurbin added the User Role: Depositor Creates datasets, uploads data, etc. label Jul 12, 2017
@amberleahey

Hi folks! I know this issue is quite old, but I want to revive it! We have a user requesting the option to opt out of additional tabular data processing, so that end-users can more easily find and download the entire data (in this case the author prefers full download and would like the option to choose how to present their data in DV).

@djbrooke
Contributor

Hi @amberleahey - I think we don't want to provide a feature to opt out of this, as ingest provides the data in multiple formats for easier data sharing and additionally provides a more preservation-friendly format. These are some of the core functionalities of the platform. There is an option to "uningest" via an API for one-offs such as these:

http://guides.dataverse.org/en/latest/api/native-api.html#uningest-a-file

My thoughts, anyway - happy to hear from you or others about this!
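
For anyone who wants to script the one-off "uningest" call mentioned above, here is a minimal sketch. It assumes the endpoint has the shape `POST /api/files/{id}/uningest` and that the API token goes in the `X-Dataverse-key` header, as described in the linked guide section; the function name, server URL, file id, and token are hypothetical placeholders. Check the guide for the permissions the call requires on your installation.

```python
import urllib.request


def build_uningest_request(server_url: str, file_id: int, api_token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that asks Dataverse to uningest one file.

    Assumption: the endpoint is /api/files/{id}/uningest, per the linked API guide.
    """
    url = f"{server_url.rstrip('/')}/api/files/{file_id}/uningest"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"X-Dataverse-key": api_token},  # API token header used by the native API
    )


# Usage (placeholders, not real values):
#   req = build_uningest_request("https://demo.dataverse.org", 42, "my-api-token")
#   urllib.request.urlopen(req)  # sends the request
```

Building the request separately from sending it makes it easy to inspect the URL and headers before touching a production installation.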

@amberleahey

Fair enough, and I agree, especially about ensuring preservation-friendly formats and metadata. I didn't know about the API for one-offs; I will check it out and offer it to the user as an option. Thanks!

@jggautier
Contributor

> so that way end-users can more easily find and download the entire data (in this case the author prefers full download and would like that option to choose how to present their data in DV)

@amberleahey, so the author would prefer that when others want to download the file, they see only the option to download the original file format? Just wanted to make sure I understood

@amberleahey

@jggautier yes, it has to do with the Explore button appearing. In this case they prefer to present only the download button, with the original file format.

@djbrooke
Contributor

Closing this issue as we don't plan to implement this.

@pdurbin
Member

pdurbin commented Mar 22, 2022

@amberleahey

Hi! I noticed this was reopened recently. We were asked about these errors (again!) at our community's meeting this month and started to discuss what solutions we could envision to improve the situation for tabular ingest. We are really open to exploring options. Here is a summary:

  1. Offer an option for end-users to select when to use ingest or not (could be labelled along these lines: "Select to ingest data for open display and preservation").
  2. Offer an option for Admins to select when to use ingest or not in the UI (as opposed to the only option being SuperAdmins through the API).
  3. Improve the error message warnings in the UI. We suggest making this less like an error and more like a tip to improve the openness of their data. Some ingest would remain the same, and the error message would become an improvement tip to remove the flag: "Flagged for improvement", UI text improvements, etc.
  4. Revamp the ingest tools to support Excel and CSV use cases automatically, e.g. the multi-sheet use case, parsing common formatting, etc. (this seems like the harder approach, but it is definitely an area of interest).

@pdurbin
Member

pdurbin commented Oct 3, 2022

@amberleahey nice summary.

One thing I want to make sure is clear is that the "skip ingest via API" feature that @lubitchv added is not limited to superusers. Any user with access to upload files can use it.

Please see https://guides.dataverse.org/en/5.11.1/api/native-api.html#add-a-file-to-a-dataset

So the API is a potential work around until a UI solution is in place.
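
As a rough illustration of that workaround, the sketch below builds the pieces of an "add a file to a dataset" call with ingest turned off. It assumes the linked guide's conventions: the endpoint `/api/datasets/:persistentId/add?persistentId=...` and a `jsonData` form field that accepts `"tabIngest": "false"`. The function name, server URL, and DOI are hypothetical placeholders.

```python
import json
from urllib.parse import quote


def build_add_file_call(server_url: str, persistent_id: str,
                        skip_ingest: bool = True, description: str = ""):
    """Return the URL and form fields for adding a file, optionally skipping tabular ingest.

    Assumption: the jsonData field understands "tabIngest": "false",
    as shown in the linked 5.11.1 API guide.
    """
    url = (
        f"{server_url.rstrip('/')}/api/datasets/:persistentId/add"
        f"?persistentId={quote(persistent_id, safe=':/')}"
    )
    json_data = {"description": description}
    if skip_ingest:
        # Ask Dataverse to store the file as-is instead of ingesting it as tabular data.
        json_data["tabIngest"] = "false"
    return url, {"jsonData": json.dumps(json_data)}


# Usage (placeholders, not real values):
#   url, fields = build_add_file_call("https://demo.dataverse.org", "doi:10.5072/FK2/EXAMPLE")
```

The returned fields would be sent as a multipart form POST together with the `file` part and the `X-Dataverse-key` header, using whatever HTTP client you prefer.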

@amberleahey

amberleahey commented Oct 3, 2022

Yes my bad @pdurbin :)
And yes, the API is an option. We have used it a handful of times this past year that we are aware of :) @eugene-barsky can add anything else, including how often uningest is requested.

@eugene-barsky

Thanks so much, @amberleahey for representing our issues so thoroughly!

@sirineREKIK
Contributor

Hello,
@pdurbin
As a member of the RDG team ( https://entrepot.recherche.data.gouv.fr/ ), I have a quick question:
is a UI solution for skipping file ingest already planned, or do we need to create a new issue for that?

Thanks

@pdurbin
Member

pdurbin commented Jul 11, 2023

@sirineREKIK hi! Sorry, I was away on vacation. I see you opened an issue in the new frontend repo and that @GPortas replied to you there: IQSS/dataverse-frontend#143 (comment)

I would say that no, this work is not planned. Are you interested in making a pull request to the new frontend when it's ready? (Currently the new frontend is read only.) Thanks.

@sirineREKIK
Contributor

@pdurbin Hello! Thanks for your reply.

Yes, we are interested in that, why not!
However, we're not going to work on it right away.
Thanks

@pdurbin
Member

pdurbin commented Jul 12, 2023

@sirineREKIK that's fine. The new frontend won't be ready for this addition right away either! 😅

@cmbz

cmbz commented Aug 20, 2024

To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'.

If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.

@cmbz cmbz closed this as completed Aug 20, 2024