Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IQSS/10318 Uningest/Reingest UI #10319

Merged

Conversation

qqmyers
Copy link
Member

@qqmyers qqmyers commented Feb 13, 2024

What this PR does / why we need it: Per the issue, this adds a File page Edit menu Uningest/Reingest menu item for tabular files, visible to superusers, that allows clearing ingest errors/removing ingested tab files, and ingesting/reingesting a file (useful if a new Dataverse version may have fixes or if ingest limits change, etc.)

Which issue(s) this PR closes:

Closes #10318

Special notes for your reviewer: The PR as is, keeps these as requiring superuser. At QDR, we switched to allowing anyone who has permission to publish the dataset to uningest/reingest via API/UI. For QDR, that means curators can use these options whereas in a non-curated/self-publication setup it would allow anyone to do this in their draft datasets. If that's desired, the permission changes can be added to this PR.

Suggestions on how to test this: Go to the file page and try the uningest/reingest options on a tabular file. Verify that for a file where ingest has failed, that uningest clears the error and that for one where ingest succeeded, the .tab version is removed. Verify that reingest does an ingest (and fails or succeeds as the initial ingest did).

Does this PR introduce a user interface change? If mockups are available, please link/include them here: single entry (uningest or reingest) in the File page Edit menu, visible for superusers.
image
image

Is there a release notes update needed for this change?:added

Additional documentation:

@coveralls
Copy link

coveralls commented Feb 13, 2024

Coverage Status

coverage: 20.254% (-0.02%) from 20.27%
when pulling 6480896 on QualitativeDataRepository:IQSS/10318-uningest_reingest_UI
into dcd7c22 on IQSS:develop.

@qqmyers qqmyers added the Size: 3 A percentage of a sprint. 2.1 hours. label Feb 14, 2024
@landreev landreev self-assigned this Feb 14, 2024
@pdurbin pdurbin self-assigned this Feb 14, 2024
@landreev
Copy link
Contributor

Jenkins run failed on edu.harvard.iq.dataverse.api.MoveIT.testMoveLinkedDataset. Hard to see how it could be related to the changes in the PR though.

Copy link
Member

@pdurbin pdurbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I left a few comments. I didn't run the code.

who can see the draft version of the dataset containing the file that will indicate why ingest failed. When the file is published as
part of the dataset, there will be no indication that ingest was attempted and failed.

If the warning message is a concern, the Dataverse software includes both an API call (see the Files section of the :doc:`/api/native-api` guide)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A :ref: link to the specific endpoint would be nice.

This will remove the .tab version of the file that was generated.

If a file is a tabular format but was never ingested, .e.g. due to the ingest file size limit being lower in the past, or if ingest had failed,
e.g. in a prior Dataverse version, an reingest API (see the Files section of the :doc:`/api/native-api` guide) and a file page Edit/Reingest option
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A :ref: link would be nice here as well.

public String ingestFile() throws CommandException{

User u = session.getUser();
if(!u.isAuthenticated() || !(permissionService.permissionsFor(u, file).contains(Permission.PublishDataset))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting to see a check for superuser here... Add a comment to say that the permission check is in the command (if it is)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is re-ingest - and there is no command involved. The corresponding API is superuser-only - so this should probably be .isSuperuser() here as well. (?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(but note the "shouldn't happen" comment in the next line; i.e. I still believe it would make sense to check for superuser here, for consistency, but it's not as crucial)

if (!file.isTabularData()) {
if(file.isIngestProblem()) {
User u = session.getUser();
if(!u.isAuthenticated() || !(permissionService.permissionsFor(u, file).contains(Permission.PublishDataset))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be superuser-only?

Ah, I see UningestFileCommand has a check for superuser. Maybe add a comment here to look at the command for the actual permission check?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This permission check does not appear to be for an actual attempt to uningest a file - it's just for clearing of the ingest failure warning from the UI (the file is not ingested to begin with), and yes, it makes perfect sense to allow authors to do this.
I hope I'm getting it right this time around.

if(!u.isAuthenticated() || !(permissionService.permissionsFor(u, file).contains(Permission.PublishDataset))) {
logger.warning("User: " + u.getIdentifier() + " tried to uningest a file");
//Shouldn't happen (choice not displayed for users who don't have the right permission), but check anyway
JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("file.ingest.cantUningestFileWarning"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entry may not be in the Bundle in the develop branch yet.

file.setIngestReport(null);
//Ingest never succeeded, either there was a failure or this is not a tabular data file
User u = session.getUser();
if (!u.isAuthenticated() || !u.isSuperuser()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I thought what you had here before was on purpose - to give the dataset author a way to remove an ingest failure report - no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(those are only visible to the owners/authors, so not a big deal either way)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice if that's OK. (Probably less work for superusers as that warning bothers the authors.)
At QDR all of this is with the publish permission.

Copy link
Member

@pdurbin pdurbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All 6 Jenkins runs failed so far but I think Jenkins was having issues. I just kicked off a 7th run: https://jenkins.dataverse.org/job/IQSS-Dataverse-Develop-PR/job/PR-10319/7/

I found a typo in the docs. I'll probably just commit it.

There's one spot where I'm suggesting we lower logging to fine but I don't feel strongly about it.

I didn't run the code but I'd say it's safe to start manually testing. Hopefully the API test pass.


}

dataFile.SetIngestScheduled();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, it's unconventional that this and other methods start with capital "S" but whatever, something to address another time:

    public void SetIngestScheduled() {
        ingestStatus = INGEST_STATUS_SCHEDULED;
    }
    
    public void SetIngestInProgress() {
        ingestStatus = INGEST_STATUS_INPROGRESS;
    }
    
    public void SetIngestProblem() {
        ingestStatus = INGEST_STATUS_ERROR;
    }


logger.warning("Ingest Status for file: " + dataFile.getId() + " : " + status);
}
logger.info("File: " + dataFile.getId() + " ingest queued");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
logger.info("File: " + dataFile.getId() + " ingest queued");
logger.fine("File: " + dataFile.getId() + " ingest queued");

Copy link
Member

@pdurbin pdurbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All 6 Jenkins runs failed so far but I think Jenkins was having issues. I just kicked off a 7th run: https://jenkins.dataverse.org/job/IQSS-Dataverse-Develop-PR/job/PR-10319/7/

This passed so I'm approving this.

@landreev I know you're assigned to this but I'll make sure neither one of us is assigned as this moves to "ready for QA", as is our custom.

There is still the one logger line we might want to lower but I'll leave that the to the person doing QA to decide if it adds value as-is.

Overall, this PR is looking much better than the first time I looked at it. Missing bundle properties have been added. Docs are better. I haven't run the code but looks good, approved.

@sekmiller sekmiller self-assigned this Feb 28, 2024
@sekmiller
Copy link
Contributor

@qqmyers Any thoughts about adding the option to the Edit Files button/kebab on the dataset page?
Screen Shot 2024-02-28 at 3 26 48 PM

@sekmiller
Copy link
Contributor

@qqmyers There are also merge conflicts that need to be resolved. thanks!

@qqmyers
Copy link
Member Author

qqmyers commented Feb 28, 2024

@qqmyers Any thoughts about adding the option to the Edit Files button/kebab on the dataset page?

  • Just removed the filePage restriction in the xhtml.

@@ -79,14 +79,14 @@
</ui:fragment>

<!-- Single file uningest/reingest -->
<ui:fragment rendered="#{isFilePg and (dataverseSession.user.superuser and FilePage.fileMetadata.dataFile.isTabularData()) or (FilePage.fileMetadata.dataFile.isIngestProblem() and FilePage.canPublishDataset())}">
<ui:fragment rendered="#{(dataverseSession.user.superuser and FilePage.fileMetadata.dataFile.isTabularData()) or (FilePage.fileMetadata.dataFile.isIngestProblem() and FilePage.canPublishDataset())}">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I'm getting a 500 error when I try to go to the dataset page because the FilePage.fileMetadata is null (jakarta.servlet.ServletException: Cannot invoke "edu.harvard.iq.dataverse.FileMetadata.getDataFile()" because "this.fileMetadata" is null)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's not worth it to allow display the uningest on the dataset page. it will be pretty complicated - especially if the user selects multiple files from the list.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - the list part shouldn't be an issue if this is only in the kebab menu, but yeak, more work than just removing the isFilePage check - I'll restore that.

@qqmyers qqmyers force-pushed the IQSS/10318-uningest_reingest_UI branch from 3f59722 to 5760149 Compare February 28, 2024 22:11
@sekmiller
Copy link
Contributor

@qqmyers in all cases, whether the original ingest was successful or failed and whether it was an uningest or reingest the api does not create a new draft, but the ui does.

@qqmyers
Copy link
Member Author

qqmyers commented Feb 29, 2024

Thanks. I see that the new code was calling FilePage.save() to make sure the changes to the datafile were saved, but that triggers a draft. I changed that to just call DataFileServiceBean.save(DataFile) instead, which is what the API does.

@qqmyers
Copy link
Member Author

qqmyers commented Feb 29, 2024

Good catch btw - I think QDR only uses this on draft datasets, so not a lot of testing on published files.

@sekmiller sekmiller merged commit 2e3b333 into IQSS:develop Feb 29, 2024
12 checks passed
@sekmiller
Copy link
Contributor

All set., thanks for the follow-up.

@scolapasta
Copy link
Contributor

Thanks. I see that the new code was calling FilePage.save() to make sure the changes to the datafile were saved, but that triggers a draft. I changed that to just call DataFileServiceBean.save(DataFile) instead, which is what the API does.

@qqmyers @sekmiller ideally, we don't want pages to call the service beans directly. That said, since this is in jsf and going away eventually AND it is superuser only anyway, I think it's fine for now.

@pdurbin pdurbin added this to the 6.2 milestone Mar 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Size: 3 A percentage of a sprint. 2.1 hours.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request/Idea: Uningest tabular files from the UI
6 participants