IQSS/10318 Uningest/Reingest UI #10319

Merged
2 changes: 2 additions & 0 deletions doc/release-notes/10318-uningest-and-reingest.md
@@ -0,0 +1,2 @@
New Uningest/Reingest options are available in the File Page Edit menu for superusers, allowing ingest errors to be cleared and
ingest to be retried (e.g. after a Dataverse version update or if ingest size limits are changed).
@@ -32,7 +32,7 @@ format. (more info below)


Tabular Data and Metadata
==========================
=========================

Data vs. Metadata
-----------------
@@ -56,3 +56,21 @@ the Dataverse Software was originally based on the `DDI Codebook
<https://www.ddialliance.org/Specification/DDI-Codebook/2.5/>`_ format.

You can see an example of DDI output under the :ref:`data-variable-metadata-access` section of the :doc:`/api/dataaccess` section of the API Guide.

Uningest and Reingest
=====================

Ingest will only work for files whose content can be interpreted as a table.
Multi-sheet spreadsheets and CSV files with a different number of entries per row are two examples where ingest will fail.
qqmyers marked this conversation as resolved.
This is non-fatal. The Dataverse software will not produce a .tab version of the file and will show a warning explaining why ingest failed
to users who can see the draft version of the dataset containing the file. When the file is published as
part of the dataset, there will be no indication that ingest was attempted and failed.

If the warning message is a concern, the Dataverse software includes both an API call (see the Files section of the :doc:`/api/native-api` guide)
Review comment (Member):

A :ref: link to the specific endpoint would be nice.

and an Edit/Uningest menu option displayed on the file page that allow a file to be Uningested. These are only available to superusers.
qqmyers marked this conversation as resolved.
Uningest will remove the warning. Uningest can also be done for a file that was successfully ingested.
This will remove the .tab version of the file that was generated.

If a file is in a tabular format but was never ingested, e.g. due to the ingest file size limit being lower in the past, or if ingest had failed,
e.g. in a prior Dataverse version, a reingest API (see the Files section of the :doc:`/api/native-api` guide) and a file page Edit/Reingest option
Review comment (Member):

A :ref: link would be nice here as well.

in the user interface allow ingest to be tried again. As with Uningest, this functionality is only available to superusers.
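
Editor's sketch: the documentation above points to the Files section of the Native API guide without naming the endpoints. The Java below shows what a call might look like, assuming the paths are `/api/files/{id}/uningest` and `/api/files/{id}/reingest` and that the token travels in an `X-Dataverse-key` header. Both are assumptions to verify against the guide, and `IngestApiSketch`, `buildRequest`, and the example host are hypothetical names. The request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class IngestApiSketch {

    /** Builds (but does not send) a POST request for the uningest or reingest action on one file. */
    static HttpRequest buildRequest(String serverUrl, long fileId, String action, String apiToken) {
        if (!action.equals("uningest") && !action.equals("reingest")) {
            throw new IllegalArgumentException("action must be uningest or reingest: " + action);
        }
        // Assumed path layout: {server}/api/files/{id}/{action}
        URI uri = URI.create(serverUrl + "/api/files/" + fileId + "/" + action);
        return HttpRequest.newBuilder(uri)
                // Assumed header name for the superuser's API token
                .header("X-Dataverse-key", apiToken)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("https://dataverse.example.edu", 42L, "reingest", "xxxxxxxx");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send it, the request can be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Because both operations are superuser-only, the token has to belong to a superuser account.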
124 changes: 121 additions & 3 deletions src/main/java/edu/harvard/iq/dataverse/FilePage.java
@@ -21,20 +21,25 @@
import edu.harvard.iq.dataverse.engine.command.impl.CreateNewDatasetCommand;
import edu.harvard.iq.dataverse.engine.command.impl.PersistProvFreeFormCommand;
import edu.harvard.iq.dataverse.engine.command.impl.RestrictFileCommand;
import edu.harvard.iq.dataverse.engine.command.impl.UningestFileCommand;
import edu.harvard.iq.dataverse.engine.command.impl.UpdateDatasetVersionCommand;
import edu.harvard.iq.dataverse.export.ExportService;
import io.gdcc.spi.export.ExportException;
import io.gdcc.spi.export.Exporter;
import edu.harvard.iq.dataverse.externaltools.ExternalTool;
import edu.harvard.iq.dataverse.externaltools.ExternalToolHandler;
import edu.harvard.iq.dataverse.externaltools.ExternalToolServiceBean;
import edu.harvard.iq.dataverse.ingest.IngestRequest;
import edu.harvard.iq.dataverse.ingest.IngestServiceBean;
import edu.harvard.iq.dataverse.makedatacount.MakeDataCountLoggingServiceBean;
import edu.harvard.iq.dataverse.makedatacount.MakeDataCountLoggingServiceBean.MakeDataCountEntry;
import edu.harvard.iq.dataverse.privateurl.PrivateUrlServiceBean;
import edu.harvard.iq.dataverse.settings.SettingsServiceBean;
import edu.harvard.iq.dataverse.util.BundleUtil;
import edu.harvard.iq.dataverse.util.FileUtil;
import edu.harvard.iq.dataverse.util.JsfHelper;
import edu.harvard.iq.dataverse.util.StringUtil;

import static edu.harvard.iq.dataverse.util.JsfHelper.JH;
import edu.harvard.iq.dataverse.util.SystemConfig;

@@ -45,6 +50,7 @@
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.logging.Level;
import java.util.logging.Logger;
import jakarta.ejb.EJB;
import jakarta.ejb.EJBException;
@@ -112,10 +118,10 @@ public class FilePage implements java.io.Serializable {
GuestbookResponseServiceBean guestbookResponseService;
@EJB
AuthenticationServiceBean authService;

@EJB
DatasetServiceBean datasetService;

@EJB
IngestServiceBean ingestService;
@EJB
SystemConfig systemConfig;

@@ -209,7 +215,7 @@ public String init() {
// If this DatasetVersion is unpublished and the user doesn't have permissions:
// > Go to the Login page
//
// Check permisisons
// Check permissions
Boolean authorized = (fileMetadata.getDatasetVersion().isReleased())
|| (!fileMetadata.getDatasetVersion().isReleased() && this.canViewUnpublishedDataset());

@@ -475,6 +481,112 @@ public String restrictFile(boolean restricted) throws CommandException{
return returnToDraftVersion();
}

public String ingestFile() throws CommandException{

User u = session.getUser();
if(!u.isAuthenticated() || !(permissionService.permissionsFor(u, file).contains(Permission.PublishDataset))) {
Review comment (Member):

I was expecting to see a check for superuser here... Add a comment to say that the permission check is in the command (if it is)?

Review comment (Contributor):

Actually, this is re-ingest - and there is no command involved. The corresponding API is superuser-only - so this should probably be .isSuperuser() here as well. (?)

Review comment (Contributor):

(but note the "shouldn't happen" comment in the next line; i.e. I still believe it would make sense to check for superuser here, for consistency, but it's not as crucial)

//Shouldn't happen (choice not displayed for users who don't have the right permission), but check anyway
logger.warning("User: " + u.getIdentifier() + " tried to ingest a file");
JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("file.ingest.cantIngestFileWarning"));
return null;
}

DataFile dataFile = fileMetadata.getDataFile();
editDataset = dataFile.getOwner();

if (dataFile.isTabularData()) {
JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("file.ingest.alreadyIngestedWarning"));
return null;
}

boolean ingestLock = dataset.isLockedFor(DatasetLock.Reason.Ingest);

if (ingestLock) {
JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("file.ingest.ingestInProgressWarning"));
return null;
}

if (!FileUtil.canIngestAsTabular(dataFile)) {
JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("file.ingest.cantIngestFileWarning"));
return null;

}

dataFile.SetIngestScheduled();
Review comment (Member):

Huh, it's unconventional that this and other methods start with capital "S" but whatever, something to address another time:

    public void SetIngestScheduled() {
        ingestStatus = INGEST_STATUS_SCHEDULED;
    }
    
    public void SetIngestInProgress() {
        ingestStatus = INGEST_STATUS_INPROGRESS;
    }
    
    public void SetIngestProblem() {
        ingestStatus = INGEST_STATUS_ERROR;
    }


if (dataFile.getIngestRequest() == null) {
dataFile.setIngestRequest(new IngestRequest(dataFile));
}

dataFile.getIngestRequest().setForceTypeCheck(true);

// update the datafile, to save the newIngest request in the database:
save();

// queue the data ingest job for asynchronous execution:
String status = ingestService.startIngestJobs(editDataset.getId(), new ArrayList<>(Arrays.asList(dataFile)), (AuthenticatedUser) session.getUser());

if (!StringUtil.isEmpty(status)) {
// This most likely indicates some sort of a problem (for example,
// the ingest job was not put on the JMS queue because of the size
// of the file). But we are still returning the OK status - because
// from the point of view of the API, it's a success - we have
// successfully gone through the process of trying to schedule the
// ingest job...

logger.warning("Ingest Status for file: " + dataFile.getId() + " : " + status);
}
logger.info("File: " + dataFile.getId() + " ingest queued");
Review comment (Member):

Suggested change
logger.info("File: " + dataFile.getId() + " ingest queued");
logger.fine("File: " + dataFile.getId() + " ingest queued");


init();
JsfHelper.addInfoMessage(BundleUtil.getStringFromBundle("file.ingest.ingestQueued"));
return returnToDraftVersion();
}
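
Editor's sketch: the order of the guard checks in `ingestFile()` can be read as a small decision function. The version below is a standalone model and not the merged code. It takes the reviewers' suggestion of a superuser check as its first guard, whereas the diff above checks `Permission.PublishDataset`. The names `ReingestGuardSketch`, `Outcome`, and `evaluate` are hypothetical.

```java
final class ReingestGuardSketch {

    enum Outcome { NOT_PERMITTED, ALREADY_INGESTED, INGEST_IN_PROGRESS, NOT_INGESTABLE, QUEUED }

    /** Mirrors the order of the early returns in ingestFile(); the first failing guard wins. */
    static Outcome evaluate(boolean superuser, boolean tabular, boolean ingestLocked, boolean canIngestAsTabular) {
        if (!superuser) {
            return Outcome.NOT_PERMITTED;        // assumption: superuser-only, as the reviewers suggest
        }
        if (tabular) {
            return Outcome.ALREADY_INGESTED;     // a .tab version already exists
        }
        if (ingestLocked) {
            return Outcome.INGEST_IN_PROGRESS;   // dataset holds an ingest lock
        }
        if (!canIngestAsTabular) {
            return Outcome.NOT_INGESTABLE;       // file type cannot be ingested as tabular data
        }
        return Outcome.QUEUED;                   // all guards passed: schedule the ingest job
    }

    public static void main(String[] args) {
        System.out.println(evaluate(true, false, false, true));
        System.out.println(evaluate(true, true, false, true));
    }
}
```

Keeping the guards in one pure function of this kind would make it easy to unit-test that the UI path and the superuser-only API agree on who may trigger a reingest.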

public String uningestFile() throws CommandException {

if (!file.isTabularData()) {
if(file.isIngestProblem()) {
User u = session.getUser();
if(!u.isAuthenticated() || !(permissionService.permissionsFor(u, file).contains(Permission.PublishDataset))) {
Review comment (Member):

Should this be superuser-only?

Ah, I see UningestFileCommand has a check for superuser. Maybe add a comment here to look at the command for the actual permission check?

Review comment (Contributor):

This permission check does not appear to be for an actual attempt to uningest a file - it's just for clearing of the ingest failure warning from the UI (the file is not ingested to begin with), and yes, it makes perfect sense to allow authors to do this.
I hope I'm getting it right this time around.

logger.warning("User: " + u.getIdentifier() + " tried to uningest a file");
//Shouldn't happen (choice not displayed for users who don't have the right permission), but check anyway
JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("file.ingest.cantUningestFileWarning"));
Review comment (Contributor):

This entry may not be in the Bundle in the develop branch yet.

return null;
}
file.setIngestDone();
file.setIngestReport(null);
} else {
JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("file.ingest.cantUningestFileWarning"));
return null;
}
} else {
commandEngine.submit(new UningestFileCommand(dvRequestService.getDataverseRequest(), file));
Long dataFileId = file.getId();
file = datafileService.find(dataFileId);
}
editDataset = file.getOwner();
if (editDataset.isReleased()) {
try {
ExportService instance = ExportService.getInstance();
instance.exportAllFormats(editDataset);

} catch (ExportException ex) {
// Something went wrong!
// Just like with indexing, a failure to export is not a fatal
// condition. We'll just log the error as a warning and keep
// going:
logger.log(Level.WARNING, "Uningest: Exception while exporting:{0}", ex.getMessage());
}
}
save();
//Refresh filemetadata with file title, etc.
init();
JH.addMessage(FacesMessage.SEVERITY_INFO, BundleUtil.getStringFromBundle("file.uningest.complete"));
return returnToDraftVersion();
}
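
Editor's sketch: as the review thread notes, `uningestFile()` covers two different situations. A real uningest of a tabular file goes through `UningestFileCommand`, which checks for a superuser. Clearing a failed-ingest warning is guarded by the `PublishDataset` permission check in the page bean. The standalone model below captures that branching. The names `UningestDecisionSketch`, `Action`, and `decide` are hypothetical, and `canPublishDataset` folds the authentication check and the permission check from the diff into one flag.

```java
final class UningestDecisionSketch {

    enum Action { RUN_UNINGEST_COMMAND, CLEAR_INGEST_WARNING, PERMISSION_DENIED, NOTHING_TO_UNINGEST }

    /** Models which branch of uningestFile() applies to a file in a given state. */
    static Action decide(boolean tabular, boolean ingestProblem, boolean superuser, boolean canPublishDataset) {
        if (tabular) {
            // Real uningest: the command itself rejects callers who are not superusers.
            return superuser ? Action.RUN_UNINGEST_COMMAND : Action.PERMISSION_DENIED;
        }
        if (ingestProblem) {
            // Nothing was ingested; only the failure status and report are cleared.
            return canPublishDataset ? Action.CLEAR_INGEST_WARNING : Action.PERMISSION_DENIED;
        }
        // Neither ingested nor failed: there is nothing to undo.
        return Action.NOTHING_TO_UNINGEST;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, false, true, true));
        System.out.println(decide(false, true, false, true));
    }
}
```

In the diff, both successful branches then continue to the same tail: re-export if the dataset is released, save, and refresh the page.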


private List<FileMetadata> filesToBeDeleted = new ArrayList<>();

public String deleteFile() {
@@ -948,6 +1060,12 @@ public boolean isPubliclyDownloadable() {
return FileUtil.isPubliclyDownloadable(fileMetadata);
}

public boolean isIngestable() {
DataFile f = fileMetadata.getDataFile();
//Datafile is an ingestable type that has neither been ingested nor had an ingest failure
return (FileUtil.canIngestAsTabular(f)&&!(f.isTabularData() || f.isIngestProblem()));
}
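
Editor's sketch: `isIngestable()` and the `rendered` conditions in `file-edit-button-fragment.xhtml` together decide which menu entry a superuser sees. The standalone model below restates both predicates and checks that the two entries are never offered at the same time. It leaves out the `isFilePg` flag, and the names `IngestMenuSketch`, `showUningest`, and `showReingest` are hypothetical.

```java
final class IngestMenuSketch {

    /** Same boolean expression as FilePage.isIngestable(), with the file state passed in as flags. */
    static boolean isIngestable(boolean canIngestAsTabular, boolean tabular, boolean ingestProblem) {
        return canIngestAsTabular && !(tabular || ingestProblem);
    }

    /** Uningest entry: superuser, and the file is either ingested or has a recorded ingest problem. */
    static boolean showUningest(boolean superuser, boolean tabular, boolean ingestProblem) {
        return superuser && (tabular || ingestProblem);
    }

    /** Reingest entry: superuser, and the file is ingestable in the sense above. */
    static boolean showReingest(boolean superuser, boolean canIngestAsTabular, boolean tabular, boolean ingestProblem) {
        return superuser && isIngestable(canIngestAsTabular, tabular, ingestProblem);
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        for (boolean superuser : values)
            for (boolean canIngest : values)
                for (boolean tabular : values)
                    for (boolean problem : values)
                        if (showUningest(superuser, tabular, problem)
                                && showReingest(superuser, canIngest, tabular, problem)) {
                            throw new IllegalStateException("both menu entries shown");
                        }
        System.out.println("Uningest and Reingest are never offered together");
    }
}
```

One consequence of these predicates is that a file with a recorded ingest problem gets only the Uningest entry. As far as the diff shows, clearing the problem through Uningest is what makes the Reingest entry appear for an ingestable file type.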

private Boolean lockedFromEditsVar;
private Boolean lockedFromDownloadVar;

@@ -47,10 +47,10 @@ public UningestFileCommand(DataverseRequest aRequest, DataFile uningest) {
@Override
protected void executeImpl(CommandContext ctxt) throws CommandException {

// first check if user is a superuser
if ( (!(getUser() instanceof AuthenticatedUser) || !getUser().isSuperuser() ) ) {
throw new PermissionException("Uningest File can only be called by Superusers.",
this, Collections.singleton(Permission.EditDataset), uningest);
// first check if user is a superuser
if ((!(getUser() instanceof AuthenticatedUser) || !getUser().isSuperuser())) {
throw new PermissionException("Uningest File can only be called by Superusers.", this,
Collections.singleton(Permission.EditDataset), uningest);
}

// is this actually a tabular data file?
16 changes: 16 additions & 0 deletions src/main/webapp/file-edit-button-fragment.xhtml
@@ -77,6 +77,22 @@
</h:outputLink>
</li>
</ui:fragment>

<!-- Single file uningest/reingest -->
<ui:fragment rendered="#{isFilePg and dataverseSession.user.superuser and (FilePage.fileMetadata.dataFile.isTabularData() or FilePage.fileMetadata.dataFile.isIngestProblem())}">
<li>
<p:commandLink update="@form,:messagePanel,:fileForm:fileTitleFragment, :fileForm:topDatasetBlockFragment" action="#{FilePage.uningestFile()}">
<h:outputText value="#{bundle['file.uningest']}"/>
</p:commandLink>
</li>
</ui:fragment>
<ui:fragment rendered="#{isFilePg and dataverseSession.user.superuser and FilePage.isIngestable()}">
<li>
<p:commandLink update="@form,:messagePanel" actionListener="#{FilePage.ingestFile()}">
<h:outputText value="#{bundle['file.ingest']}"/>
</p:commandLink>
</li>
</ui:fragment>
pdurbin marked this conversation as resolved.

<!-- TO-DO #3488 - ADD FILE TAGS FOR FILE PG AND SINGLE FILE-->
<ui:fragment rendered="#{fileMetadata==null}">