Harvesting : message "javax.ejb.EJBTransactionRolledbackException, Exception thrown from bean: java.lang.NullPointerException" #9318

arnaumevi · 2023-01-24T12:36:22Z

Hi,
I'm having trouble with harvesting clients in Dataverse 5.11.1.
The server log shows the message javax.ejb.EJBTransactionRolledbackException, Exception thrown from bean: java.lang.NullPointerException.

Client configurations:

Alias : UAB
Server URL : https://ddd.uab.cat/oai2d
OAI Set : datasets
Metadata Format : oai_dc
Archive type : Generic OAI archive
Results :
SUCCESS; 0 harvested, 0 deleted, 78 failed.
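For anyone who wants to look at the records this client configuration asks for, the settings above map onto a standard OAI-PMH ListRecords request. A minimal sketch, assuming only the standard `verb`, `metadataPrefix` and `set` request parameters; the helper name `list_records_url` is made up for illustration and is not part of Dataverse:

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, oai_set=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if oai_set:
        # The set is optional in OAI-PMH; include it only when configured.
        params["set"] = oai_set
    return f"{base_url}?{urlencode(params)}"


# Values taken from the client configuration above.
print(list_records_url("https://ddd.uab.cat/oai2d", "oai_dc", "datasets"))
```

Opening the printed URL in a browser should show the raw records that the harvester receives, which makes it easier to check their `<dc:identifier>` fields by hand.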

Here is the log for the attempt :
harvest_UAB_2023-01-24T13-21-32.log

Thank you for your time in advance,
Best Regards,
Arnau

landreev · 2023-01-24T16:38:13Z

Thank you.
Just to confirm, you WERE able to harvest from this OAI archive successfully, before upgrading to 5.11.1, correct?

landreev · 2023-01-25T22:51:28Z

A quick followup:
This isn't mentioned in this issue here, but the original report in the Google group suggests that these failures started happening after the upgrade to 5.11.1. Having looked at this OAI server and the failures, I don't think these OAI_DC records would have been imported successfully by any version of Dataverse. So if you were able to harvest from this archive previously, they must have changed their record format on the server side since then.

The short answer is that Dataverse can't import these OAI_DC records because they don't have persistent identifiers in any of the <dc:identifier> fields, for example:

  <dc:identifier>https://ddd.uab.cat/record/166606</dc:identifier>
  <dc:identifier>urn:oai:ddd.uab.cat:166606</dc:identifier>
  <dc:identifier>urn:10.5565/ddd.uab.cat/166606</dc:identifier>
  <dc:identifier>urn:articleid:14712202</dc:identifier>

That is, Dataverse expects at least one of these fields to contain either a DOI or a Handle identifier.
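To check a record before harvesting, something like the following could be used. It is only a rough sketch: the patterns for what counts as a DOI or Handle (`doi:`, `https://doi.org/`, `hdl:`, `https://hdl.handle.net/` prefixes) are assumptions made for illustration and are not taken from the Dataverse import code, which may accept a different set of forms.

```python
import re
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Assumed identifier forms; the real Dataverse parser may differ.
DOI_RE = re.compile(
    r"^(?:doi:|https?://(?:dx\.)?doi\.org/)10\.\d{4,9}/\S+$", re.IGNORECASE
)
HANDLE_RE = re.compile(
    r"^(?:hdl:|https?://hdl\.handle\.net/)[^/\s]+/\S+$", re.IGNORECASE
)


def persistent_ids(record_xml):
    """Return the dc:identifier values that look like a DOI or a Handle."""
    root = ET.fromstring(record_xml)
    values = [(el.text or "").strip() for el in root.iter(f"{{{DC_NS}}}identifier")]
    return [v for v in values if DOI_RE.match(v) or HANDLE_RE.match(v)]


# The identifiers from the record quoted above, wrapped in an oai_dc element.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:identifier>https://ddd.uab.cat/record/166606</dc:identifier>
  <dc:identifier>urn:oai:ddd.uab.cat:166606</dc:identifier>
  <dc:identifier>urn:10.5565/ddd.uab.cat/166606</dc:identifier>
  <dc:identifier>urn:articleid:14712202</dc:identifier>
</oai_dc:dc>"""

print(persistent_ids(SAMPLE))  # prints [] because none of the four values match
```

Note that the third identifier embeds what looks like a DOI, but behind a `urn:` prefix, so under these assumed patterns it is not recognized as one.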

This is our fault, in more than one way:

1. It obviously shouldn't be failing in such a confusing, unclear manner. (There's nothing informative in that harvesting log, and there's a mess of stack traces left in the main server.log.)
2. We may not really need to enforce the requirement that a dataset must have a persistent id on harvested datasets (as opposed to "real", local datasets). All we need is some working URL that we can use to redirect the Dataverse user back to the archival location of the data, and the first of the identifiers in the record above is a valid URL that we could use for that. Without persistent ids it becomes more difficult and less reliable to ensure that we are not importing duplicate copies of the same data record, but then again, duplicates are probably much less of a problem with harvested datasets.
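The fallback idea, using the first working URL among the identifiers as the redirect target, could look roughly like this. This is a sketch of the idea only, not existing Dataverse behavior; `first_url` is a hypothetical helper, and it only checks that a value is syntactically an http(s) URL, not that the URL actually resolves.

```python
from urllib.parse import urlparse


def first_url(identifiers):
    """Return the first identifier that is an http(s) URL, or None."""
    for value in identifiers:
        candidate = value.strip()
        parsed = urlparse(candidate)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            return candidate
    return None


# The dc:identifier values from the record quoted above.
identifiers = [
    "https://ddd.uab.cat/record/166606",
    "urn:oai:ddd.uab.cat:166606",
    "urn:10.5565/ddd.uab.cat/166606",
    "urn:articleid:14712202",
]

print(first_url(identifiers))  # prints https://ddd.uab.cat/record/166606
```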

We have an open issue for improving the client-side harvesting functionality that should address 1. above: we'll make more and better diagnostics visible to the admin. I'm hoping that it will be prioritized and addressed soon.
As for 2., I have brought this up with the dev team, and we have at least started talking about it.

Unfortunately, this is not something we can fix for you right away, nor something you can fix with a configuration change.

tjouneau · 2024-01-09T13:14:32Z

This is related to an earlier issue:

Harvesting : message "javax.ejb.EJBTransactionRolledbackException, Exception thrown from bean: java.lang.NullPointerException" #7546

pdurbin added the Feature: Harvesting label Jan 24, 2023

landreev mentioned this issue Jan 25, 2023

Spike: Inventory and prioritize all existing Harvesting related issues IQSS/dataverse-pm#24

Closed

3 tasks

pdurbin added the Type: Bug and User Role: API User labels Oct 9, 2023

tjouneau mentioned this issue Jan 9, 2024

Harvesting : message "javax.ejb.EJBTransactionRolledbackException, Exception thrown from bean: java.lang.NullPointerException" #7546

Open

cmbz mentioned this issue Mar 12, 2024

GREI 3: HDV Task - Improve OAI-PMH Harvesting IQSS/dataverse-pm#171

Open

56 tasks
