Publication fails/hangs as of ~late Dec. #5427

Closed
qqmyers opened this issue Jan 3, 2019 · 7 comments

@qqmyers
Member

qqmyers commented Jan 3, 2019

There are several reports of publications failing, potentially intermittently, for develop and older versions. From looking into this, I think the issue (or at least one of them) is a bug in Dataverse in which the check when generating GlobalIds for files has been sending a null value instead of the proposed identifier. I suspect that a change at DataCite, which would have changed its response to a call to /metadata/null from an error to a 200 status code (looks like they send page 1 of the list of 16M DOIs now), has uncovered this. If DataCite responds quickly, Dataverse now sits in a while loop generating new IDs over and over, as the 200 response is interpreted as all of them existing. If DataCite is bogged down (perhaps due to lots of calls to check null identifiers from somewhere...), it may instead respond with a 502 status, which I've also seen, and which then causes publication to fail. This case shows exceptions in the log, whereas the first doesn't ever report a problem and just hangs for minutes until some timeout occurs.

I think a useful fix would be to correct the DV bug by sending the newly generated ID, i.e. by refactoring the GlobalIdServiceBeans to test alreadyExists(GlobalId) so that the ID doesn't have to belong to a DVObject before it can be tested. Hoping to do this now and be able to back port to v4.9.4...
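
To make the failure mode concrete, here is a minimal Python sketch of the loop described above. Dataverse itself is written in Java, and none of these names come from its code base: `FakeProvider`, `status_for`, `generate_unique_id`, and the example DOIs are hypothetical stand-ins. The `max_attempts` cap is added so the sketch terminates; the behavior reported above is a loop that keeps going until a timeout.

```python
import itertools


class FakeProvider:
    """Hypothetical stand-in for the provider lookup described in this issue."""

    def __init__(self, registered):
        self.registered = set(registered)

    def status_for(self, identifier):
        # As reported above: a lookup of "null" now answers 200 instead of an error.
        if identifier is None:
            return 200
        return 200 if identifier in self.registered else 404


def generate_unique_id(provider, candidates, buggy, max_attempts=5):
    """Return (identifier, attempts); identifier is None if every try looked taken."""
    attempts = 0
    for candidate in itertools.islice(candidates, max_attempts):
        attempts += 1
        # Buggy path: the existence check asks about null rather than the
        # proposed identifier, so every candidate appears to exist already.
        checked = None if buggy else candidate
        if provider.status_for(checked) != 200:
            return candidate, attempts
    return None, attempts


provider = FakeProvider(registered={"10.5072/FK2/AAA"})
ids = ["10.5072/FK2/AAA", "10.5072/FK2/BBB", "10.5072/FK2/CCC"]
print(generate_unique_id(provider, iter(ids), buggy=True))
print(generate_unique_id(provider, iter(ids), buggy=False))
```

With `buggy=True` every candidate is rejected and the function gives up once the candidates run out; with `buggy=False` the first unregistered candidate is accepted, which is the effect of testing the proposed identifier directly.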

@qqmyers
Member Author

qqmyers commented Jan 3, 2019

Further discussion with @landreev et al. in chat seems to confirm the hypothesis above (it's consistent with everyone's observations). The PR now sends the identifier being proposed to DataCite (or other provider). For DataCite, @landreev noted that, while /metadata/null (what was being sent) gets a 200 response, a /metadata/<non-existent ID with a slash and the basic form of a DOI with that separator> still gets a 404 response, so this PR, while it doesn't check the contents of the response, should now detect when a generated id already exists in DataCite...

FWIW: we also noted that the response to /metadata/null returns json, despite an Accept:application/xml header being sent with the request. (The 200 response for an existing DOI is in XML as expected).
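
The PR relies on the status code alone. Purely as an illustration, and not what the PR does, a stricter check could also look at the response content type, since the observations above suggest a genuine record comes back as XML while the /metadata/null listing comes back as JSON. The function name, the enum, and the content-type strings below are assumptions made for this sketch.

```python
from enum import Enum


class Lookup(Enum):
    EXISTS = "exists"
    AVAILABLE = "available"
    UNKNOWN = "unknown"


def interpret_lookup(status, content_type=""):
    """Classify a metadata lookup from its status code and content type."""
    if status == 404:
        # No record under this identifier, so it is free to use.
        return Lookup.AVAILABLE
    if status == 200 and "xml" in content_type.lower():
        # A 200 with an XML body matches what was seen for an existing DOI.
        return Lookup.EXISTS
    # Anything else (a 200 with JSON, a 502, ...) is not a trustworthy answer.
    return Lookup.UNKNOWN
```

Treating unexpected responses as `UNKNOWN` lets the caller fail fast or retry, rather than generating new identifiers indefinitely.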

@landreev
Contributor

landreev commented Jan 3, 2019

@qqmyers Thanks again for the PR.
I'm moving it to QA; testing it and moving it further down the merging process will likely have to wait till next week (some people are still out here). But we should definitely be able to add it to the next release.
As discussed, I'm going to open another GitHub issue for a more in-depth cleanup of the code there.

@landreev
Contributor

landreev commented Jan 3, 2019

@kcondon
To summarize the parts important for testing/QA:
Please note that the current issue is fairly easy to reproduce: it's not unique to the prod. in any way. You should be having issues publishing datasets with files, on any dev. system configured to use DataCite, with the file level registration enabled.

The reason this has started happening just recently is a new behavior on the DataCite API side; that must have been introduced around the time they were addressing their service instability (mid Dec.). That in turn triggered a bug we had in our implementation all along.

@landreev
Contributor

landreev commented Jan 3, 2019

As an immediate response to the prod. issues, we've disabled file level registration in our Dataverse, and we are going to advise all the other installations using DataCite to do the same.
I wrote the following blurb for sending out:

To all the Dataverse installations using DataCite:

If you have registration of DOIs for datafiles enabled (the default behavior), you are likely to experience issues publishing datasets.

So we strongly recommend disabling file-level registration, until the issue is fixed (in the next release).
The issue is due to a recent change in the behavior of the DataCite API, which, in turn, triggered a bug in the current implementation of our DataCite registration client.

Dataverses that are using EZID or Handlenet for registering their global ids are not affected by this issue.

To disable registering global ids for datafiles, add the following entry to the “setting” table in your database:

INSERT INTO setting (name,content) VALUES (':FilePIDsEnabled', 'false');

(Or modify an existing entry if you currently have it set to “true”)

Note:

  • You will still be getting DOIs registered for your datasets (just not the files);
  • You will be able to easily register new DOIs for the published files that miss them, if desired, with an existing API. We’ll provide the instructions in the release notes, once the release with the fix is out.

@qqmyers
Member Author

qqmyers commented Jan 3, 2019

@landreev - could use the API call instead of db insert:
curl -X PUT -d 'false' http://localhost:8080/api/admin/settings/:FilePIDsEnabled

@landreev
Contributor

landreev commented Jan 3, 2019

@qqmyers that should be the recommended way yes; i have no idea why i did it w/ a db query.
i'll change it, if it hasn't gone out yet.

@mfenner

mfenner commented Jan 9, 2019

DataCite deployed a fix in the MDS API at mds.datacite.org that should also help with this issue: datacite/poodle#21


6 participants