Publication fails/hangs as of ~late Dec. #5427
Comments
Further discussion with @landreev et al. in chat seems to confirm the hypothesis in the issue description (it's consistent with everyone's observations). The PR now sends the identifier being proposed to DataCite (or other provider). For DataCite, @landreev noted that, while /metadata/null (what was being sent) gets a 200 response, a /metadata/<non-existent ID with a slash and the basic form of a DOI with that separator> still gets a 404 response, so this PR, while it doesn't check the contents of the response, should now detect when a generated id already exists in DataCite. FWIW: we also noted that the response to /metadata/null returns JSON, despite an Accept: application/xml header being sent with the request. (The 200 response for an existing DOI is in XML, as expected.)
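The status-code behaviour described in this comment can be summarised in a small sketch. This is illustrative Python, not Dataverse's Java code: `Lookup` and `classify_status` are hypothetical names, and treating anything other than 200 or 404 as inconclusive is an assumption, not something the PR is stated to do.

```python
from enum import Enum


class Lookup(Enum):
    """Possible outcomes of asking the provider about an identifier."""
    EXISTS = "exists"
    AVAILABLE = "available"
    INCONCLUSIVE = "inconclusive"


def classify_status(status_code: int) -> Lookup:
    """Map the HTTP status of a /metadata/<id> lookup to an outcome.

    200 -> the identifier is already registered
    404 -> the identifier is not registered, so it can be used
    anything else (e.g. a 502) -> no conclusion can be drawn (assumption)
    """
    if status_code == 200:
        return Lookup.EXISTS
    if status_code == 404:
        return Lookup.AVAILABLE
    return Lookup.INCONCLUSIVE
```

Keeping the third outcome separate would avoid reading a gateway error as "this id is taken", though whether that distinction is worth making is a design choice for the maintainers.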
@qqmyers Thanks again for the PR.
@kcondon The reason this has started happening just recently is a new behavior on the DataCite API side; that must have been introduced around the time they were addressing their service instability (mid Dec.). That in turn triggered a bug we had in our implementation all along.
As an immediate response to the prod. issues, we've disabled file-level registration in our Dataverse, and we are going to advise all the other installations using DataCite to do the same.

To all the Dataverse installations using DataCite: if you have registration of DOIs for datafiles enabled (the default behavior), you are likely to experience issues publishing datasets. So we strongly recommend disabling file-level registration until the issue is fixed (in the next release). Dataverses that are using EZID or Handlenet for registering their global ids are not affected by this issue.

To disable registering global ids for datafiles, add the following entry to the “setting” table in your database:

INSERT INTO setting (name,content) VALUES (':FilePIDsEnabled', 'false')

(Or modify an existing entry if you currently have it set to “true”.)

Note:
@landreev - could use the API call instead of db insert:
@qqmyers that should be the recommended way, yes; I have no idea why I did it w/ a db query.
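The exact API call referred to here is not preserved in this thread. As a rough sketch, assuming the Dataverse admin settings endpoint `/api/admin/settings/<name>` on a default `localhost:8080` installation, the same change could be prepared like this (the function name and the base URL are assumptions for illustration):

```python
import urllib.request


def build_disable_file_pids_request(base_url="http://localhost:8080"):
    """Build (but do not send) a PUT that sets :FilePIDsEnabled to false.

    Assumes the admin settings endpoint lives at /api/admin/settings/<name>
    and is reachable at base_url, which is typical only from the server itself.
    """
    url = f"{base_url}/api/admin/settings/:FilePIDsEnabled"
    return urllib.request.Request(url, data=b"false", method="PUT")
```

Passing the returned request to `urllib.request.urlopen` on the Dataverse host would apply the setting; check the result against your own installation before relying on it.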
DataCite deployed a fix in the MDS API at mds.datacite.org that should also help with this issue: datacite/poodle#21 |
There are several reports of publications failing, potentially intermittently, for develop and older versions. From looking into this, I think the issue (or at least one of them) is a bug in Dataverse in which the check when generating GlobalIds for files has been sending a null value instead of the proposed identifier. I suspect that a change at DataCite, which would have changed its response to a call to /metadata/null from an error to a 200 status code (looks like they send page 1 of the list of 16M DOIs now), has uncovered this. If DataCite responds quickly, Dataverse now sits in a while loop generating new IDs over and over, as each 200 response is interpreted as the new ID already existing. (If DataCite is bogged down, perhaps due to lots of calls to check null identifiers from somewhere..., it may instead respond with a 502 status, which I've also seen, and which then causes publication to fail. This case shows exceptions in the log, whereas the first doesn't ever report a problem and just hangs for minutes until some timeout occurs.)
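The hang described above can be reproduced in miniature. The following is a simulation in Python, not the actual Dataverse code: `fake_metadata_status` stands in for the provider, the identifiers are placeholders, and the retry cap exists only so the buggy case terminates here (the behaviour reported in this issue is a loop that keeps going until a timeout).

```python
def fake_metadata_status(identifier):
    """Stand-in for GET /metadata/<identifier>; not the real DataCite API."""
    registered = {"10.5072/FK2/EXISTING"}  # placeholder DOI
    if identifier is None:
        return 200  # the reported new behaviour: /metadata/null answers 200
    return 200 if identifier in registered else 404


def buggy_already_exists(candidate):
    # Bug: the candidate is ignored and a null identifier is looked up instead.
    return fake_metadata_status(None) == 200


def fixed_already_exists(candidate):
    # Fix: look up the identifier that is actually being proposed.
    return fake_metadata_status(candidate) == 200


def generate_identifier(already_exists, max_attempts=5):
    """Propose identifiers until one is reported as unused.

    Returns (identifier, attempts), or (None, max_attempts) if every
    proposal was reported as already existing.
    """
    for attempt in range(1, max_attempts + 1):
        candidate = f"10.5072/FK2/NEW{attempt}"  # placeholder generator
        if not already_exists(candidate):
            return candidate, attempt
    return None, max_attempts
```

With `buggy_already_exists` every proposal is rejected and the loop only stops at the cap; with `fixed_already_exists` the first proposal is accepted.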
I think a useful fix would be to correct the DV bug by sending the newly generated ID, i.e. by refactoring the GlobalIdServiceBeans to test alreadyExists(GlobalId), so that the ID doesn't have to belong to a DVObject before it can be tested. Hoping to do this now and be able to backport to v4.9.4...