Merge pull request #10632 from QualitativeDataRepository/datacite_plus_relPubRelType

DataciteXML changes Plus RelationType field
landreev authored Sep 23, 2024
2 parents 9a1e494 + c5f12b7 commit edae760
Showing 43 changed files with 2,487 additions and 768 deletions.
2 changes: 2 additions & 0 deletions conf/solr/schema.xml
Original file line number Diff line number Diff line change
@@ -352,6 +352,7 @@
<field name="productionPlace" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="publication" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="publicationCitation" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="publicationRelationType" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="publicationIDNumber" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="publicationIDType" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="publicationURL" type="text_en" multiValued="true" stored="true" indexed="true"/>
@@ -593,6 +594,7 @@
<copyField source="productionPlace" dest="_text_" maxChars="3000"/>
<copyField source="publication" dest="_text_" maxChars="3000"/>
<copyField source="publicationCitation" dest="_text_" maxChars="3000"/>
<copyField source="publicationRelationType" dest="_text_" maxChars="3000"/>
<copyField source="publicationIDNumber" dest="_text_" maxChars="3000"/>
<copyField source="publicationIDType" dest="_text_" maxChars="3000"/>
<copyField source="publicationURL" dest="_text_" maxChars="3000"/>
41 changes: 41 additions & 0 deletions doc/release-notes/10632-DataCiteXMLandRelationType.md
@@ -0,0 +1,41 @@
### Enhanced DataCite Metadata, Relation Type

A new field has been added to the citation metadatablock to allow entry of the "Relation Type" between a "Related Publication" and a dataset. The Relation Type is currently limited to the six most common values recommended by DataCite: IsCitedBy, Cites, IsSupplementTo, IsSupplementedBy, IsReferencedBy, and References. For existing datasets where no "Relation Type" has been specified, "IsSupplementTo" is assumed.
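
The defaulting rule described above can be sketched in a few lines of Python. This is an illustration only, not Dataverse's actual (Java) implementation: the function name `resolve_relation_type` and the constant names are hypothetical, and the six values are spelled as in the DataCite schema.

```python
# Hypothetical helper illustrating the defaulting rule from this release note.
# The names below are illustrative and do not come from the Dataverse code base.
ALLOWED_RELATION_TYPES = (
    "IsCitedBy",
    "Cites",
    "IsSupplementTo",
    "IsSupplementedBy",
    "IsReferencedBy",
    "References",
)
DEFAULT_RELATION_TYPE = "IsSupplementTo"


def resolve_relation_type(value=None):
    """Return the relation type to use for a Related Publication.

    A missing or empty value falls back to the default; any value outside
    the six supported types is rejected.
    """
    if value is None or value == "":
        return DEFAULT_RELATION_TYPE
    if value not in ALLOWED_RELATION_TYPES:
        raise ValueError(f"Unsupported relation type: {value!r}")
    return value
```

Calling `resolve_relation_type()` with no argument models an existing dataset whose Related Publication has no Relation Type set.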

Dataverse now supports the DataCite v4.5 schema. Additional metadata, including metadata about Related Publications and files in the dataset, is now sent to DataCite, and the representation of PIDs (ORCID, ROR, DOIs, etc.), license/terms, geospatial, and other metadata has been improved. The enhanced metadata is sent automatically when datasets are created and published and is available in the DataCite XML export after publication.

The additions roughly align with the OpenAIRE XML export, with some minor differences beyond the new Relation Type, including an update to the DataCite v4.5 schema. For details, see https://github.com/IQSS/dataverse/pull/10632 and https://github.com/IQSS/dataverse/pull/10615 and the [design document](https://docs.google.com/document/d/1JzDo9UOIy9dVvaHvtIbOI8tFU6bWdfDfuQvWWpC0tkA/edit?usp=sharing) referenced there.

Multiple backward-incompatible changes and bug fixes have been made to the four API calls (three of which were previously undocumented) for updating PID target URLs and metadata at the provider service:
- [Update Target URL for a Published Dataset at the PID provider](https://guides.dataverse.org/en/latest/admin/dataverses-datasets.html#update-target-url-for-a-published-dataset-at-the-pid-provider)
- [Update Target URL for all Published Datasets at the PID provider](https://guides.dataverse.org/en/latest/admin/dataverses-datasets.html#update-target-url-for-all-published-datasets-at-the-pid-provider)
- [Update Metadata for a Published Dataset at the PID provider](https://guides.dataverse.org/en/latest/admin/dataverses-datasets.html#update-metadata-for-a-published-dataset-at-the-pid-provider)
- [Update Metadata for all Published Datasets at the PID provider](https://guides.dataverse.org/en/latest/admin/dataverses-datasets.html#update-metadata-for-all-published-datasets-at-the-pid-provider)

Upgrade instructions
--------------------

The Solr schema has to be updated via the normal mechanism to add the new "publicationRelationType" field.
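
After updating the schema, one way to confirm the change took effect is to check that `schema.xml` declares both the `<field>` and the `<copyField>` entries for `publicationRelationType`, as added to `conf/solr/schema.xml` in this commit. The following is a minimal sketch; the function name `schema_has_field` and the inline `SAMPLE` document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Field name as added to conf/solr/schema.xml in this change.
FIELD_NAME = "publicationRelationType"


def schema_has_field(schema_xml, name=FIELD_NAME):
    """Return True if the schema text declares both a <field> and a
    <copyField> entry for the given field name."""
    root = ET.fromstring(schema_xml)
    has_field = any(f.get("name") == name for f in root.iter("field"))
    has_copy = any(c.get("source") == name for c in root.iter("copyField"))
    return has_field and has_copy


# Tiny stand-in for a real schema.xml (hypothetical, for illustration only).
SAMPLE = """<schema>
  <field name="publicationRelationType" type="text_en" multiValued="true" stored="true" indexed="true"/>
  <copyField source="publicationRelationType" dest="_text_" maxChars="3000"/>
</schema>"""

print(schema_has_field(SAMPLE))
```

For a real installation, the text would be read from the Solr core's `schema.xml`, whose location depends on the deployment.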

The citation metadatablock has to be reinstalled using the standard instructions.

With these two changes, the "Relation Type" fields will be available and creation/publication of datasets will result in the expanded XML being sent to DataCite.
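
As a rough illustration of what a Related Publication entry with a Relation Type looks like in dataset JSON, the sketch below mirrors the `publicationRelationType` structure added to `scripts/api/data/dataset-create-new-all-default-fields.json` in this commit. The helper names are hypothetical, and the enclosing compound field's `typeName` (`publication`), its `multiple` flag, and the `primitive` type class for `publicationCitation` are assumptions not shown in the diff.

```python
import json


def related_publication(citation, relation_type="IsSupplementTo"):
    """Build one Related Publication entry (hypothetical helper)."""
    return {
        "publicationRelationType": {
            "typeName": "publicationRelationType",
            "multiple": False,
            "typeClass": "controlledVocabulary",
            "value": relation_type,
        },
        "publicationCitation": {
            "typeName": "publicationCitation",
            "multiple": False,
            "typeClass": "primitive",  # assumption: not visible in the diff
            "value": citation,
        },
    }


def publication_field(entries):
    """Wrap entries in the enclosing compound field (names assumed)."""
    return {
        "typeName": "publication",  # assumption
        "multiple": True,           # assumption
        "typeClass": "compound",
        "value": list(entries),
    }


field = publication_field([related_publication("Doe, J. (2024). Example.", "Cites")])
print(json.dumps(field, indent=2))
```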

To update existing datasets (and files using DataCite DOIs):

Exports can be updated by running `curl http://localhost:8080/api/admin/metadata/reExportAll`

Entries at DataCite for published datasets can be updated by a superuser using an API call (newly documented):

`curl -X POST -H 'X-Dataverse-key:<key>' http://localhost:8080/api/datasets/modifyRegistrationPIDMetadataAll`

This will loop through all published datasets (and released files with PIDs). As long as the loop completes, the call will return a 200/OK response. Any PIDs for which the update fails can be found using

`grep 'Failure for id' server.log`
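
Where `grep` is not convenient, the same filter can be sketched in Python. Only the marker string `Failure for id` comes from these notes; the function name and the sample log lines are hypothetical, and the real layout of `server.log` entries may differ.

```python
# Marker string taken from the grep command above.
MARKER = "Failure for id"


def failed_lines(log_lines, marker=MARKER):
    """Return the log lines that contain the failure marker,
    with trailing newlines removed."""
    return [line.rstrip("\n") for line in log_lines if marker in line]


# Hypothetical log content; the actual server.log format is not shown here.
sample_log = [
    "INFO: PID metadata updated\n",
    "WARNING: Failure for id: 42\n",
    "INFO: Done\n",
]

for line in failed_lines(sample_log):
    print(line)
```

Against a real log, an open file object can be passed directly, for example `failed_lines(open("server.log"))`, since the function only iterates over lines.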

Failures may occur if PIDs were never registered, or if they were never made findable. Any such cases can be fixed manually in DataCite Fabrica or using the [Reserve a PID](https://guides.dataverse.org/en/latest/api/native-api.html#reserve-a-pid) API call and the newly documented `/api/datasets/<id>/modifyRegistration` call respectively. See https://guides.dataverse.org/en/latest/admin/dataverses-datasets.html#send-dataset-metadata-to-pid-provider. Please reach out with any questions.

PIDs can also be updated by a superuser on a per-dataset basis using

`curl -X POST -H 'X-Dataverse-key:<key>' http://localhost:8080/api/datasets/<id>/modifyRegistrationMetadata`
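
For scripted use, the two curl calls above can be expressed with Python's standard library. This sketch only builds the request objects and does not send them; the function name `build_update_request` and its parameters are hypothetical, while the endpoint paths and the `X-Dataverse-key` header are taken from the commands above.

```python
import urllib.request


def build_update_request(base_url, api_key, dataset_id=None):
    """Build a POST request for updating PID metadata at the provider.

    With no dataset_id the request targets all published datasets;
    otherwise it targets the single dataset with that id.
    """
    if dataset_id is None:
        path = "/api/datasets/modifyRegistrationPIDMetadataAll"
    else:
        path = f"/api/datasets/{dataset_id}/modifyRegistrationMetadata"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        method="POST",
        headers={"X-Dataverse-key": api_key},
    )


req = build_update_request("http://localhost:8080", "<key>", dataset_id=24)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would perform the call; as with the curl commands, the API key must belong to a superuser.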

35 changes: 32 additions & 3 deletions doc/sphinx-guides/source/admin/dataverses-datasets.rst
@@ -195,12 +195,41 @@ Mints a new identifier for a dataset previously registered with a handle. Only a
.. _send-metadata-to-pid-provider:

Send Dataset metadata to PID provider
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Update Target URL for a Published Dataset at the PID provider
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Forces update to metadata provided to the PID provider of a published dataset. Only accessible to superusers. ::
Forces an update to the target URL provided to the PID provider of a published dataset and ensures the PID is findable.
Only accessible to superusers. ::

  curl -H "X-Dataverse-key: $API_TOKEN" -X POST http://$SERVER/api/datasets/$dataset-id/modifyRegistration

Update Target URL for all Published Datasets at the PID provider
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Forces an update to the target URL provided to the PID provider of all published datasets and ensures each PID is findable.
Only accessible to superusers. ::

  curl -H "X-Dataverse-key: $API_TOKEN" -X POST http://$SERVER/api/datasets/modifyRegistrationAll

Update Metadata for a Published Dataset at the PID provider
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Checks to see that the PID metadata for a published dataset (and any released files in it using file PIDs)
is up-to-date at the provider and updates the metadata if necessary.
Only accessible to superusers. ::

  curl -H "X-Dataverse-key: $API_TOKEN" -X POST http://$SERVER/api/datasets/$dataset-id/modifyRegistrationMetadata

Update Metadata for all Published Datasets at the PID provider
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Checks to see that the PID metadata is up-to-date at the provider for all published datasets
(and any released files in them using file PIDs) and updates the metadata if necessary.
Only accessible to superusers. ::

  curl -H "X-Dataverse-key: $API_TOKEN" -X POST http://$SERVER/api/datasets/modifyRegistrationPIDMetadataAll

The call returns 200/OK as long as the call completes. Any errors for individual datasets are reported in the log.

Check for Unreserved PIDs and Reserve Them
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7 changes: 7 additions & 0 deletions doc/sphinx-guides/source/api/changelog.rst
@@ -7,6 +7,13 @@ This API changelog is experimental and we would love feedback on its usefulness.
:local:
:depth: 1

v6.4
----

- **/api/datasets/$dataset-id/modifyRegistration**: Changed from GET to POST
- **/api/datasets/modifyRegistrationPIDMetadataAll**: Changed from GET to POST


v6.3
----

12 changes: 8 additions & 4 deletions doc/sphinx-guides/source/installation/config.rst
@@ -232,6 +232,10 @@ Dataverse can be configured with one or more PID providers, each of which can mi
to manage an authority/shoulder combination, aka a "prefix" (PermaLinks also support custom separator characters as part of the prefix),
along with an optional list of individual PIDs (with different authority/shoulders) that can be managed with that account.

Dataverse automatically manages assigning PIDs and making them findable when datasets are published. There are also :ref:`API calls that
allow updating the PID target URLs and metadata of already-published datasets manually if needed <send-metadata-to-pid-provider>`, e.g. if a Dataverse instance is
moved to a new URL or when the software is updated to generate additional metadata or address schema changes at the PID service.

Testing PID Providers
+++++++++++++++++++++

@@ -246,11 +250,11 @@ configure the credentials as described below.

Alternately, you may wish to configure other providers for testing:

- EZID is available to University of California scholars and researchers. Testing can be done using the authority 10.5072 and shoulder FK2 with the "apitest" account (contact EZID for credentials) or an institutional account. Configuration in Dataverse is then analogous to using DataCite.
- EZID is available to University of California scholars and researchers. Testing can be done using the authority 10.5072 and shoulder FK2 with the "apitest" account (contact EZID for credentials) or an institutional account. Configuration in Dataverse is then analogous to using DataCite.

- The PermaLink provider, like the FAKE DOI provider, does not involve an external account.
Unlike the Fake DOI provider, the PermaLink provider creates PIDs that begin with "perma:", making it clearer that they are not DOIs,
and that do resolve to the local dataset/file page in Dataverse, making them useful for some production use cases. See :ref:`permalinks` and (for the FAKE DOI provider) the :doc:`/developers/dev-environment` section of the Developer Guide.
- The PermaLink provider, like the FAKE DOI provider, does not involve an external account.
Unlike the Fake DOI provider, the PermaLink provider creates PIDs that begin with "perma:", making it clearer that they are not DOIs,
and that do resolve to the local dataset/file page in Dataverse, making them useful for some production use cases. See :ref:`permalinks` and (for the FAKE DOI provider) the :doc:`/developers/dev-environment` section of the Developer Guide.

Provider-specific configuration is described below.

6 changes: 6 additions & 0 deletions scripts/api/data/dataset-create-new-all-default-fields.json
@@ -331,6 +331,12 @@
"typeClass": "compound",
"value": [
{
"publicationRelationType" : {
"typeName" : "publicationRelationType",
"multiple" : false,
"typeClass" : "controlledVocabulary",
"value" : "IsSupplementTo"
},
"publicationCitation": {
"typeName": "publicationCitation",
"multiple": false,