Datacite xml improvements #10615

Closed
65 commits
4878cfe
separate metadata parsing/params from XML generation code
qqmyers May 3, 2024
68792c2
extract some common xml writing util code
qqmyers May 3, 2024
1a46155
note duplicate method
qqmyers May 3, 2024
ace656c
remove xml template doc, refactor to generate xml, adding OA fields
qqmyers May 3, 2024
dba03e2
refactor source of XML info
qqmyers May 3, 2024
af3e24b
add code to get raw alphanumeric pid value
qqmyers May 3, 2024
fa23884
remove duplicate method
qqmyers May 3, 2024
0d22d6c
dates, resourceType, alternate Ids
qqmyers May 3, 2024
d69bf41
more methods
qqmyers May 8, 2024
04b367f
only one field to look for
qqmyers May 15, 2024
003431d
use common util method
qqmyers May 15, 2024
fea2f5e
access rights descriptions, geolocations, funding refs
qqmyers May 15, 2024
3c52b6a
altTitles npe
qqmyers May 17, 2024
bab2a0d
fixes and test
qqmyers May 18, 2024
3cca63d
fix for empty rel pub entry
qqmyers May 20, 2024
30c80a9
bugs: remove bad nesting, dupe values
qqmyers May 20, 2024
a2acdeb
add XML Validation to test
qqmyers May 20, 2024
3ec7a0b
fix contributorType
qqmyers May 23, 2024
842dee6
add geolocations element and multiple geolocation
qqmyers May 23, 2024
81a7c4a
typos
qqmyers May 23, 2024
ed5eab0
try execute inside the main method
qqmyers May 24, 2024
39673f0
Fix subject, keyword
qqmyers May 24, 2024
36097d6
fix geo coverage
qqmyers May 24, 2024
a5d3b3e
adjust funders to include grant number, add xml escaping for description
qqmyers May 24, 2024
8a12444
bug: add dataset descriptions
qqmyers May 24, 2024
f3e5dc1
typo, add xml escape for funder
qqmyers May 24, 2024
5610c95
still typo
qqmyers May 24, 2024
7148b03
mark contact as deprecated - unused
qqmyers May 24, 2024
0470459
more fixes
qqmyers May 24, 2024
c0265da
catch parseexception
qqmyers May 24, 2024
2ff8678
fix alternateIdentifier, related PID parsing, series
qqmyers May 24, 2024
182f3d7
catch PID update exception to avoid corrupt dataset
qqmyers May 24, 2024
be90355
try long sleep
qqmyers May 24, 2024
e458e8c
set dv released before pid publicize, go back to short time
qqmyers May 24, 2024
27fe7b4
always use latest version for copy
qqmyers May 24, 2024
00a3830
handle deaccession, fix relatedIDtype for files
qqmyers May 28, 2024
1faf0cd
missed assignment for title
qqmyers May 28, 2024
23dd581
fix creator for deaccessioned
qqmyers May 28, 2024
3bbd2e9
correct fix for creators when deaccessioned
qqmyers May 28, 2024
4def6da
remove bad value and lang
qqmyers May 28, 2024
eac477e
add creatorName sub element for deaccession/no names case
qqmyers May 28, 2024
154ac8a
typo
qqmyers May 28, 2024
9144f6c
fix resourceType - always 1 entry
qqmyers May 28, 2024
a5870fb
Also handle file case for resourceType
qqmyers May 28, 2024
24db2af
missed changes
qqmyers May 31, 2024
f0fd61a
simplify - util checks for null and empty
qqmyers May 31, 2024
ead153f
typo in DOI parsing logic
qqmyers Jun 10, 2024
ea75216
only files in latestversionforcopy
qqmyers Jun 10, 2024
33f8f30
Merge remote-tracking branch 'IQSS/develop' into datacite_xml_improve…
qqmyers Jun 14, 2024
b6bd530
fix date parsing, clear bad values
qqmyers Jun 11, 2024
347971f
skip blanks in geo place name entries
qqmyers Jun 17, 2024
bc5686b
use ; to separate kindOfData / resourceTypes
qqmyers Jun 17, 2024
db934bb
add Time Period as Other Date
qqmyers Jun 18, 2024
9efe597
support available and updated dates for dataset and file
qqmyers Jun 18, 2024
de37314
fix file updated logic
qqmyers Jun 18, 2024
0c2cff1
add HasPart rels - logic issue
qqmyers Jun 24, 2024
a177a08
catch additional exception type
qqmyers Jun 24, 2024
08843d8
missing imports, null check
qqmyers Jun 26, 2024
c4df868
Merge remote-tracking branch 'IQSS/develop' into datacite_xml_improve…
qqmyers Jun 26, 2024
1cceab1
Merge remote-tracking branch 'IQSS/develop' into datacite_xml_improve…
qqmyers Jul 19, 2024
6c87b9e
fix ROR identification
qqmyers Jul 5, 2024
182cfdd
passthrough for ext cvv/ROR affiliation update
qqmyers Jul 9, 2024
f5326c9
Merge remote-tracking branch 'IQSS/develop' into
qqmyers Sep 3, 2024
ea373af
Merge remote-tracking branch 'IQSS/develop' into
qqmyers Sep 6, 2024
561fd2c
fix test
qqmyers Sep 6, 2024
19 changes: 19 additions & 0 deletions src/main/java/edu/harvard/iq/dataverse/DataFile.java
@@ -1123,4 +1123,23 @@ private boolean tagExists(String tagLabel) {
         }

Review comment (Member):
I'll just leave this comment here on the first file that is changed...
Can we please get a release note?

         return false;
     }
+
+    public boolean isDeaccessioned() {
+        // return true, if all published versions were deaccessioned
+        boolean inDeaccessionedVersions = false;
+        for (FileMetadata fmd : getFileMetadatas()) {
+            DatasetVersion testDsv = fmd.getDatasetVersion();
+            if (testDsv.isReleased()) {
+                return false;
+            }
+            // Also check for draft version
+            if (testDsv.isDraft()) {
+                return false;
+            }
+            if (testDsv.isDeaccessioned()) {
+                inDeaccessionedVersions = true;
+            }
+        }
+        return inDeaccessionedVersions; // since any published version would have already returned
+    }
 } // end of class
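
To make the behaviour of the new `isDeaccessioned()` loop easier to review, here is a minimal standalone sketch of the same decision logic. It is not the Dataverse code: `VersionState` and `DeaccessionSketch` are hypothetical stand-ins that reduce each `DatasetVersion` to a single state, on the assumption that only the released/draft/deaccessioned checks matter.

```java
import java.util.List;

// Hypothetical stand-in for DatasetVersion: only the version state is modelled.
enum VersionState { DRAFT, RELEASED, DEACCESSIONED }

final class DeaccessionSketch {

    // Same shape as the loop in the diff: any released or draft version
    // short-circuits to false; otherwise at least one deaccessioned
    // version must have been seen.
    static boolean isDeaccessioned(List<VersionState> versions) {
        boolean inDeaccessionedVersions = false;
        for (VersionState state : versions) {
            if (state == VersionState.RELEASED || state == VersionState.DRAFT) {
                return false;
            }
            if (state == VersionState.DEACCESSIONED) {
                inDeaccessionedVersions = true;
            }
        }
        return inDeaccessionedVersions;
    }

    public static void main(String[] args) {
        // Only deaccessioned versions: prints true
        System.out.println(isDeaccessioned(List.of(VersionState.DEACCESSIONED, VersionState.DEACCESSIONED)));
        // A draft alongside a deaccessioned version: prints false
        System.out.println(isDeaccessioned(List.of(VersionState.DEACCESSIONED, VersionState.DRAFT)));
        // No versions at all: prints false
        System.out.println(isDeaccessioned(List.of()));
    }
}
```

Note that a file with no versions at all is reported as not deaccessioned, because the flag is only set once a deaccessioned version is seen.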
@@ -157,6 +157,8 @@ public class DatasetFieldConstant implements java.io.Serializable {
     public final static String confidentialityDeclaration="confidentialityDeclaration";
     public final static String specialPermissions="specialPermissions";
     public final static String restrictions="restrictions";
+    @Deprecated
+    //Doesn't appear to be used and is not datasetContact
     public final static String contact="contact";
     public final static String citationRequirements="citationRequirements";
     public final static String depositorRequirements="depositorRequirements";
@@ -947,7 +947,7 @@ public void callFinalizePublishCommandAsynchronously(Long datasetId, CommandCont
         try {
             Thread.sleep(1000);
         } catch (Exception ex) {
-            logger.warning("Failed to sleep for a second.");
+            logger.warning("Failed to sleep for one second.");
         }
         logger.fine("Running FinalizeDatasetPublicationCommand, asynchronously");
         Dataset theDataset = find(datasetId);
2 changes: 1 addition & 1 deletion src/main/java/edu/harvard/iq/dataverse/DatasetVersion.java
@@ -1342,7 +1342,7 @@ public List<String[]> getGeographicCoverage() {
                     }
                     geoCoverages.add(coverageItem);
                 }
-
+                break;
             }
         }
         return geoCoverages;
@@ -12,7 +12,9 @@ public enum ExternalIdentifier {
     GND("GND", "https://d-nb.info/gnd/%s", "^1[01]?\\d{7}[0-9X]|[47]\\d{6}-\\d|[1-9]\\d{0,7}-[0-9X]|3\\d{7}[0-9X]$"),
     // note: DAI is missing from this list, because it doesn't have resolvable URL
     ResearcherID("ResearcherID", "https://publons.com/researcher/%s/", "^[A-Z\\d][A-Z\\d-]+[A-Z\\d]$"),
-    ScopusID("ScopusID", "https://www.scopus.com/authid/detail.uri?authorId=%s", "^\\d*$");
+    ScopusID("ScopusID", "https://www.scopus.com/authid/detail.uri?authorId=%s", "^\\d*$"),
+    //Requiring ROR to be URL form as we use it where there is no id type field and matching any 9 digit number starting with 0 seems a bit aggressive
+    ROR("ROR", "https://ror.org/%s", "^(https:\\/\\/ror.org\\/)0[a-hj-km-np-tv-z|0-9]{6}[0-9]{2}$");
 
     private String name;
     private String template;
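
The new ROR pattern can be exercised in isolation. The sketch below copies the regex string verbatim from the diff and runs it through `java.util.regex`. `RorPatternSketch` and `isRorUrl` are hypothetical names, and using `Matcher.matches()` is an assumption about how `ExternalIdentifier` applies its patterns. The sample identifiers are illustrative inputs chosen to fit or break the pattern, not values taken from the PR.

```java
import java.util.regex.Pattern;

final class RorPatternSketch {

    // Regex string copied verbatim from the ROR enum entry in the diff.
    static final Pattern ROR =
            Pattern.compile("^(https:\\/\\/ror.org\\/)0[a-hj-km-np-tv-z|0-9]{6}[0-9]{2}$");

    // Hypothetical helper: true when the whole value matches the pattern.
    static boolean isRorUrl(String value) {
        return ROR.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isRorUrl("https://ror.org/03vek6s52")); // true: full URL form
        System.out.println(isRorUrl("03vek6s52"));                 // false: bare id, URL prefix is required
        System.out.println(isRorUrl("https://ror.org/0ovek6s52")); // false: 'o' is outside the character class
        System.out.println(isRorUrl("https://ror.org/0|vek6s52")); // true: '|' inside [...] is a literal pipe
        System.out.println(isRorUrl("https://rorXorg/03vek6s52")); // true: the '.' in ror.org is unescaped
    }
}
```

The last two cases suggest the pattern is slightly more permissive than the comment implies. Dropping the `|` from the character class and escaping the dot would tighten it, if that is the intent.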
7 changes: 7 additions & 0 deletions src/main/java/edu/harvard/iq/dataverse/GlobalId.java
@@ -100,6 +100,13 @@ public String asURL() {
         }
         return null;
     }
+
+    public String asRawIdentifier() {
+        if (protocol == null || authority == null || identifier == null) {
+            return "";
+        }
+        return authority + separator + identifier;
+    }
 
 
 
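
As a quick illustration of what `asRawIdentifier()` returns, the following sketch reproduces the method on a hypothetical `RawIdSketch` class with the same four fields. The constructor and the sample values (`doi`, `10.5072`, `/`, `FK2/ABC123`) are assumptions for illustration and do not come from the PR.

```java
// Hypothetical stand-in for GlobalId, holding only the fields the new method reads.
final class RawIdSketch {
    private final String protocol;
    private final String authority;
    private final String separator;
    private final String identifier;

    RawIdSketch(String protocol, String authority, String separator, String identifier) {
        this.protocol = protocol;
        this.authority = authority;
        this.separator = separator;
        this.identifier = identifier;
    }

    // Same body as the method added in the diff: the protocol is checked
    // for null but is not part of the returned value.
    String asRawIdentifier() {
        if (protocol == null || authority == null || identifier == null) {
            return "";
        }
        return authority + separator + identifier;
    }

    public static void main(String[] args) {
        // Prints 10.5072/FK2/ABC123 (no "doi:" prefix and no resolver URL)
        System.out.println(new RawIdSketch("doi", "10.5072", "/", "FK2/ABC123").asRawIdentifier());
        // Prints an empty line because the authority is missing
        System.out.println(new RawIdSketch("doi", null, "/", "FK2/ABC123").asRawIdentifier());
    }
}
```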
@@ -211,7 +211,7 @@ public Dataset execute(CommandContext ctxt) throws CommandException {
 
         if (theDataset.getLatestVersion().getVersionState() != RELEASED) {
             // some imported datasets may already be released.
-
+            theDataset.getLatestVersion().setVersionState(RELEASED);
             if (!datasetExternallyReleased) {
                 publicizeExternalIdentifier(theDataset, ctxt);
                 // Will throw a CommandException, unless successful.
@@ -220,7 +220,6 @@ public Dataset execute(CommandContext ctxt) throws CommandException {
                 // a failure - it will remove any locks, and it will send a
                 // proper notification to the user(s).
             }
-            theDataset.getLatestVersion().setVersionState(RELEASED);
         }
 
         final Dataset ds = ctxt.em().merge(theDataset);
@@ -7,6 +7,7 @@
 import io.gdcc.spi.export.ExportException;
 import io.gdcc.spi.export.Exporter;
 import io.gdcc.spi.export.XMLExporter;
+import edu.harvard.iq.dataverse.pidproviders.doi.XmlMetadataTemplate;
 import edu.harvard.iq.dataverse.util.BundleUtil;
 import java.io.IOException;
 import java.io.OutputStream;
@@ -20,11 +21,7 @@
  */
 @AutoService(Exporter.class)
 public class DataCiteExporter implements XMLExporter {
-
-    private static String DEFAULT_XML_NAMESPACE = "http://datacite.org/schema/kernel-3";
-    private static String DEFAULT_XML_SCHEMALOCATION = "http://datacite.org/schema/kernel-3 http://schema.datacite.org/meta/kernel-3/metadata.xsd";
-    private static String DEFAULT_XML_VERSION = "3.0";
-
+
     public static final String NAME = "Datacite";
 
     @Override
@@ -60,17 +57,17 @@ public Boolean isAvailableToUsers() {
 
     @Override
     public String getXMLNameSpace() {
-        return DataCiteExporter.DEFAULT_XML_NAMESPACE;
+        return XmlMetadataTemplate.XML_NAMESPACE;
     }
 
     @Override
     public String getXMLSchemaLocation() {
-        return DataCiteExporter.DEFAULT_XML_SCHEMALOCATION;
+        return XmlMetadataTemplate.XML_SCHEMA_LOCATION;
     }
 
     @Override
     public String getXMLSchemaVersion() {
-        return DataCiteExporter.DEFAULT_XML_VERSION;
+        return XmlMetadataTemplate.XML_SCHEMA_VERSION;
     }
 
 }