Dataset Citation: provide flexible options for information displayed in citation metadata #2297

Closed
sbarbosadataverse opened this issue Jun 30, 2015 · 39 comments
Labels
Feature: Metadata Type: Suggestion an idea User Role: Curator Curates and reviews datasets, manages permissions

Comments

@sbarbosadataverse

Several Dataverse users have requested more flexibility in what is displayed in dataset citations: not so much changing the order in which information is displayed, but choosing what is displayed in the citation in the first place.
@mcrosas, please add any additional thoughts on this. Which milestone should this go in?

@sbarbosadataverse sbarbosadataverse added the Type: Feature a feature request label Jun 30, 2015
@mercecrosas mercecrosas added this to the In Design milestone Jul 2, 2015
@mercecrosas
Member

We should bring back a data citation "widget" or tool that allows users to configure some of the fields in the data citation, in particular:

  • Date: select whether it is the published date, the distribution date, or another date
  • Repository and distributor: select whether the citation should display distributor/producer in addition to repository

To be reviewed with @eaquigley

@mercecrosas mercecrosas added Priority: Medium Type: Suggestion an idea and removed Type: Feature a feature request labels Jul 2, 2015
@eaquigley
Contributor

@mcrosas would this be something we could add to the "selecting metadata fields" portion of the general information page of a dataverse or should this be on the dataset level?

@posixeleni
Contributor

Related or Duplicate: #2146

@mercecrosas
Member

More interest in this from HMS: "dataset collection year would be much more useful."

@posixeleni
Contributor

Met with @eaquigley @sbarbosadataverse @scolapasta to plan out this feature.

An FRD will need to be created, but at a high level we need to be more flexible in allowing users to select a date for the citation other than the default publication year. This is especially important for historical datasets.

Admins will be able to set, at the dataverse level, how they want their data citations to display.

@mercecrosas mercecrosas modified the milestones: In Design, In Review Nov 30, 2015
@mheppler mheppler changed the title flexible display of Citation metadata: provide flexible options for information displayed in citation metadata Dataset Citation: provide flexible options for information displayed in citation metadata Jan 28, 2016
@scolapasta scolapasta removed this from the Not Assigned to a Release milestone Jan 28, 2016
@posixeleni
Contributor

Just spoke with @scolapasta and @eaquigley: there may be a use case where someone would like to use different dates depending on the dataset in their dataverse, rather than just one kind of date across the board. For example, in 3.6 we allowed people to use either the distribution date or the production date for the citation, so they could have two different kinds of dates in the citations within a single dataverse.

@scolapasta
Contributor

We should also make this consistent with facets and metadatablocks and have inheritance for this. So a checkbox to say this is the "citation customization root", or something like that.

If this is something stored on dvobject, then it could be inherited by datasets by default, but you could override for a specific dataset (if we encounter a use case like @posixeleni described).
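The inheritance idea above could look roughly like the following Python sketch. This is only an illustration under stated assumptions: `CitationSettings`, `DvObject`, and `effective_citation_settings` are hypothetical names, not Dataverse's actual classes or schema, and the setting names are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CitationSettings:
    """Hypothetical citation options; not Dataverse's actual schema."""
    date_field: str = "publicationDate"
    show_producer: bool = False


@dataclass
class DvObject:
    """Stand-in for a dataverse or dataset in the containment tree."""
    name: str
    parent: Optional["DvObject"] = None
    # None means "inherit from the parent" (i.e. not a customization root).
    citation_settings: Optional[CitationSettings] = None

    def effective_citation_settings(self) -> CitationSettings:
        # Walk up toward the root until some object defines its own settings.
        node: Optional["DvObject"] = self
        while node is not None:
            if node.citation_settings is not None:
                return node.citation_settings
            node = node.parent
        return CitationSettings()  # nothing set anywhere: use the defaults


root = DvObject("root")
history = DvObject(
    "history",
    parent=root,
    citation_settings=CitationSettings(date_field="productionDate"),
)
inherited = DvObject("census-1850", parent=history)
overridden = DvObject(
    "survey-2015",
    parent=history,
    citation_settings=CitationSettings(date_field="distributionDate"),
)
```

Here `inherited` picks up the production date from its dataverse, while `overridden` uses its own setting, which is the per-dataset override case described above.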

@kcondon
Contributor

kcondon commented Jun 28, 2016

OK, backend changes are in, but please note that the additional fields need to be ordered; currently they are not, or the order is unclear. This will need to be decided both in the backend (order column) and in the UI.

@mheppler
Contributor

Related to #2146.

@jggautier
Contributor

There's recent discussion about this issue in this Google Groups thread.

@pdurbin pdurbin added User Role: Curator Curates and reviews datasets, manages permissions and removed zTriaged labels Jun 30, 2017
@RightInTwo
Contributor

Well, don't trust me too much - I'm just trusting that ILO page :)

Another example where it is more clear: https://doi.org/10.17632/ym23rrm63f.1

On the landing page, Mendeley prompts me to cite the data with:

van Veldhuizen, Roel (2017), “Data and Analysis Files for "Clean up your own Mess"”, Mendeley Data, v1, http://dx.doi.org/10.17632/ym23rrm63f.1

While I think that it's not necessary to replicate this exactly, as I would remove the "dx." and use https for the DOI link, I think the main info author/year/title/distributor/doi should be what we display to our users as well.
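To make the suggestion concrete, here is a minimal Python sketch of a citation built from author/year/title/distributor/doi, with the DOI link normalized to https and without "dx.". The function names and the exact punctuation are assumptions for illustration; this is not how Dataverse or Mendeley actually generate citations.

```python
import re


def normalize_doi_url(doi_or_url: str) -> str:
    """Return an https://doi.org/ link, dropping any legacy "dx." host prefix."""
    bare = re.sub(r"^(?:https?://)?(?:dx\.)?doi\.org/", "", doi_or_url.strip())
    bare = re.sub(r"^doi:", "", bare, flags=re.IGNORECASE)
    return f"https://doi.org/{bare}"


def format_citation(author: str, year: int, title: str,
                    distributor: str, doi: str, version: str = "") -> str:
    """Join the main citation elements; the version is included only if given."""
    parts = [f"{author} ({year})", f'"{title}"', distributor]
    if version:
        parts.append(version)
    parts.append(normalize_doi_url(doi))
    return ", ".join(parts)


citation = format_citation(
    author="van Veldhuizen, Roel",
    year=2017,
    title='Data and Analysis Files for "Clean up your own Mess"',
    distributor="Mendeley Data",
    doi="http://dx.doi.org/10.17632/ym23rrm63f.1",
    version="v1",
)
```

The point of the sketch is that `distributor` is an explicit input rather than something derived from the installation's root name.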

@jggautier
Contributor

jggautier commented Dec 5, 2019

Just for clarification, Mendeley Data is the repository, isn't it? Is the citation above an example of how citations should look when no producer name is provided, so the repository name is used instead?

@RightInTwo
Contributor

RightInTwo commented Dec 5, 2019

@jggautier Hey Julian, nice to see you! Well yes, Mendeley Data is the repository where the data actually resides and where the doi lookup points to. But in Dataverse, I didn't know a field "repository" exists in the metadata. Aren't we talking about "distributor"?

In any case, the root name can be used as the default, but if I explicitly provide a different producer/distributor/repository/which-ever-field-is-correct, I want the citation to reflect that.

Another example: The field $.publisher at https://api.datacite.org/dois/application/vnd.datacite.datacite+json/10.7802/1.2121 is what I would expect in the citation when I use it to populate the respective field in the dataverse metadata.
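As a rough sketch of what "use $.publisher" means in code, the Python helper below reads the `publisher` key from an already-parsed DataCite JSON record. The sample records are made-up placeholders, not the real response for the DOI above, and the object form with a `name` key is an assumption about how some serializations might represent the publisher.

```python
from typing import Optional


def publisher_name(datacite_record: dict) -> Optional[str]:
    """Return the publisher from a parsed DataCite JSON record, if present."""
    publisher = datacite_record.get("publisher")
    if isinstance(publisher, dict):
        # Assumed variant: publisher given as an object with a "name" key.
        return publisher.get("name") or None
    if isinstance(publisher, str):
        return publisher.strip() or None
    return None


# Made-up placeholder records for illustration only.
record_with_string = {"titles": [{"title": "Example"}], "publisher": "Example Data Archive"}
record_with_object = {"publisher": {"name": "Example Data Archive"}}
record_without = {"titles": [{"title": "Example"}]}
```

When the helper returns `None`, a caller could fall back to the root name, which matches the default behaviour described above.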

@pdurbin
Member

pdurbin commented Dec 5, 2019

@RightInTwo here's a thought. What if you create a dataset in https://github.com/IQSS/dataverse-sample-data that illustrates which Dataverse metadata fields you'd use? You could create the dataset using https://demo.dataverse.org and then I could help you export the dataset as JSON and get it into that "sample data" repo. Actually, a good first step would probably be for you to create an issue at https://github.com/IQSS/dataverse-sample-data/issues to explain how the dataset comes from somewhere else, etc.

@jggautier
Contributor

jggautier commented Dec 5, 2019

Hey @RightInTwo. No there's no metadata field called "repository", as you've probably already confirmed :)

I think I was confused because I forgot that you'd like to index the metadata of datasets that will continue to live outside of Dataverse (similar to OAI-PMH harvesting, but you can't use that, as you've written elsewhere). So I agree that showing the root repository's name in the citation in the search results would be wrong when the data is actually in another repository.

I agree with @pdurbin about seeing which metadata fields you'd use.

@RightInTwo
Contributor

RightInTwo commented Dec 5, 2019

@pdurbin @jggautier It is just "publisher" (same in dublin core, datacite, native dvn) that would need to be accepted by dataverse on the ddi import (which is afaik still the only way to get existing dois into the system). That would actually be enough for our purpose, but it might make sense to also make the field editable in the gui and other apis (which might accept existing dois in the future?) for more diverse use cases.

@pdurbin
Member

pdurbin commented Dec 5, 2019

> ddi import (which is afaik still the only way to get existing dois into the system)

In addition to DDI, you can also get existing DOIs into Dataverse with JSON: http://guides.dataverse.org/en/4.18.1/api/native-api.html#import-a-dataset-into-a-dataverse

One can also get existing DOIs into Dataverse by harvesting them via OAI-PMH.
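For the JSON route, a minimal Python sketch of building such an import request is shown below. It follows the `:import` endpoint with `pid` and `release` parameters as described in the linked guide, but the helper name, the `"no"` value for an unreleased import, the `Content-Type` header, and the placeholder alias, token, and DOI are all assumptions. The request is only constructed here, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_import_request(server_url: str, dataverse_alias: str, pid: str,
                         api_token: str, dataset_json: bytes,
                         release: bool = True) -> Request:
    """Build (but do not send) a POST to the native API dataset import endpoint."""
    query = urlencode({"pid": pid, "release": "yes" if release else "no"})
    url = (f"{server_url.rstrip('/')}/api/dataverses/"
           f"{quote(dataverse_alias)}/datasets/:import?{query}")
    return Request(
        url,
        data=dataset_json,
        method="POST",
        headers={"X-Dataverse-key": api_token,
                 "Content-Type": "application/json"},
    )


# Placeholder values for illustration only.
request = build_import_request(
    server_url="https://demo.dataverse.org",
    dataverse_alias="root",
    pid="doi:10.5072/FK2/ABC",
    api_token="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    dataset_json=b"{}",
)
```

Passing the result to `urllib.request.urlopen(request)` would actually perform the import, so that step is left out of the sketch.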

@scolapasta
Contributor

In terms of harvesting (i.e. allowing for search of datasets in other repositories; no dataset page available through Dataverse*), we had always talked about not generating citations (since it's really not our responsibility) and having the citation be one of the things we actually harvest. (currently we do generate a citation using the distributor as the publisher)

(*) which is how it should be when the data is actually published somewhere else

@RightInTwo
Contributor

RightInTwo commented Dec 5, 2019

@scolapasta

> having the citation be one of the things we actually harvest

Well, being able to import the whole citation would be even better for our use case! Then we could just sync the whole citation in our own format.

But there is a drawback. In the Harvard Subscription Data Dataverse (and we are planning something similar), you'd not be able to set the correct publisher for those datasets, as the Harvard Dataverse is the authority for that metadata and no citation can be imported.

@pdurbin

> In addition to DDI, you can also get existing DOIs into Dataverse with JSON

Perfect! Maybe that has always worked and I just missed the &release=yes in my code 🐛
Why we can't simply use OAI-PMH harvesting is discussed in #5402. Though, yes, I'm sorry for just ignoring the main way of metadata exchange between repositories :D

@RightInTwo
Contributor

@pdurbin @scolapasta @jggautier Thanks for the fruitful discussion! Can we maybe wrap it up in some way? I always hesitate to wake issues like this from the stale pile, because I know it takes a lot of effort from everyone involved to think about all the dependencies for such features so close to the core.

@jggautier
Contributor

jggautier commented Dec 12, 2019

It might be helpful to summarize the needs related to changing dataset citations discussed in this issue and related needs discussed in other issues. Please feel free to suggest edits or additions:

  1. As a researcher, I want others to cite my data in a way that acknowledges the producer that funded or otherwise supported the research (in addition to the Dataverse repository responsible for its preservation).
  2. As a researcher, I want others to cite my data in a way that acknowledges when the data was first published (as opposed to when it was first published in the Dataverse repository that it's published in now).
    • A citation's publication date is often interpreted as the date when the data was collected or when the research was done, and having the publication date be updated as it moves from one repository to another can be misleading.
  3. As a curator, I want others to cite data in a way that acknowledges who has responsibility for preserving the data files.
    • Outside of OAI-PMH harvesting, Dataverse effectively assumes that it is responsible for preserving the data files of any dataset metadata that it indexes, even when those files are preserved in another repository.

Discussed in other issues:

  1. As a researcher, I want others to be able to cite a particular version of my dataset regardless of where the data lives. See: "As an admin of another repository system migrating dataset versions to Dataverse, I would like to include existing dataset version numbers so that my version numbers are persistent across repositories" #4570
    • If versions 1 and 2 of my dataset are published in one repository, then the dataset is moved to another repository, I need people to know that the latest version of the dataset published in the new repository is version 2. Right now, moving a dataset from one repository into a Dataverse repository effectively resets the version number displayed in the citation. This is also an issue with depositing software.
  2. As a researcher, I want the publication date in the suggested citation to reflect when the latest version (or latest major version) was published. See: "Versioning: allow citation version display to include date value associated with the version release" #2298
    • Right now, the citation's publication date is the date when the dataset was first published in the Dataverse repository.

@RightInTwo
Contributor

RightInTwo commented Dec 16, 2019

I added a code example in #5402 that imports metadata from DataCite using a quick-and-dirty Python script that produces DDI-XML. When using this with &release=yes, I would like Dataverse to just use the existing fields (like <distrbtr>, <version>, <distDate> and the whole custom citation in <biblCit>) instead of populating them itself, which I think should only happen when Dataverse publishes data, not when it is imported as "released".
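To illustrate which values are meant, the Python sketch below pulls those four elements out of a simplified DDI fragment. The fragment is a made-up, namespace-free example (real DDI codebooks normally declare an XML namespace and carry much more structure), and `existing_citation_fields` is a hypothetical helper, not part of Dataverse's importer.

```python
import xml.etree.ElementTree as ET

# Made-up, simplified DDI fragment without namespaces, for illustration only.
DDI_SNIPPET = """
<codeBook>
  <stdyDscr>
    <citation>
      <distStmt>
        <distrbtr>Example Archive</distrbtr>
        <distDate>2017-03-01</distDate>
      </distStmt>
      <verStmt><version>2</version></verStmt>
      <biblCit>Doe, Jane (2017), "Example Study", Example Archive, v2</biblCit>
    </citation>
  </stdyDscr>
</codeBook>
"""


def existing_citation_fields(ddi_xml: str) -> dict:
    """Collect the citation-related values already present in the DDI record."""
    root = ET.fromstring(ddi_xml.strip())
    fields = {}
    for tag in ("distrbtr", "version", "distDate", "biblCit"):
        element = root.find(f".//{tag}")
        if element is not None and element.text and element.text.strip():
            fields[tag] = element.text.strip()
    return fields


fields = existing_citation_fields(DDI_SNIPPET)
```

Under the behaviour requested above, any key present in `fields` would be kept as-is on import, and only missing ones would be filled in by Dataverse.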

@cmbz

cmbz commented Aug 20, 2024

To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'.

If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.

@cmbz cmbz closed this as completed Aug 20, 2024