Add support for ingesting data from Dataverse. Fixes #179 #175
Codecov Report
@@ Coverage Diff @@
## master #175 +/- ##
=========================================
+ Coverage 83.84% 84.8% +0.96%
=========================================
Files 31 32 +1
Lines 1826 2001 +175
=========================================
+ Hits 1531 1697 +166
- Misses 295 304 +9
075fb36 to ac582d6
Looks great!
- PluginSettings.DATAONE_URL: 'https://cn.dataone.org/cn/v2/node'
+ PluginSettings.DATAONE_URL: 'https://cn.dataone.org/cn/v2/node',
+ PluginSettings.DATAVERSE_URL:
+     'https://services.dataverse.harvard.edu/miniverse/map/installations-json'
This "miniverse" URL may change some day but we'll try to let you know if it does. If we forget and it changes under you, please email support@dataverse.org about it.
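A minimal sketch of how a setting like DATAVERSE_URL might be consumed: pull the set of known installation hostnames out of the JSON document it points at. The payload shape (a top-level "installations" list whose entries carry a "hostname" key) and the function name installation_hostnames are assumptions for illustration; neither is confirmed by this PR.

```python
import json


def installation_hostnames(payload):
    """Return the set of hostnames listed in an installations document.

    Assumed (hypothetical) shape:
        {"installations": [{"hostname": "dataverse.harvard.edu", ...}, ...]}
    Entries without a hostname are skipped.
    """
    # Accept either raw JSON text or an already-decoded dict.
    data = json.loads(payload) if isinstance(payload, str) else payload
    return {
        entry["hostname"].lower()
        for entry in data.get("installations", [])
        if entry.get("hostname")
    }


# Hand-written sample; a real caller would fetch the document from DATAVERSE_URL.
sample = (
    '{"installations": ['
    '{"name": "Harvard", "hostname": "dataverse.harvard.edu"},'
    '{"name": "No host"}]}'
)
hosts = installation_hostnames(sample)
```

Keeping the parsing separate from the HTTP fetch means a change to the "miniverse" URL only touches the setting, not this logic.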
ac582d6 to a014752
a014752 to d01f09a
c90c30e to 7d05ee3
Overall, looks great and works as expected. One minor comment about supported API formats, which it seems unlikely we'll ever hit. I'll mention again here that I think there is an opportunity to split some of this out into a separate library that could be used outside of WT.
server/lib/dataverse/provider.py
Outdated
'mimeType': item['file_content_type'],
'filesize': item['size_in_bytes'],
'id': item['file_id'],
'doi': 'fixme'
Is the reason for the doi fixme noted somewhere? I gather it isn't returned by Dataverse, but required by our framework?
Thanks for catching that. I wasn't actually setting the identifier attribute, which is why it went unnoticed. For now I'm going to use the parent DOI, but I left a note pointing to IQSS/dataverse#5339 (comment).
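The interim fix described here could look roughly like the sketch below: each file record takes the DOI of its parent dataset in place of the 'fixme' placeholder. The function name file_record and the parent_doi argument are hypothetical; only the item keys and output keys come from the diff under review.

```python
def file_record(item, parent_doi):
    """Build a file record from one Dataverse search item.

    The file has no identifier of its own here, so the parent
    dataset's DOI is used as a stand-in (see IQSS/dataverse#5339).
    """
    return {
        'mimeType': item['file_content_type'],
        'filesize': item['size_in_bytes'],
        'id': item['file_id'],
        'doi': parent_doi,  # interim: parent DOI instead of 'fixme'
    }


# Hand-written example item using the keys from the diff above.
record = file_record(
    {'file_content_type': 'text/plain', 'size_in_bytes': 12, 'file_id': '42'},
    'doi:10.7910/DVN/PB8X8P',
)
```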
Handles: {siteURL}/api/access/datafile/{fileId}
"""
fileId = os.path.basename(url.path)
Since you are trying to be comprehensive about what users can specify for lookup, Dataverse appears to support two additional methods of accessing data via the API. From http://guides.dataverse.org/en/latest/api/dataaccess.html#basic-file-access:
- Files can be accessed using persistent identifiers, e.g. http://dataverse.harvard.edu/api/access/datafile/:persistentId/?persistentId=doi:10.7910/DVN/PB8X8P/SDYHZE
- And also multi-file bundles (e.g., /api/access/datafiles/$id1,$id2,...$idN)
The former is now handled. The latter returns a zipfile, which I'd say is out of scope for this PR
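For reference, a rough sketch of how the two supported URL forms could be told apart, with multi-file bundles rejected as out of scope. This is not the code from the PR; parse_datafile_url and its return convention are hypothetical, and only the URL patterns come from the discussion above.

```python
from urllib.parse import urlparse, parse_qs


def parse_datafile_url(url):
    """Classify a Dataverse data access URL.

    Handles:
      {siteURL}/api/access/datafile/{fileId}
      {siteURL}/api/access/datafile/:persistentId/?persistentId={doi}
    Returns ("fileId", value) or ("persistentId", value).
    Anything else, including /api/access/datafiles/ bundles, raises ValueError.
    """
    parsed = urlparse(url)
    # Drop leading/trailing slashes so the last segment is never empty.
    segments = parsed.path.strip("/").split("/")
    if len(segments) < 2 or segments[-2] != "datafile":
        raise ValueError("not a single-file access URL: %s" % url)
    last = segments[-1]
    if last == ":persistentId":
        values = parse_qs(parsed.query).get("persistentId")
        if not values:
            raise ValueError("missing persistentId query parameter")
        return ("persistentId", values[0])
    return ("fileId", last)
```

Checking the segment before the identifier, rather than only the basename, keeps a bundle URL from being mistaken for a file ID that happens to contain commas.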
This PR introduces initial support for registering data from Dataverse using the DOI resolver, e.g.:
The set of Dataverse installations can be changed via the DATAVERSE_URL config variable.
TODO:
DATAVERSE_URL