Import from Dataverse #536
Comments
The first release of pyDataverse is now online, the next will come in 2-3 weeks with classes for the metadata of dataverses, datasets and datafiles. https://github.com/AUSSDA/pyDataverse/releases/tag/v0.1.0 |
Great, thanks for the update @skasberger! |
Here's how a Renku button (or should it be RENKU?) could look in Dataverse: It would be added with a curl command something like this:
|
Heads up that there's a new pull request at jupyterhub/repo2docker#739 by @Xarthisius (thanks!!) for downloading files from Dataverse into repo2docker, a Python library used by both Binder and Whole Tale for spinning up Docker containers running Jupyter Notebooks and other compute environments. In other news, I've been having great success with pyDataverse, the new Python client library mentioned by @skasberger, who is also its author. He gave a great talk about it the other week at the 5th annual Dataverse Community Meeting: https://osf.io/ur2q7/ Finally, at the same meeting I demo'ed launching a Jupyter Notebook from Dataverse using Whole Tale. Next year (if not sooner!) I'd love to demo a similar trick with Renku! Here are screenshots of the demo and a full transcription: https://scholar.harvard.edu/pdurbin/blog/2019/jupyter-notebooks-and-crazy-ideas-for-dataverse |
Over at IQSS/dataverse#6059 I recently created a new pull request to support external tools at the dataset level for Dataverse. (In the screenshot above, I showed how external tools at the file level are already supported.) My question for the Renku team is this: What is an ideal URL on an installation of Renku that Dataverse users should be sent to when they click "Explore" and then "Renku"? Given the pull request right now, a URL like the following can be constructed on the Dataverse side: https://renkulab.io?datasetPid=doi:10.7910/DVN/RLLL1V The external tool manifest would look like this:
Would that URL work for you? More query parameters are also supported. It could be longer and more specific, like this: https://renkulab.io?datasetPid=hdl:10864/10798&siteUrl=https://dataverse.scholarsportal.info Thoughts are welcome here or on the pull request above or its corresponding issue about supporting dataset level external tools: IQSS/dataverse#5028 |
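For anyone who wants to experiment with these URLs, here is a minimal sketch of both sides of the hand-off, using only the Python standard library. The parameter names `datasetPid` and `siteUrl` come from the examples above; the helper names `build_tool_url` and `read_tool_params` are hypothetical and are not part of Dataverse or Renku. Colons and slashes are left unescaped only to match the URLs shown in this thread; whether Dataverse itself escapes them is not something this sketch assumes.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def build_tool_url(tool_url, dataset_pid, site_url=None):
    """Build the URL a user would be sent to (hypothetical helper)."""
    params = {"datasetPid": dataset_pid}
    if site_url is not None:
        params["siteUrl"] = site_url
    # Keep ':' and '/' readable so the result matches the URLs in this thread.
    query = urlencode(params, safe=":/", quote_via=quote)
    return f"{tool_url}?{query}"


def read_tool_params(url):
    """Recover the query parameters on the receiving side (hypothetical helper)."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; each key appears once here.
    return {key: values[0] for key, values in query.items()}


if __name__ == "__main__":
    url = build_tool_url(
        "https://renkulab.io",
        "hdl:10864/10798",
        site_url="https://dataverse.scholarsportal.info",
    )
    print(url)
    print(read_tool_params(url))
```

The receiving side only needs `datasetPid` to identify the dataset; `siteUrl` tells it which Dataverse installation to query when the PID alone is not enough.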
hey @pdurbin thanks for the heads up on this - there are two possible scenarios I can imagine:
For the first case, a URL like https://renkulab.io/datasets?datasetPid=doi:10.7910/DVN/RLLL1V could potentially work - however, for the second case the user would have to be prompted/redirected to provide the project they want to add the data to. We don't currently have the functionality to add data to a project via an API, but it is presently being worked on. cc @vfried @lorenzo-cavazzi @cchoirat @ciyer @jsam @jachro who may have other opinions... |
closed by #626 |
@rokroskar this is fantastic news! 🎉 Unfortunately, I'm struggling a bit with importing my dataset. I just left a screenshot of the commands I tried over at SwissDataScienceCenter/renku#593 (comment) |
@pdurbin: the dataverse import is not a part of a release yet. To install it (in your interactive environment running on renkulab), you can run:
|
@pdurbin also note that we are still cleaning the import features up a bit (right now you get 1 commit per imported file... not optimal) - but they will be fixed soon. |
@rokroskar thanks. I tried that
Here's a screenshot for context: |
Sorry, made a typo - should be git+https |
@rokroskar it works! Thanks! I posted a screenshot to SwissDataScienceCenter/renku#593 (comment) I also let the Dataverse community know about this exciting new integration: https://groups.google.com/d/msg/dataverse-community/2H21moBIRgU/PUuai7UNBgAJ 🎉 As you suggested, follow-up issues specific to Dataverse would probably be best. I'm excited that this initial integration "just works". Thanks! |
Dataverse is open source research data repository software with 43 installations around the world. I'm one of the developers and we'd be very happy for Renku to have "import from Dataverse" functionality. If you have any questions about Dataverse APIs, we're happy to answer them.
Once import has been implemented, we'd be happy to have Renku listed under "Analysis and Computation" at http://guides.dataverse.org/en/4.14/admin/integrations.html#analysis-and-computation . If you want to create an issue at https://github.com/IQSS/dataverse/issues to update the integrations.rst file in the Dataverse repo, please go ahead. We could use that issue to answer any questions you may have.
I seem to remember that Renku is written in Python so you might want to try using a new Python library for Dataverse at https://github.com/AUSSDA/pyDataverse by @skasberger that is so new that we haven't yet listed it at http://guides.dataverse.org/en/4.14/api/client-libraries.html
Alternatively, you can just write your own implementation. You might find inspiration from whole-tale/girder_wholetale#175 by @Xarthisius, who implemented "import" from Dataverse for @whole-tale, which is also written in Python. For a more human-readable discussion of how to download all the files in a Dataverse dataset using the DOI of the dataset, please see whole-tale/girder_wholetale#179 . I suggested testing with https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/TJCLKP since it's a dataset of mine but you are of course welcome to pick any dataset from any installation of Dataverse, including the demo site at https://demo.dataverse.org
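As a rough illustration of the "download all files by DOI" idea, here is a minimal Python sketch that only builds the URLs and reads a dataset JSON document; it makes no network calls. It assumes the native API's `/api/datasets/:persistentId/` lookup and the `/api/access/datafile/{id}` download endpoint, and it assumes the response lists files under `data.latestVersion.files[].dataFile`. Please check those against the API guide for your Dataverse version. The function names and the sample payload (including the file ids) are made up for the example.

```python
from urllib.parse import urlencode


def dataset_metadata_url(base_url, pid):
    """URL that returns dataset metadata for a persistent ID such as a DOI."""
    query = urlencode({"persistentId": pid})
    return f"{base_url.rstrip('/')}/api/datasets/:persistentId/?{query}"


def file_download_urls(base_url, dataset_json):
    """Map each filename to its download URL.

    Assumes the layout data -> latestVersion -> files -> dataFile,
    which should be verified against a real response.
    """
    base = base_url.rstrip("/")
    urls = {}
    for entry in dataset_json["data"]["latestVersion"]["files"]:
        data_file = entry["dataFile"]
        urls[data_file["filename"]] = f"{base}/api/access/datafile/{data_file['id']}"
    return urls


if __name__ == "__main__":
    base = "https://dataverse.harvard.edu"
    print(dataset_metadata_url(base, "doi:10.7910/DVN/TJCLKP"))

    # Invented stand-in for the JSON the metadata URL would return.
    sample = {
        "data": {
            "latestVersion": {
                "files": [
                    {"dataFile": {"id": 101, "filename": "data.csv"}},
                    {"dataFile": {"id": 102, "filename": "README.txt"}},
                ]
            }
        }
    }
    for name, url in file_download_urls(base, sample).items():
        print(name, url)
```

In a real import, the metadata URL would be fetched first and each download URL then streamed to disk; restricted or unpublished files would additionally need an API token.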
I'm looking at my notes from when @rokroskar and @ciyer visited @IQSS and I'm reminded that Renku has the ability to create PROV-JSON files. Perhaps a future integration would be to push these files into Dataverse using the Dataverse "prov-json" API endpoint: http://guides.dataverse.org/en/4.14/api/native-api.html#provenance
Of course we would be thrilled if you choose a dataset hosted on an installation of Dataverse when you work on SwissDataScienceCenter/renku/issues/543 😄