Dataverse -> DCM APIs #3725
Comments
See #3352 for links to code that's already been written. I'm still partial to https://github.com/pdurbin/dataverse/tree/3145-dcm which has
failure modes / open questions:
@sekmiller and I are talking about this. I'm sure the screenshots in "2017-05-02 Review rsync prototype (Bill's UI changes)" at https://docs.google.com/a/harvard.edu/document/d/1Mi7I2w2FVYbN1Qb9oWLik3UsU2oPY3Wn9Z6_5JXrEAY/edit?usp=sharing will be helpful. They're oriented toward building a UI some day, which is not in scope for this issue, but they will help give some background. I believe this meeting with @pameyer was even recorded.
Here's some whiteboarding I just did for @sekmiller. I tried to focus on what the end user will see, even without the fancy UI work that will come in a future issue (if anyone knows the issue number, please advise):
On the back end, here's roughly what we want to happen:
That's the happy path. If the DCM tells Dataverse that the checksum failed, Dataverse sends a notification to the user via the normal Dataverse notification system. Here's the (fugly) whiteboarding of the above:
@raprasad @sekmiller to get ready for development, please install and run the Data Capture Module (DCM) on your laptops. Instructions are at https://github.com/sbgrid/data-capture-module and @pameyer can help field questions since he wrote the code! 😄
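Separately from the whiteboard, the failure branch described above could be sketched roughly like this. It is only an illustration, not the Dataverse implementation: `Notifier`, `handle_checksum_report`, and the status strings are hypothetical names assumed for the example, and the real DCM message format may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Notifier:
    """Hypothetical stand-in for the normal Dataverse notification system."""
    sent: List[str] = field(default_factory=list)

    def notify(self, user: str, message: str) -> None:
        # A real system would persist or email this; here we just record it.
        self.sent.append(f"{user}: {message}")


def handle_checksum_report(dataset_id: str, user: str, status: str,
                           notifier: Notifier) -> bool:
    """Return True on the happy path; notify the user and return False otherwise.

    The status value "validation passed" is an assumption for this sketch.
    """
    if status == "validation passed":
        return True
    notifier.notify(user, f"Checksum validation failed for dataset {dataset_id}.")
    return False
```

Passing the notifier in as an argument keeps the sketch testable without a running Dataverse or DCM.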
It just hit my radar that there are many additional resources that are sure to be helpful to developers that are being gathered by the design team in a folder called "rsync" https://drive.google.com/open?id=0B3A1TxMQgvUVa2ltQjc4cmliTTg including:
There's also a wealth of information at https://trello.com/c/Nbte37k1/9-rsync-file-upload-download-4-8
I took code I wrote a year ago at https://github.com/pdurbin/dataverse/tree/3145-dcm and pushed it into a new branch after getting it up to date with the "develop" branch: https://github.com/IQSS/dataverse/tree/3725-dcm-apis. I'm realizing that some of my unanswered questions will be resolved once #3724 has passed through Code Review. I'll keep an eye on that issue, as well as its pull request at #3830, to make sure I'm being consistent with whatever decisions are made there, especially with regard to the rules for when Dataverse should ask the DCM for an rsync script. @pameyer made the pull request and should be able to fill me in.
Removed assignment after ticket scope changed.
@raprasad thanks. Yes, complete scope change as of yesterday afternoon's sprint planning meeting. In the morning I was sketching diagrams like this... ... which give a more complete picture of what's called "Large Data Upload Integration" at http://dataverse.org/goals-roadmap-and-releases and hint at the work @michbarsinai is doing in #3561, but this current issue has been clarified to be much smaller. Above I had stated that the definition of done is seeing a data file with a MIME Type of The clarified definition of done is this: Assuming the
I'm still working away on my Here are some requirements for testing we talked about:
I just made pull request #3851 and put this issue in the Code Review column at https://waffle.io/IQSS/dataverse
@pameyer it's a good point and as luck would have it my installation of DCM is hosed right now so it was easy to test the case of when it doesn't return an rsync script to Dataverse. 😄 I just pushed 4126ad1 which gives the API user a better clue as to what went wrong:
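As an illustration of the idea only (this is not the code in 4126ad1), the "DCM doesn't return an rsync script" case could be handled along these lines. `get_rsync_script`, the `fetch` callable, and the `status`/`message` keys are all hypothetical names assumed for this sketch.

```python
from typing import Callable, Dict, Optional


def get_rsync_script(dataset_identifier: str,
                     fetch: Callable[[str], Optional[str]]) -> Dict[str, str]:
    """Ask the DCM (through the injected `fetch`) for an rsync script.

    Returns a small dict so the API user can see what went wrong
    instead of getting an empty or confusing response.
    """
    try:
        script = fetch(dataset_identifier)
    except OSError as ex:
        # Covers connection errors such as a refused connection when the DCM is down.
        return {"status": "ERROR", "message": f"Could not reach the DCM: {ex}"}
    if not script:
        # The DCM answered but had no script for this dataset.
        return {"status": "ERROR",
                "message": f"The DCM returned no rsync script for {dataset_identifier}."}
    return {"status": "OK", "script": script}
```

Injecting `fetch` means the error paths can be exercised even when no DCM is installed, which matches the "my DCM is hosed" situation above.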
@pdurbin thanks
Just noticed the intersection with DV sending And to slightly clarify: the batch-import API (and DCM in general) was expecting the This is something that can be worked around, but a workaround increases global system complexity, so it's probably better not to need one.
Allow download of rsync scripts from Data Capture Module (DCM) #3725
@pameyer to me, an identifier is a DOI and an id is a database id.
@pdurbin - the database and native API both have something called "identifier"; what should I be calling it?
Distinct from #3353 (roughly DCM -> DV APIs, in this context). This is a dependency for DCM UI implementation, but doesn't include UI.
Create dataset function/command needs to:
Some configuration (DCM URL, etc.) will be needed.