Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support new publishing workflow #1119

Closed
yarikoptic opened this issue Sep 29, 2022 · 30 comments · Fixed by #1125
Closed

Support new publishing workflow #1119

yarikoptic opened this issue Sep 29, 2022 · 30 comments · Fixed by #1125
Assignees
Labels

Comments

@yarikoptic
Copy link
Member

See dandi/dandi-archive#1006 (comment) for more detail from @mvandenburgh . From the description it seems that new logic is compatible with old one, ie the same code for new logic should work just fine with currently deployed solution, we would just naturally have no PUBLISHING state.

@jwodder
Copy link
Member

jwodder commented Sep 29, 2022

@yarikoptic So should the publish() method just poll the version info endpoint indefinitely until the publication is finished?

@mvandenburgh What happens if the publish task fails?

@yarikoptic
Copy link
Member Author

poll: I think so, but not indefinitely. I feel we have some other similar interface where we do similar polling already. @mvandenburgh - is there a reasonable duration to wait for for such new publishing process and error out if timing out?

@jwodder
Copy link
Member

jwodder commented Sep 29, 2022

@yarikoptic Are you thinking of wait_until_valid()?

@yarikoptic
Copy link
Member Author

@yarikoptic Are you thinking of wait_until_valid()?

yes!

@mvandenburgh
Copy link
Member

poll: I think so, but not indefinitely. I feel we have some other similar interface where we do similar polling already. @mvandenburgh - is there a reasonable duration to wait for for such new publishing process and error out if timing out?

The default timeout on wait_until_valid() (120 seconds) should be more than enough. I can't imagine it would take anywhere near that in most cases, although the overhead of placing it onto the queue and celery picking it up might make it vary by a few seconds if we're under heavy load.

@yarikoptic So should the publish() method just poll the version info endpoint indefinitely until the publication is finished?

@mvandenburgh What happens if the publish task fails?

The publish endpoint will check if a Version is eligible to be published prior to dispatching the task, and will let the requester know with a 400 error. If the publish task fails while running due to some strange conditions, like the worker dying, it will be retried again until it succeeds.

@jwodder
Copy link
Member

jwodder commented Sep 30, 2022

@mvandenburgh Another question: Previously, POST /dandisets/:id/versions/:id/publish/ would return the new version object in the response. What does it return now? Is querying /dandisets/:id/versions/ the only way now to get the newly-published version?

Also, what are all the values for the "status" field of version objects?

@mvandenburgh
Copy link
Member

mvandenburgh commented Sep 30, 2022

@mvandenburgh Another question: Previously, POST /dandisets/:id/versions/:id/publish/ would return the new version object in the response. What does it return now? Is querying /dandisets/:id/versions/ the only way now to get the newly-published version?

Correct. Now, the response will be empty. Instead, the status code now indicates what the status of the publishing is; in particular, a status of 423 after POSTing to the /publish endpoint has been introduced, and it indicates that a publish is already occurring. I can elaborate more on the 400 status codes that this endpoint can return and what each means if that is needed.

Also, what are all the values for the "status" field of version objects?

'Pending'
'Validating'
'Valid'
'Invalid'
'Publishing'
'Published'

(Sorry for the confusion about upper/lower-case; my earlier comments were incorrect, each status will be lowercase with the exception of the first character, i.e. exactly what they are above)

@jwodder
Copy link
Member

jwodder commented Oct 3, 2022

@mvandenburgh If the CLI were to use this new workflow against a pre-dandi/dandi-archive#1006 version of the Archive, is the workflow intended to be backwards-compatible, or is some specific error expected? Currently, in testing the new workflow in the CLI, it appears that the /info/ endpoint has status "Published" immediately yet the first Published version listed by the /versions/ endpoint is the draft version.

@mvandenburgh
Copy link
Member

mvandenburgh commented Oct 3, 2022

The /versions/ endpoint in the pre-dandi/dandi-archive#1006 version of the Archive, i.e. what is running in staging and production right now, returns versions ordered by their creation time in ascending order; so, the draft version will always be first, and the most recently published version will always be last. dandi-archive#1006 changes that to be the reverse by default.

However, it's possible (both in the current API and the new one proposed in dandi-archive#1006) to override that with the ordering query parameter. So you can achieve the same behavior as the proposed API in the current one by setting ?ordering=-created. As I'm writing this, I'm realizing that we can avoid modifying the behavior of the /versions/ endpoint altogether and instead require the client to specify the order they want. Then, newer versions of the CLI and web app can just specify ordering=-created to retrieve the most recently published version. I'm going to go ahead and make that change in the API PR.

@jwodder
Copy link
Member

jwodder commented Oct 3, 2022

@mvandenburgh So the /versions/ endpoint is ordered by ?ordering, yet the /assets/ endpoint is ordered by ?order? Why the inconsistency?

@mvandenburgh
Copy link
Member

I agree that they should be consistent. Does the CLI use either of those query parameters with those endpoints?

@jwodder
Copy link
Member

jwodder commented Oct 3, 2022

@mvandenburgh The CLI currently only uses the /assets/?order parameter, and it's controlled by a method argument named "order".

@yarikoptic
Copy link
Member Author

@jwodder @mvandenburgh could you please update on the status -- are we waiting for dandi-archive to harmonize ordering -> order or there is nothing to wait for?

@jwodder
Copy link
Member

jwodder commented Oct 4, 2022

@yarikoptic I have a PR in progress, but the tests are failing because of the order in which versions are currently returned, and I'd like to know whether to use order or ordering to adjust the order.

@yarikoptic
Copy link
Member Author

Thank you @jwodder! @mvandenburgh is change coming to dandi-archive to harmonize that option to order or it will remain ordering?

@mvandenburgh
Copy link
Member

@jwodder @yarikoptic I just merged a change to dandi-archive changing the /versions/ endpoint to accept order instead of ordering.

@jwodder
Copy link
Member

jwodder commented Oct 5, 2022

@mvandenburgh For documentation purposes, what fields can /versions/ be ordered by besides created?

@mvandenburgh
Copy link
Member

Currently, created is the only field that /versions/ can be ordered by.

@jwodder
Copy link
Member

jwodder commented Oct 5, 2022

@mvandenburgh I'm passing ?order=-created to the /versions/ endpoint, yet I'm still getting the draft version back as the first (in response order) "Published" version.

@jwodder
Copy link
Member

jwodder commented Oct 5, 2022

@mvandenburgh Also, question: It's my understanding that, currently, any previously-published version can be "published again" by POSTing to /dandisets/:id/versions/:id/publish/. Is that still the case? If yes, then shouldn't the proper workflow involve querying the /info/ endpoint for the re-published version instead of always using draft? Or is draft now the only version that can be published?

@mvandenburgh
Copy link
Member

@mvandenburgh Also, question: It's my understanding that, currently, any previously-published version can be "published again" by POSTing to /dandisets/:id/versions/:id/publish/. Is that still the case? If yes, then shouldn't the proper workflow involve querying the /info/ endpoint for the re-published version instead of always using draft? Or is draft now the only version that can be published?

draft is, and to my knowledge has always been, the only version that can be published. POSTing to /dandisets/:id/versions/:id/publish/ for a non-draft version will always result in a 400 error.

@mvandenburgh I'm passing ?order=-created to the /versions/ endpoint, yet I'm still getting the draft version back as the first (in response order) "Published" version.

Are you trying that against the production/staging instance, or a local instance of dandi/dandi-archive#1006? The changes haven't been merged in yet due to the failing CLI integration tests; I can issue a staging release if you'd like.

@jwodder
Copy link
Member

jwodder commented Oct 5, 2022

@mvandenburgh I'm trying against a Dockerized instance of (what should be) the latest master commit of dandi-archive. I was under the impression that the ?order change had been merged separately and that the change in the CLI's workflow for 1006 should be backwards-compatible with the pre-1006 Archive.

@mvandenburgh
Copy link
Member

Ah, I think I understand - are you saying there are other versions returned before it, but the draft version is the first with a status of Publishing? i.e. the ones before it have different values for status?

@jwodder
Copy link
Member

jwodder commented Oct 5, 2022

@yarikoptic Yes. When testing publishing on a fresh Dandiset, there is one published version before "draft", and it has a status of "Valid".

@mvandenburgh
Copy link
Member

mvandenburgh commented Oct 5, 2022

Ok great, that is the intended behavior. Any non-draft version is implicitly a "published" version; the Published status on the draft dandiset is simply saying that that draft version has been published; in other words, a published version of that draft version exists. So, I think the workflow for the CLI would be something like

  1. Hit /api/dandisets/<id>/versions/?order=created (note there's no - in front of created); the first version returned here is guaranteed to be the draft version. Poll that endpoint until the draft version's status is Published.
  2. Once the draft version's status is Published, hit /api/dandisets/<id>/versions/?order=-created (note the - to indicate descending order). The first version returned here is guaranteed to be the most recently created version; and since the draft version's status is Published, that first version is also guaranteed to be the most recently published.

@jwodder
Copy link
Member

jwodder commented Oct 5, 2022

@mvandenburgh

  • For step 1, why not just poll /api/dandisets/<id>/versions/draft/info/ as you stated here?
  • For step 2, should any filtering be done (before or after 1006 is merged) on the versions' status fields?

@mvandenburgh
Copy link
Member

mvandenburgh commented Oct 5, 2022

For step 1, why not just poll /api/dandisets//versions/draft/info/ as you stated dandi/dandi-archive#1006 (comment)?

That's also possible, and I agree that's a cleaner way to go about it. I had the /versions/ endpoint on my mind so I thought of that when writing that, but technically either way should work.

For step 2, should any filtering be done (before or after 1006 is merged) on the versions' status fields?

What sort of filtering are you referring to?

@jwodder
Copy link
Member

jwodder commented Oct 5, 2022

@mvandenburgh

What sort of filtering are you referring to?

Well, your original description of the workflow says "the first version with a status of "PUBLISHED" in that response is guaranteed to be the most recently published version", implying that versions with different statuses should be discarded. Is that still the case, either before or after 1006 is merged?

@mvandenburgh
Copy link
Member

@mvandenburgh

What sort of filtering are you referring to?

Well, your original description of the workflow says "the first version with a status of "PUBLISHED" in that response is guaranteed to be the most recently published version", implying that versions with different statuses should be discarded. Is that still the case, either before or after 1006 is merged?

Apologies, that was a mistake, I've update the original comment. Versions with different statuses should not be discarded; if the draft version has a status of Published, then the first version returned by /versions/?order=-created is guaranteed to be the most recently published version, regardless of what its status is.

yarikoptic added a commit that referenced this issue Oct 6, 2022
Update client-side publication workflow
@github-actions
Copy link

github-actions bot commented Oct 6, 2022

🚀 Issue was released in 0.46.4 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants