Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use flat file listing in zarr file browser #1394

Merged
merged 6 commits into from
Dec 19, 2022
Merged

Conversation

jjnesbitt
Copy link
Member

@jjnesbitt jjnesbitt commented Dec 7, 2022

This is a simplification, and will as a result fix any zarrs with out of sync file listings (any other zarrs affected by the underlying issue to #1378).

@jwodder This is a breaking change and will require a corresponding CLI update. If anything here seem infeasible / unreasonable, please let me know.

Notable breaking changes:

  • The url is now zarr/{zarr_id}/files
  • To filter by prefix, use the prefix query param
  • To limit the number of files returned, use the limit query param (default 1000)
  • To perform pagination, provide the after query param, setting it to the key of the last item in the previous listing
  • To be redirected to file download (requires providing file name in prefix), set the download flag to true
  • The asset download endpoint no longer redirects to the zarr file explorer view if the specified asset contains a zarr. Instead, it returns a 400, instructing the user to use the zarr file listing endpoint directly. I prefer this than to blindly redirecting to zarr file listing endpoint, as the "download" operation on an entire zarr isn't something we support.

Head requests are still supported, and will use the provided prefix query param to return a presigned head URL. If the prefix is not the entire path of a file, the HEAD request against the presigned url will return a 404.

@jjnesbitt jjnesbitt marked this pull request as draft December 7, 2022 22:05
@jjnesbitt jjnesbitt marked this pull request as ready for review December 7, 2022 22:54
@jwodder
Copy link
Member

jwodder commented Dec 12, 2022

@AlmightyYakob What is the format of the zarr/{zarr_id}/files response?

To perform pagination, provide the after query param, setting it to the key of the last item in the previous listing

Is there a reason that pagination doesn't work like other endpoints, with "results" and "next" keys in the response?

@danlamanna danlamanna self-requested a review December 12, 2022 16:18
@jjnesbitt
Copy link
Member Author

@AlmightyYakob What is the format of the zarr/{zarr_id}/files response?

My bad. The format is as such:

[
  {
    "Key": "foo/bar",
    "LastModified": "2022-12-06T21:46:34.331000Z",
    "Etag": "f0c0b28da13bec676d3a348d6f11dabe",
    "Size": 100
  }
]

Is there a reason that pagination doesn't work like other endpoints, with "results" and "next" keys in the response?

Yes, S3 doesn't provide any mechanism for limit/offset pagination when listing objects, only providing a StartAfter parameter, to indicate the last element of the previous "page". There's also no way to get the total count of objects in the bucket (or under a certain prefix, as we would need), so it would be impossible to easily provide the count field, which is the standard for our other pagination endpoints.

@danlamanna
Copy link
Contributor

I think we should still provide a next link where we can, this way it's easier for consumers and (hopefully) prevents them from hand-constructing URLs.

@jjnesbitt
Copy link
Member Author

@jwodder The endpoint response has been updated to the following

{
  "next": "http://localhost:8000/api/zarr/357a0699-4f08-4a3b-83a7-923353379e28/files/?after=subdir%2Fa.txt&download=false&limit=1&prefix=subdir",
  "results": [
    {
      "Key": "subdir/a.txt",
      "LastModified": "2022-12-07 17:46:51.962000+00:00",
      "ETag": "c560516266eb28040362e72ec278b4a2",
      "Size": 8
    }
  ]
}

@jwodder
Copy link
Member

jwodder commented Dec 15, 2022

@AlmightyYakob Does the DELETE /zarr/:id/files/ endpoint for deleting Zarr entries still work the same way?

@jjnesbitt
Copy link
Member Author

@AlmightyYakob Does the DELETE /zarr/:id/files/ endpoint for deleting Zarr entries still work the same way?

Yes, the zarr file deletion endpoint is unchanged.

Copy link
Member

@mvandenburgh mvandenburgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from the perspective of the API.

@jwodder
Copy link
Member

jwodder commented Dec 19, 2022

@AlmightyYakob There was one failure in the test against dandi/dandi-cli#1175, regarding the test of get_content_url(). Has the contents of the contentUrl metadata field changed for Zarr assets?

EDIT: It seems that the test in question no longer applies. However, there was another test failure I missed, in which this test fails on the second upload with a 409 and the body (prettified):

<Error>
  <Code>XMinioObjectExistsAsDirectory</Code>
  <Message>Object name already exists as a directory.</Message>
  <Key>zarr/1ef801d4-44ad-4b7c-bd72-ad8bf2a9cee6/changed-type</Key>
  <BucketName>dandi-dandisets</BucketName>
  <Resource>/dandi-dandisets/zarr/1ef801d4-44ad-4b7c-bd72-ad8bf2a9cee6/changed-type</Resource>
  <RequestId>17324C0AA68E7B7F</RequestId>
  <HostId>eb4e137a-89c9-4a14-848d-2c79cfaab7c5</HostId>
</Error>

@jwodder
Copy link
Member

jwodder commented Dec 19, 2022

@AlmightyYakob The integration tests with my PR are passing now.

@jjnesbitt
Copy link
Member Author

@AlmightyYakob The integration tests with my PR are passing now.

@jwodder So there's no changes needed on this PR, in reference to your previous comment?

@jwodder
Copy link
Member

jwodder commented Dec 19, 2022

@AlmightyYakob Right.

@jjnesbitt jjnesbitt added patch Increment the patch version when merged release Create a release when this pr is merged labels Dec 19, 2022
@jjnesbitt jjnesbitt merged commit 29da166 into master Dec 19, 2022
@jjnesbitt jjnesbitt deleted the zarr-file-explorer branch December 19, 2022 21:50
@dandibot
Copy link
Member

🚀 PR was released in v0.3.8 🚀

@dandibot dandibot added the released This issue/pull request has been released. label Dec 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
patch Increment the patch version when merged release Create a release when this pr is merged released This issue/pull request has been released.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants