Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

object_store: range request with suffix #4611

Closed
clbarnes opened this issue Aug 1, 2023 · 19 comments · Fixed by #5222
Closed

object_store: range request with suffix #4611

clbarnes opened this issue Aug 1, 2023 · 19 comments · Fixed by #5222
Labels
enhancement Any new improvement worthy of a entry in the changelog

Comments

@clbarnes
Copy link
Contributor

clbarnes commented Aug 1, 2023

We need to read the last n bytes of a file whose size we do not know. This is supported by HTTP range requests, and access to local files (via Seek), can object_store support it?

object_store::GetOptions::range could take, instead of a core::ops::Range, something like this: https://github.com/clbarnes/byteranges-rs/blob/0a953e7c580e96b65fe28e61ed460d6e221dcd8d/src/request.rs#L6-L51 . So long as From<RangeBounds> is supported this may not have to break any existing code.

The alternative is a HEAD request to find the length and then a range request using the offset from 0, which is twice the round trips.

@clbarnes clbarnes added the enhancement Any new improvement worthy of a entry in the changelog label Aug 1, 2023
@tustvold
Copy link
Contributor

tustvold commented Aug 2, 2023

I seem to remember GCS being the only cloud store that supports this, but I could be mistaken. Have you done any research into this?

@clbarnes
Copy link
Contributor Author

clbarnes commented Aug 2, 2023

The S3 docs say they support the Range header, and specifically states that they don't support multipart/byteranges responses, but doesn't specifically say they don't support suffixes https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html#API_GetObject_RequestSyntax

@tustvold
Copy link
Contributor

tustvold commented Aug 2, 2023

Have you tested this, I seem to remember it did not but my memory is hazy

@normanrz
Copy link

normanrz commented Aug 2, 2023

Suffix range requests are supported by all major cloud providers. Here is an example against S3 (proxied via Cloudfront):

curl -H 'Range: bytes=-524292' https://static.webknossos.org/data/zarr_v3/l4_sample/color/1/c/0/3/3/1 | wc -c

@tustvold
Copy link
Contributor

tustvold commented Aug 2, 2023

I would be happy to review a PR making this change. The use of From<RangeBounds> is clever and I like it, although I suspect many use-cases have hard-coded Range<usize> which will then require modification to support this.

Suffix range requests are supported by all major cloud providers

Good to know, I'm not sure where I got the impression otherwise. Ultimately it isn't a massive problem if they don't support them, we just report this as an error to the user.

@clbarnes
Copy link
Contributor Author

clbarnes commented Aug 2, 2023

we just report this as an error to the user.

Sounds like users might consider that a problem 😁

@tustvold
Copy link
Contributor

tustvold commented Aug 2, 2023

Well yes, but if the store itself doesn't support them there isn't really all that much we can do 😄

@tustvold
Copy link
Contributor

tustvold commented Aug 11, 2023

It would appear that Azure Blob Storage does not support suffix range headers, although it does support prefix range headers

https://learn.microsoft.com/en-us/rest/api/storageservices/specifying-the-range-header-for-blob-service-operations

@clbarnes
Copy link
Contributor Author

clbarnes commented Dec 6, 2023

I think I'm closing in on an implementation here. Allowing suffixes could mean that some bytes are fetched twice but I don't think that's too much of a problem, as it's on the user to have knowledge of the resource they're looking into. I've been using HttpRange as the type name (and in all the functions which use it) - would be nice to have something more terse but it's the best description I think, even for stores which aren't HTTP-backed.

I think that replacing references to Range<usize> in the API with T: Into<HttpRange> + Copy should make for a seamless upgrade path, but the range field on the GetOptions struct would need to be deprecated and a different field of type HttpRange (called http_range?), probably raising an error if both are defined.

What we do with suffixes on azure is another question - probably just a log_once::warn_once!() when a suffix is requested, and then a HEAD request to find the length and regular range request.

@tustvold
Copy link
Contributor

tustvold commented Dec 6, 2023

In the API with T: Into + Copy

This would still be a breaking API change, something I am rather hesitant about. It would also not be object safe, which is likely more problematic.

Did you see, it would appear at least one of the major cloud providers do not support this, and I would not be surprised if there are other implementations that do not - #4611 (comment)

@clbarnes
Copy link
Contributor Author

clbarnes commented Dec 6, 2023

This would still be a breaking API change, something I am rather hesitant about. It would also not be object safe, which is likely more problematic.

Understood. If we added get_http_range and get_http_ranges, deprecated get_range and get_ranges and then provided a default impl which used the new methods under the hood (and marked them as deprecated), then 3rd party implementations would need to update but only in an additive way, new implementations wouldn't need to implement the limited methods, and people could continue to use their existing code. In this case it would definitely be nice to use a name more terse than http_range, though!

The difficult bit would be getting 3rd party implementations to implement get in a way which respects the new GetOptions::http_range field. Marking range as deprecated would at least give visibility to developers but it could easily be ignored.

Did you see, it would appear at least one of the major cloud providers do not support this

Yes, tried to address that in my comment above. Sometimes people need the suffix, and on azure, the best those people can do is HEAD and GET range, but logging it so that users can change their access pattern if necessary is probably the right thing to do.

@clbarnes
Copy link
Contributor Author

clbarnes commented Dec 6, 2023

The difficult bit would be getting 3rd party implementations to implement get in a way which respects the new GetOptions::http_range field. Marking range as deprecated would at least give visibility to developers but it could easily be ignored.

Actually, just noticed we control this too - it's just a case of updating GetOptionsExt. We'd need to decide what to do if someone sent both a range and an http_range in their GetOptions; choosing the http_range and logging a warning is probably best, if not panicking as it's purely a user error that they should be getting a clear deprecation warning about.

@tustvold
Copy link
Contributor

tustvold commented Dec 6, 2023

Perhaps you could expand upon why you do not know the sizes of the files, I would have expected the files to have been identified by either a catalog or a listing call, both of which could provide this information?

I dunno, generally the approach of this crate is to encourage people towards patterns that behave equally well across all backends, as opposed to ones that will have store-specific performance pitfalls.

Perhaps I am just trying to avoid another breaking API change as they end up taking up an inordinate amount of my and others time, something I am somewhat struggling to justify here...

it's just a case of updating GetOptionsExt

GetOptionsExt is crate private

@clbarnes
Copy link
Contributor Author

clbarnes commented Dec 6, 2023

Perhaps you could expand upon why you do not know the sizes of the files

I mentioned our use case in another issue; copied below

As part of the zarr project, we plan to store large tensors on a variety of backends (local/ HTTP/ object store), which are chunked into many separate files/ objects. As part of the sharding specification, each chunk (=shard) could contain many sub-chunks which are independently encoded and then concatenated. We'd want to read a footer to find the byte addresses of sub-chunks (see #4611 ), and then read (possibly multiple) byte ranges from the shard.

We don't want to list all existing chunks ahead of time as there could easily be many millions, and this could even change under our feet if we're writing the tensor as we go. As chunks may be compressed with arbitrary codecs, we can't predict how many bytes they'll be even if we know how large the chunks are; we just need to read the footer (which indexes sub-chunks) so that we then know which bits of the object to read.

I suppose in this use case we never need to read the suffix at the same time as the rest of the chunk, so we could have a separate method for suffix-getting with a default implementation of using a HEAD then GET which is documented as possibly being slow.

I dunno, generally the approach of this crate is to encourage people towards patterns that behave equally well across all backends, as opposed to ones that will have store-specific performance pitfalls.

Patterns, yes, but I hope we've demonstrated that sometimes people actually need a suffix. All stores can do it (with 2 requests), some stores can just do it better (with 1) - should we refuse to use optimisations which are only available to certain stores? If people already know the length (from listing or whatever), then they don't need to use the method documented as being possibly slow.

GetOptionsExt is crate private

Ah, yes, that's unfortunate. So 3rd party stores currently just wrangle their own options?

I think the minimal-impact course is to keep everything as it is and just add something like

pub trait ObjectStore: std::fmt::Display + Send + Sync + Debug + 'static {
    ...

    /// Get the last `nbytes` of an object.
    /// 
    /// If the object size is not known, the default implementation first finds out with a HEAD request.
    /// Stores which support suffix requests directly should override this behaviour.
    async fn get_suffix(&self, location: &Path, nbytes: usize, object_size: Option<usize>) -> Result<GetResult> {
        // if size is None, find out with a head request
        // then do self.get_range
    }
}

Instantly works for everyone, the performance concerns are well-documented, there's an ergonomic path for people who need a suffix and already know the size, and an easy optimisation path for stores which do support it.

@tustvold
Copy link
Contributor

tustvold commented Dec 6, 2023

Thank you for the context.

The get_suffix idea makes sense to me, and seems like a pragmatic way to avoid an API break, and is consistent with methods like rename or list_with_offset. We can always revisit this at a later date if/when looking to make other breaking changes. I'm curious why to include the size though, if the size is known they could just use get_range, no?

Perhaps it could just be?

async fn get_suffix(&self, suffix: usize) -> Result<GetResult>

@clbarnes
Copy link
Contributor Author

clbarnes commented Dec 6, 2023

if the size is known they could just use get_range, no?

Yes, and that's all we'd do internally. I suppose it's a bit defensive, affording the same ergonomics/ aesthetics to users who do or don't know the size so that they can think about it in terms of the suffix length rather than needing saturating_subs all over the place. It also stands to remind users "hey, I do know the size actually", where they might otherwise be tempted to just ask for a suffix and use the possibly-slow path.

@tustvold
Copy link
Contributor

tustvold commented Dec 6, 2023

I can see your point, but I think I would prefer to keep it simple and provide docs to encourage people to use get_range if available.

@clbarnes
Copy link
Contributor Author

clbarnes commented Dec 6, 2023

Just for future reference: the fact that get_range takes a Range also means you can't do the bytes: 10- form (e.g. getting the remainder of an object), which might also be useful to people and I suspect is available in all stores. This could be implemented as a get_offset method, but then it's something which every store would practically have to implement, while not being able to use the existing GetOptions machinery. That would be another benefit of supporting the whole range spec.

The get_suffix will lead to a fair amount of duplicated code as we can't use GetOptions and therefore the get_opts method which all the clients are already set up for.

@tustvold
Copy link
Contributor

tustvold commented Dec 6, 2023

Agreed, I view this as a potentially short-term solution to avoid an API break at this time. I suspect when we next look to do a major release we might revisit this.

clbarnes added a commit to clbarnes/arrow-rs that referenced this issue Dec 12, 2023
With default implementation using a HEAD and then GET request.

See apache#4611.
clbarnes added a commit to clbarnes/arrow-rs that referenced this issue Dec 12, 2023
With default implementation using a HEAD and then GET request.

See apache#4611.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Any new improvement worthy of a entry in the changelog
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants