fix: avoid some allocations in DeltaStorageHandler #1115
From what I could tell, this is what happened. If you have a store with a single object, `path = Path::from("file")`, and call
`store.list(Some(path))`, I think the remote stores will return a list containing that single object, whereas
Both should return an empty list, so that may be a bug. I'll take a look. Edit: filed a bug report upstream, thank you for reporting - apache/arrow-rs#3712
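To make the two listing behaviours discussed above concrete, here is a minimal, std-only sketch. It does not use the `object_store` crate; `FlatStore`, `list_raw_prefix`, and `list_children` are hypothetical names invented for illustration, and the sketch makes no claim about which real store implements which behaviour.

```rust
/// Hypothetical stand-in for an object store: a flat list of object keys.
/// This is NOT the `object_store` API, only an illustration.
struct FlatStore {
    keys: Vec<String>,
}

impl FlatStore {
    /// Behaviour A (assumed): match on the raw string prefix.
    /// A prefix that names an object exactly also matches that object.
    fn list_raw_prefix(&self, prefix: &str) -> Vec<&str> {
        self.keys
            .iter()
            .map(String::as_str)
            .filter(|key| key.starts_with(prefix))
            .collect()
    }

    /// Behaviour B (assumed): treat the prefix as a directory and return
    /// only keys strictly below it, so a prefix naming a file yields nothing.
    fn list_children(&self, prefix: &str) -> Vec<&str> {
        // Append a separator so that only whole path segments match.
        let dir = format!("{}/", prefix.trim_end_matches('/'));
        self.keys
            .iter()
            .map(String::as_str)
            .filter(|key| key.starts_with(dir.as_str()))
            .collect()
    }
}
```

For a store holding only the key `file`, `list_raw_prefix("file")` returns that key, while `list_children("file")` returns an empty list, which is the result the comment above argues both kinds of store should produce.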
turning this back into a draft, since the upstream fixes are in the works ...
Description
Fixes an error in the `DeltaFileSystemHandler` when reading file metadata from remote storage. Due to an inconsistency in the behaviour of object stores when invoking list operations on a path that points to a file, we incorrectly returned a Directory type for files in the case of object stores. The bug only surfaced when using pyarrow < 9, since we used the call only when getting the file size, which we avoid when using more recent pyarrow versions.
@tustvold - I seem to vaguely remember discussing this at some point, but am not sure anymore. Is this something we should look into in object-store?
Update: validated locally that the upstream fixes will resolve the linked issue, so the main reason for this PR is addressed elsewhere. There are some changes included which save us some allocations (admittedly very few), but hopefully they are an improvement anyhow.
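As an illustration of the failure mode described above, the following std-only sketch shows how inferring a file type from a listing can misreport a file as a directory when the listing contains the queried path itself. The `FileType` enum and both functions are hypothetical and are not the code changed in this PR; the guarded variant is just one possible defence, assumed here for illustration.

```rust
/// Hypothetical result type, loosely modelled on the File / Directory /
/// NotFound distinction a filesystem handler has to report.
#[derive(Debug, PartialEq)]
enum FileType {
    File,
    Directory,
    NotFound,
}

/// Naive inference (assumed): any non-empty listing means "directory".
/// This goes wrong when the store lists the queried object itself.
fn infer_type_naive(listing: &[&str]) -> FileType {
    if listing.is_empty() {
        FileType::NotFound
    } else {
        FileType::Directory
    }
}

/// Guarded inference (one possible defence, not the PR's actual change):
/// only keys strictly below `path` count as evidence of a directory, and
/// an entry equal to `path` itself is reported as a file.
fn infer_type_guarded(path: &str, listing: &[&str]) -> FileType {
    let dir = format!("{}/", path.trim_end_matches('/'));
    if listing.iter().any(|key| key.starts_with(dir.as_str())) {
        FileType::Directory
    } else if listing.iter().any(|key| *key == path) {
        FileType::File
    } else {
        FileType::NotFound
    }
}
```

With a listing of `["file"]` for the path `file`, `infer_type_naive` reports `Directory`, whereas `infer_type_guarded` reports `File`. A store that returns an empty listing for the same path never triggers the problem in the naive variant, which is why the inconsistency between stores matters here.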
Related Issue(s)
closes #1109
Documentation