files_by_partitions() changed the return type from absolute to relative paths in 0.6.2 #894
Labels
bug
Something isn't working
Comments
Thank you for pointing this out; we should not have changed that API.
Is there any news regarding the issue? Any plans to fix it?
Sorry, I haven't gotten around to this, but I will look at it soon. IMO we have too many methods for getting files right now, and I'd like to pare them down. Here's what I'm thinking:
3 tasks
fvaleye pushed a commit that referenced this issue on Dec 27, 2022
# Description

This PR consolidates the four methods `files()`, `file_paths()`, `file_uris()`, `files_by_partitions()` into just two methods:

* `files()` -> which returns paths as they are in the Delta Log (usually relative, but *can* be absolute, particularly if they are located outside of the delta table root).
* `file_uris()`, which returns absolute URIs for all files.

Both of these now take the `partition_filters` parameter, making `files_by_partitions()` obsolete. That latter function has been marked deprecated, but it also returns it to its original behavior of returning absolute file paths and not relative ones, resolving #894.

Finally, the `partition_filters` parameter now supports passing values other than strings, such as integers and floats.

TODO:

* [x] Update documentation
* [ ] ~~Test behavior of filtering for null or non-null~~ Null handling isn't supported by DNF filters IIUC
* [x] Test behavior of paths on object stores.

# Related Issue(s)

<!--- For example: - closes #106 --->

# Documentation

<!--- Share links to useful documentation --->
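As a rough illustration of the consolidated API described in this PR, the sketch below calls `files()` and `file_uris()` with the same `partition_filters` value. The helper name `list_files` and the `year` partition column are hypothetical; the function accepts any object exposing those two methods, which is assumed here to be a `deltalake.DeltaTable` from a release that includes this change.

```python
from typing import Any, List, Tuple

# A DNF-style partition filter: (column, operator, value).
PartitionFilter = Tuple[str, str, Any]


def list_files(
    table: Any, filters: List[PartitionFilter]
) -> Tuple[List[str], List[str]]:
    """Return (paths as stored in the Delta log, absolute URIs) for a partition.

    `table` is assumed to behave like deltalake.DeltaTable after this PR,
    i.e. both methods accept a `partition_filters` keyword argument.
    """
    # Paths as recorded in the Delta log: usually relative, sometimes absolute.
    log_paths = table.files(partition_filters=filters)
    # Absolute URIs for the same set of files.
    uris = table.file_uris(partition_filters=filters)
    return log_paths, uris


# Hypothetical usage (requires the deltalake package and an existing table):
# from deltalake import DeltaTable
# dt = DeltaTable("/path/to/table")
# log_paths, uris = list_files(dt, [("year", "=", 2021)])
```

The filter value `2021` is passed as an integer here, which per the description above is accepted in addition to strings.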
Environment
Delta-rs version: 0.6.2
Binding: python
Environment:
Bug
What happened:
In previous versions (<= 0.6.1), `files_by_partitions` returned the absolute paths of the files. However, in 0.6.2, relative paths are returned.
What you expected to happen:
Minor version upgrades will not break API.
How to reproduce it:
In <= 0.6.1, `parquet_files` is an absolute path. In 0.6.2, it is relative to `delta_lake_path`.
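Until the behavior is restored, code that depends on absolute paths could normalize the result itself. The following is a minimal sketch, not part of delta-rs: the helper `to_absolute` is hypothetical, and it assumes POSIX-style local paths or URI-style roots such as `s3://...`.

```python
def to_absolute(table_root: str, file_path: str) -> str:
    """Return file_path unchanged if it is already absolute, else join it to table_root.

    Assumption: a path is "absolute" if it starts with "/" or carries a URI
    scheme such as s3:// or file://. Windows drive paths are not handled.
    """
    if "://" in file_path or file_path.startswith("/"):
        return file_path
    # Avoid a doubled separator when table_root already ends with "/".
    return table_root.rstrip("/") + "/" + file_path


# Hypothetical usage with the names from this report (not executed here;
# the "year" partition column is an assumption):
# from deltalake import DeltaTable
# dt = DeltaTable(delta_lake_path)
# parquet_files = [
#     to_absolute(delta_lake_path, p)
#     for p in dt.files_by_partitions([("year", "=", "2021")])
# ]
```

With this wrapper, the caller gets absolute paths whether the installed version returns relative or absolute ones.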
More details:
It is unclear whether this has been resolved per #880.
It sounds like https://delta-io.github.io/delta-rs/python/usage.html#custom-storage-backends could be used. However, this Delta Lake is on a mount, so it is unclear if this is needed as it appears as a local file to the system. More importantly, it was unexpected for the API to change between minor versions.
It is also worth noting that https://delta-io.github.io/delta-rs/python/usage.html#custom-storage-backends does not specify whether the path will be absolute or relative.