Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(rust): generalize the cloud storage builders #5972

Merged

Conversation

winding-lines
Copy link
Contributor

@winding-lines winding-lines commented Dec 31, 2022

This PR generalizes the parquet support for cloud urls. It enables all the backends supported by object_store.

This PR threads through the code the required cloud options, as opposed to using the process environment. This reduces the magic provided by global state (std::env) at the expense of more changes for the different layers. I personally prefer more explicit options/settings passing.

This PR also breaks the existing object_store.rs into a module, that file was getting bigger and had a poor cohesion.

Feedback appreciated.

@github-actions github-actions bot added enhancement New feature or an improvement of an existing feature rust Related to Rust Polars labels Dec 31, 2022
@winding-lines winding-lines force-pushed the feat/generalize_cloud_store_builders branch from 8451c0b to 3e098e7 Compare January 3, 2023 02:51
@winding-lines winding-lines force-pushed the feat/generalize_cloud_store_builders branch 3 times, most recently from c0b963f to 563781b Compare January 6, 2023 03:20
@winding-lines
Copy link
Contributor Author

I got some help from the developers of object-store crate, they provided an API that is nicer for us to use. Many thanks :)

This code is ready to review at your convenience. I am not sure how to fully enable it in the Python side... @ritchie46

Copy link
Member

@ritchie46 ritchie46 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @winding-lines! I have a few small remarks.

examples/read_parquet_cloud/src/main.rs Outdated Show resolved Hide resolved
polars/polars-io/Cargo.toml Outdated Show resolved Hide resolved
@winding-lines
Copy link
Contributor Author

@ritchie46 addressed feedback, tests are passing, ready for more feedback or merge 🤗

@chitralverma
Copy link
Contributor

chitralverma commented Jan 8, 2023

@winding-lines when object_store adds more backends in the future, will it require changes in polars as well or is it generalized now?

because I see that in their main/ master branch HTTP has also been added.

@winding-lines
Copy link
Contributor Author

@chitralverma in the current iteration the deltalake and object_store teams have refactored some code out of the former and put in the latter. I think we can also contribute our current layer upstream so that any future changes will be integrated just by recompiling.

Given that this PR has been open for a while my preference would be to merge it and then release a polars version so that I can do more testing at work.

Let me know what you think :)

Copy link
Member

@ritchie46 ritchie46 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @winding-lines. I have only two remarks with respect to dependencies.

If you can fix those and the conflicts it is good to go.

polars/polars-io/Cargo.toml Outdated Show resolved Hide resolved
polars/polars-core/Cargo.toml Outdated Show resolved Hide resolved
@winding-lines winding-lines force-pushed the feat/generalize_cloud_store_builders branch from baba317 to 6f2c89c Compare January 10, 2023 12:17
@winding-lines
Copy link
Contributor Author

@ritchie46 addressed feedback, tests are passing :-)

Many thanks!

@winding-lines winding-lines force-pushed the feat/generalize_cloud_store_builders branch from 7e0f848 to 4d5b9f2 Compare January 11, 2023 15:24
@ritchie46 ritchie46 merged commit 851b9bd into pola-rs:master Jan 11, 2023
@winding-lines
Copy link
Contributor Author

thanks @ritchie46 ❤️

@winding-lines winding-lines deleted the feat/generalize_cloud_store_builders branch January 11, 2023 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or an improvement of an existing feature rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants