Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: set compat_level when calling to_arrow #85

Merged
merged 6 commits into from
Jul 25, 2024

Conversation

ruihe774
Copy link
Contributor

@ruihe774 ruihe774 commented Jul 2, 2024

This depends on pola-rs/polars#17371.

In my naive benchmark, with large string columns, it improves the speed to ~10x.

@ruihe774 ruihe774 marked this pull request as ready for review July 3, 2024 14:12
@ruihe774
Copy link
Contributor Author

ruihe774 commented Jul 3, 2024

I'm still wondering how to ensure forward compatibility with future=True. When a old version of pyo3-polars is used together with a newer version of python polars which has introduced incompatible arrow data structure updates, it can be broken. One solution is to check the version of polars before setting future=True, and another solution is to introduce versioning like future=1.

As for backward compatibility, we can simply retry with future=True unset if it fails.

@ruihe774
Copy link
Contributor Author

ruihe774 commented Jul 4, 2024

pola-rs/polars#17421

@ruihe774
Copy link
Contributor Author

ruihe774 commented Jul 7, 2024

pola-rs/polars#17421

I've made this PR work with it. In PySeries::into_py and PySeries::extract_bound, newest compatibility level supported by both sides is negotiated and selected automatically. In this way, both backward compatibility and forward compatibility are achieved.

@ruihe774 ruihe774 changed the title perf: set future=True when calling to_arrow perf: set compat_level when calling to_arrow Jul 7, 2024
@@ -13,3 +13,10 @@ polars-core = { version = "0.41.0", default-features = false }
polars-ffi = { version = "0.41.0", default-features = false }
polars-plan = { version = "0.41.0", default-feautres = false }
polars-lazy = { version = "0.41.0", default-features = false }

[patch.crates-io]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we point this to the main branch until 0.42 is released.

Copy link
Contributor Author

@ruihe774 ruihe774 Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cargo only recognizes patches defined in the root project and not in dependencies. pyo3-polars is meant to be a dependency of projects that use it. So unless they declare patches in their Cargo.toml, patches are not in effect.

Patches declared here are only for testing. We can only wait for 0.42.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright. 0.42 is planned in ca. 2 weeks.

@ritchie46 ritchie46 merged commit 04a7efc into pola-rs:main Jul 25, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants