perf(python): Improve Series.to_numpy performance for chunked Series that would otherwise be zero-copy #16301
Conversation
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

@@ Coverage Diff @@
##              main   #16301      +/-   ##
==========================================
- Coverage    80.78%   80.75%   -0.03%
==========================================
  Files         1393     1393
  Lines       179362   179455      +93
  Branches      2921     2922       +1
==========================================
+ Hits        144894   144919      +25
- Misses       33965    34033      +68
  Partials       503      503

☔ View full report in Codecov by Sentry.
Nice. I like this approach. :)
Not entirely sure how codspeed works, but are these tests run by codspeed?
Yes! You can see it here: 3 new benchmarks. Also, I spotted a problem with this implementation for nested data; going to have another look before putting it in review again.
How does one register them then? A specific folder?
You just have to do
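A minimal sketch of how such a benchmark test could look, assuming the pytest-codspeed plugin and its `benchmark` fixture; the test name and data sizes are illustrative, not taken from this PR:

```python
# Hypothetical benchmark sketch (assumes pytest-codspeed provides the `benchmark` fixture).
import numpy as np
import polars as pl


def test_series_to_numpy_chunked_float32(benchmark) -> None:
    # Build a chunked float32 Series by concatenating without rechunking.
    chunk = pl.Series(np.ones(1_000_000, dtype=np.float32))
    s = pl.concat([chunk, chunk], rechunk=False)
    assert s.n_chunks() == 2

    # CodSpeed measures the wrapped call.
    benchmark(s.to_numpy)
```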
I had to add an up-front check for whether the Series has the right dtype / nested nulls; otherwise we could rechunk unnecessarily. Should be good to go now, waiting for CI to turn green 🤞
I'm seeing this raise in 0.20.27; it did not raise for me in 0.20.26. Is this expected?

pl.concat(
    [
        pl.DataFrame({"a": [1, 1, 2], "b": [2, 3, 4]}),
        pl.DataFrame({"a": [1, 1, 2], "b": [2, 3, 4]}),
    ]
).to_numpy()
# PanicException: source slice length (3) does not match destination slice length (6)

I haven't confirmed it to be from this PR, but it looked likely.
Apologies - jumped the gun here. #16288 looks more likely.
perf(python): Improve Series.to_numpy performance for chunked Series that would otherwise be zero-copy (pola-rs#16301)
Ref #16267
Instead of iterating over the values, we rechunk and create a writable view.
This regressed in #16178; now we get the best of both worlds: a writable array and only a single, fast copy.
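A rough sketch of the behavior described above, using public polars/numpy APIs; the sizes and assertions are illustrative rather than taken from the PR:

```python
import numpy as np
import polars as pl

# A chunked float32 Series: concatenating without rechunking keeps two chunks.
chunk = pl.Series(np.zeros(25_000_000, dtype=np.float32))
s = pl.concat([chunk, chunk], rechunk=False)
assert s.n_chunks() == 2

# A chunked Series cannot be converted zero-copy; the data is rechunked
# internally so only one contiguous copy is made, and the resulting
# array is writable rather than a read-only view.
arr = s.to_numpy()
assert arr.flags.writeable
assert arr.dtype == np.float32
assert len(arr) == 50_000_000
```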
Performance of converting a chunked Series of 50 million float32s:
I added some benchmark tests so that we may catch regressions here in the future.