perf(python): Improve Series.to_numpy performance for chunked Series that would otherwise be zero-copy #16301
Conversation
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

@@ Coverage Diff @@
##              main   #16301      +/-   ##
==========================================
- Coverage    80.78%   80.75%   -0.03%
==========================================
  Files         1393     1393
  Lines       179362   179455      +93
  Branches      2921     2922       +1
==========================================
+ Hits        144894   144919      +25
- Misses       33965    34033      +68
  Partials       503      503

☔ View full report in Codecov by Sentry.
Nice. I like this approach. :)
Not entirely sure how codspeed works, but are these tests run by codspeed?
Yes! You can see it here: 3 new benchmarks. Also, I spotted a problem with this implementation for nested data; going to have another look before putting it in review again.
How does one register them then? A specific folder?
You just have to do
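A minimal sketch of how such a benchmark test could look, assuming the pytest-codspeed plugin and its `benchmark` fixture; the test name and data sizes are illustrative, not taken from this PR:

```python
# Hypothetical benchmark sketch (assumes pytest-codspeed provides the `benchmark` fixture).
import numpy as np
import polars as pl


def test_series_to_numpy_chunked_float32(benchmark) -> None:
    # Build a chunked float32 Series by concatenating without rechunking.
    chunk = pl.Series(np.ones(1_000_000, dtype=np.float32))
    s = pl.concat([chunk, chunk], rechunk=False)
    assert s.n_chunks() == 2

    # CodSpeed measures the wrapped call.
    benchmark(s.to_numpy)
```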
I had to add an up-front check for whether the Series has the right dtype / nested nulls; otherwise we could rechunk unnecessarily. Should be good to go now, waiting for CI to turn green 🤞
I'm seeing this raise in 0.20.27; it did not raise for me in 0.20.26. Is this expected?

pl.concat(
    [
        pl.DataFrame({"a": [1, 1, 2], "b": [2, 3, 4]}),
        pl.DataFrame({"a": [1, 1, 2], "b": [2, 3, 4]}),
    ]
).to_numpy()
# PanicException: source slice length (3) does not match destination slice length (6)

I haven't confirmed it to be from this PR, but it looked likely.
Apologies - jumped the gun here. #16288 looks more likely.
perf(python): Improve Series.to_numpy performance for chunked Series that would otherwise be zero-copy (pola-rs#16301)
Ref #16267
Instead of iterating over the values, we rechunk and create a writable view.
This regressed in #16178; now we get the best of both worlds: a writable array and only a single, fast copy.
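A rough sketch of the behavior described above, using public polars/numpy APIs; the sizes and assertions are illustrative rather than taken from the PR:

```python
import numpy as np
import polars as pl

# A chunked float32 Series: concatenating without rechunking keeps two chunks.
chunk = pl.Series(np.zeros(25_000_000, dtype=np.float32))
s = pl.concat([chunk, chunk], rechunk=False)
assert s.n_chunks() == 2

# A chunked Series cannot be converted zero-copy; the data is rechunked
# internally so only one contiguous copy is made, and the resulting
# array is writable rather than a read-only view.
arr = s.to_numpy()
assert arr.flags.writeable
assert arr.dtype == np.float32
assert len(arr) == 50_000_000
```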
Performance of converting a chunked Series of 50 million float32s:
I added some benchmark tests so that we may catch regressions here in the future.