Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Added optimizer rules for is_null().all() and similar expressions to use null_count() #18359

Merged
merged 14 commits into from
Aug 30, 2024

Conversation

barak1412
Copy link
Contributor

@barak1412 barak1412 commented Aug 25, 2024

Fixes #17605.

Added rules:

  • is_null().all() -> null_count() == len()
  • is_null().any() -> null_count() > 0
  • is_not_null().all() -> null_count() == 0
  • is_not_null().any() -> null_count() < len()
  • is_null().sum() -> null_count()
  • is_not_null().sum() -> len() - null_count()
  • drop_nulls().len() -> len() - null_count()
  • drop_nulls().count() -> len() - null_count()

Thanks.

@github-actions github-actions bot added performance Performance issues or improvements python Related to Python Polars rust Related to Rust Polars labels Aug 25, 2024
Copy link

codecov bot commented Aug 25, 2024

Codecov Report

Attention: Patch coverage is 96.94656% with 4 lines in your changes missing coverage. Please review.

Project coverage is 79.86%. Comparing base (915f164) to head (7fa01f4).
Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
...s/polars-plan/src/plans/optimizer/simplify_expr.rs 96.00% 2 Missing ⚠️
...ars-plan/src/plans/optimizer/simplify_functions.rs 96.55% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #18359      +/-   ##
==========================================
+ Coverage   79.84%   79.86%   +0.01%     
==========================================
  Files        1497     1497              
  Lines      201828   201959     +131     
  Branches     2867     2867              
==========================================
+ Hits       161141   161286     +145     
+ Misses      40141    40127      -14     
  Partials      546      546              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@coastalwhite
Copy link
Collaborator

This PR looks good to me in general, although I am not the person who decides whether it gets merged.

Two things that stuck out to me.

  • The CI failure is probably due to you having to gate certain things with Int8 / Int16 behind the cfg(feature = "dtype-i8") / cfg(feature = "dtype-i16").
  • I don't think that macro has to be a macro. It can just be a function, if so I think is preferred to make it a function.

In general, very nice PR.

@nameexhaustion
Copy link
Collaborator

Nice work! I've left a few comments.

@barak1412
Copy link
Contributor Author

@nameexhaustion @coastalwhite

I fixed all @nameexhaustion comments, so I will be glad for review.

Regarding @coastalwhite comment about make_null_count_expr macro, do you think add it as a function to the AEpr enum (like new_null_count) will be cleaner?

Thanks.

@barak1412
Copy link
Contributor Author

@nameexhaustion

Fixed also @coastalwhite comment, now it indeed looks nicer.
Please let me know if further changes are needed, thanks.

@nameexhaustion
Copy link
Collaborator

LGTM - will need final approval from @ritchie46 to merge

There are a few benchmarks in the comments from #17605 (comment) to take note of - in some edge cases this performed slightly slower, but in most cases this should lead to good speedups.

@barak1412
Copy link
Contributor Author

barak1412 commented Aug 29, 2024

LGTM - will need final approval from @ritchie46 to merge

There are a few benchmarks in the comments from #17605 (comment) to take note of - in some edge cases this performed slightly slower, but in most cases this should lead to good speedups.

Thanks, I rebased against main since I saw the files I edited were changes.
I hope it will be merged soon.

Edit:

Final performance test in release mode:

import polars as pl
import timeit

lf = pl.select(pl.when(pl.int_range(1_000_000) % 8 < 8).then(1)).lazy()

q = lf.select(pl.first().is_null().all()).collect
t = timeit.timeit(q, number=10_000)
print("is_null().all() :: ", t)

q = lf.select(pl.first().null_count() == pl.len()).collect
t = timeit.timeit(q, number=10_000)
print(f"null_count() == len() :: {t}\n")

q = lf.select(pl.first().is_null().any()).collect
t = timeit.timeit(q, number=10_000)
print("is_null().any() :: ", t)

q = lf.select(pl.first().null_count() > 0).collect
t = timeit.timeit(q, number=10_000)
print(f"null_count() > 0 :: {t}\n")

q = lf.select(pl.first().is_not_null().all()).collect
t = timeit.timeit(q, number=10_000)
print("is_not_null().all() :: ", t)

q = lf.select(pl.first().null_count() == 0).collect
t = timeit.timeit(q, number=10_000)
print(f"null_count() == 0 :: {t}\n")

q = lf.select(pl.first().is_not_null().any()).collect
t = timeit.timeit(q, number=10_000)
print("is_not_null().any() :: ", t)

q = lf.select(pl.len() > pl.first().null_count()).collect
t = timeit.timeit(q, number=10_000)
print(f"len() > null_count() :: {t}\n")

q = lf.select(pl.first().is_null().sum()).collect
t = timeit.timeit(q, number=10_000)
print("is_null().sum() :: ", t)

q = lf.select(pl.first().null_count()).collect
t = timeit.timeit(q, number=10_000)
print(f"null_count() :: {t}\n")

q = lf.select(pl.first().is_not_null().sum()).collect
t = timeit.timeit(q, number=10_000)
print("is_not_null().sum() :: ", t)

q = lf.select(pl.len() - pl.first().null_count()).collect
t = timeit.timeit(q, number=10_000)
print(f"len() - null_count() :: {t}\n")

q = lf.select(pl.first().drop_nulls().len()).collect
t = timeit.timeit(q, number=10_000)
print("drop_nulls().len() :: ", t)

q = lf.select(pl.len() - pl.first().null_count()).collect
t = timeit.timeit(q, number=10_000)
print(f"len() - null_count() :: {t}\n")

q = lf.select(pl.first().drop_nulls().count()).collect
t = timeit.timeit(q, number=10_000)
print("drop_nulls().count() :: ", t)

q = lf.select(pl.len() - pl.first().null_count()).collect
t = timeit.timeit(q, number=10_000)
print("len() - null_count() :: ", t)
is_null().all() ::  0.927607897989219
null_count() == len() :: 0.8752323719963897

is_null().any() ::  0.07383483598823659
null_count() > 0 :: 0.0765081650170032

is_not_null().all() ::  0.07545267502428032
null_count() == 0 :: 0.07799395400797948

is_not_null().any() ::  0.9184792170126457
len() > null_count() :: 0.8450572740111966

is_null().sum() ::  0.060947869991650805
null_count() :: 0.058283796010073274

is_not_null().sum() ::  0.936902777000796
len() - null_count() :: 0.8810137850232422

drop_nulls().len() ::  0.9251011039887089
len() - null_count() :: 0.8557504759810399

drop_nulls().count() ::  0.8580696490244009
len() - null_count() ::  0.7815374790225178

@ritchie46 ritchie46 merged commit 8b65fe3 into pola-rs:main Aug 30, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance issues or improvements python Related to Python Polars rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize is_null().any() and similar operations that can be written in terms of null count
4 participants