perf: Added optimizer rules for `is_null().all()` and similar expressions to use `null_count()` #18359

barak1412 · 2024-08-25T15:45:28Z

Fixes #17605.

Added rules:

is_null().all() -> null_count() == len()
is_null().any() -> null_count() > 0
is_not_null().all() -> null_count() == 0
is_not_null().any() -> null_count() < len()
is_null().sum() -> null_count()
is_not_null().sum() -> len() - null_count()
drop_nulls().len() -> len() - null_count()
drop_nulls().count() -> len() - null_count()

Thanks.

…count_optimization

codecov · 2024-08-25T16:06:56Z

Codecov Report

Attention: Patch coverage is 96.94656% with 4 lines in your changes missing coverage. Please review.

Project coverage is 79.86%. Comparing base (915f164) to head (7fa01f4).
Report is 6 commits behind head on main.

Files with missing lines	Patch %	Lines
...s/polars-plan/src/plans/optimizer/simplify_expr.rs	96.00%	2 Missing ⚠️
...ars-plan/src/plans/optimizer/simplify_functions.rs	96.55%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #18359      +/-   ##
==========================================
+ Coverage   79.84%   79.86%   +0.01%     
==========================================
  Files        1497     1497              
  Lines      201828   201959     +131     
  Branches     2867     2867              
==========================================
+ Hits       161141   161286     +145     
+ Misses      40141    40127      -14     
  Partials      546      546

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

coastalwhite · 2024-08-25T20:04:15Z

This PR looks good to me in general, although I am not the person who decides whether it gets merged.

Two things that stuck out to me.

The CI failure is probably due to you having to gate certain things with Int8 / Int16 behind the cfg(feature = "dtype-i8") / cfg(feature = "dtype-i16").
I don't think that macro has to be a macro. It can just be a function, if so I think is preferred to make it a function.

In general, very nice PR.

crates/polars-plan/src/plans/optimizer/simplify_expr.rs

crates/polars-plan/src/plans/optimizer/simplify_functions.rs

py-polars/tests/unit/lazyframe/test_optimizations.py

crates/polars-plan/src/plans/optimizer/simplify_expr.rs

py-polars/tests/unit/lazyframe/test_optimizations.py

nameexhaustion · 2024-08-26T05:13:55Z

Nice work! I've left a few comments.

crates/polars-plan/src/plans/optimizer/simplify_expr.rs

barak1412 · 2024-08-27T15:55:04Z

@nameexhaustion @coastalwhite

I fixed all @nameexhaustion comments, so I will be glad for review.

Regarding @coastalwhite comment about make_null_count_expr macro, do you think add it as a function to the AEpr enum (like new_null_count) will be cleaner?

Thanks.

barak1412 · 2024-08-28T10:43:27Z

@nameexhaustion

Fixed also @coastalwhite comment, now it indeed looks nicer.
Please let me know if further changes are needed, thanks.

nameexhaustion · 2024-08-28T12:02:19Z

LGTM - will need final approval from @ritchie46 to merge

There are a few benchmarks in the comments from #17605 (comment) to take note of - in some edge cases this performed slightly slower, but in most cases this should lead to good speedups.

…count_optimization

barak1412 · 2024-08-29T14:28:47Z

LGTM - will need final approval from @ritchie46 to merge

There are a few benchmarks in the comments from #17605 (comment) to take note of - in some edge cases this performed slightly slower, but in most cases this should lead to good speedups.

Thanks, I rebased against main since I saw the files I edited were changes.
I hope it will be merged soon.

Edit:

Final performance test in release mode:

import polars as pl
import timeit

lf = pl.select(pl.when(pl.int_range(1_000_000) % 8 < 8).then(1)).lazy()

q = lf.select(pl.first().is_null().all()).collect
t = timeit.timeit(q, number=10_000)
print("is_null().all() :: ", t)

q = lf.select(pl.first().null_count() == pl.len()).collect
t = timeit.timeit(q, number=10_000)
print(f"null_count() == len() :: {t}\n")

q = lf.select(pl.first().is_null().any()).collect
t = timeit.timeit(q, number=10_000)
print("is_null().any() :: ", t)

q = lf.select(pl.first().null_count() > 0).collect
t = timeit.timeit(q, number=10_000)
print(f"null_count() > 0 :: {t}\n")

q = lf.select(pl.first().is_not_null().all()).collect
t = timeit.timeit(q, number=10_000)
print("is_not_null().all() :: ", t)

q = lf.select(pl.first().null_count() == 0).collect
t = timeit.timeit(q, number=10_000)
print(f"null_count() == 0 :: {t}\n")

q = lf.select(pl.first().is_not_null().any()).collect
t = timeit.timeit(q, number=10_000)
print("is_not_null().any() :: ", t)

q = lf.select(pl.len() > pl.first().null_count()).collect
t = timeit.timeit(q, number=10_000)
print(f"len() > null_count() :: {t}\n")

q = lf.select(pl.first().is_null().sum()).collect
t = timeit.timeit(q, number=10_000)
print("is_null().sum() :: ", t)

q = lf.select(pl.first().null_count()).collect
t = timeit.timeit(q, number=10_000)
print(f"null_count() :: {t}\n")

q = lf.select(pl.first().is_not_null().sum()).collect
t = timeit.timeit(q, number=10_000)
print("is_not_null().sum() :: ", t)

q = lf.select(pl.len() - pl.first().null_count()).collect
t = timeit.timeit(q, number=10_000)
print(f"len() - null_count() :: {t}\n")

q = lf.select(pl.first().drop_nulls().len()).collect
t = timeit.timeit(q, number=10_000)
print("drop_nulls().len() :: ", t)

q = lf.select(pl.len() - pl.first().null_count()).collect
t = timeit.timeit(q, number=10_000)
print(f"len() - null_count() :: {t}\n")

q = lf.select(pl.first().drop_nulls().count()).collect
t = timeit.timeit(q, number=10_000)
print("drop_nulls().count() :: ", t)

q = lf.select(pl.len() - pl.first().null_count()).collect
t = timeit.timeit(q, number=10_000)
print("len() - null_count() :: ", t)

is_null().all() ::  0.927607897989219
null_count() == len() :: 0.8752323719963897

is_null().any() ::  0.07383483598823659
null_count() > 0 :: 0.0765081650170032

is_not_null().all() ::  0.07545267502428032
null_count() == 0 :: 0.07799395400797948

is_not_null().any() ::  0.9184792170126457
len() > null_count() :: 0.8450572740111966

is_null().sum() ::  0.060947869991650805
null_count() :: 0.058283796010073274

is_not_null().sum() ::  0.936902777000796
len() - null_count() :: 0.8810137850232422

drop_nulls().len() ::  0.9251011039887089
len() - null_count() :: 0.8557504759810399

drop_nulls().count() ::  0.8580696490244009
len() - null_count() ::  0.7815374790225178

barak1412 added 9 commits August 21, 2024 09:57

2 expression were changed

09f2acb

4 expression were changed

4724c02

2 expression were changed

d066485

Merge branch 'main' of https://github.com/barak1412/polars into null_…

90827ce

…count_optimization

last working version with one test

7129433

last working version with one test

1352976

full working version with tests

6711a12

Extracted nullcount to macro

6c01386

final with linting

f70d0cb

barak1412 requested review from ritchie46, c-peters, alexander-beedie, MarcoGorelli, reswqa and orlp as code owners August 25, 2024 15:45

github-actions bot added performance Performance issues or improvements python Related to Python Polars rust Related to Rust Polars labels Aug 25, 2024

Fix UInt8->Int8 due to failed test

ee956b8