cache branching subplans optimization prevents parallel concat of LazyFrames #17430
Closed
2 tasks done
Labels
accepted
Ready for implementation
bug
Something isn't working
needs triage
Awaiting prioritization by a maintainer
performance
Performance issues or improvements
python
Related to Python Polars
Checks
Reproducible example
Log output
No response
Issue description
CPU usage is at 100% in this case. The filters on each of the 16 df segments happen in series. If we instead run with `comm_subplan_elim=False` we get the expected behaviour of 1600% CPU usage, and the overall run time is reduced by about half.
Our actual use case involves concatenated LazyFrames from `scan_ipc` which go through a `join_asof`. We branch into two related queries before concatenating the result (just like in this example). In that scenario the runtime is reduced substantially (5-8x) by running with `comm_subplan_elim=False` due to parallelism. However, it takes twice as long as the single query takes, since the `scan_ipc`, filtering, and joining have to happen twice.
I don't know if this is a known limitation, but it would be great to have the subplan branch cache working in combination with parallel concat.
For the simplified example presented here reduced to 3 source dataframes instead of 16, here is the query plan without subplan branch cache:
And here it is with branch cache:
Expected behavior
Expect the query to run in parallel, with higher CPU usage and overlapping start and end times in `.profile()`.
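One way to check for overlapping node timings is to count how many pairs of (start, end) intervals overlap. The helper below is a hypothetical sketch in plain Python, not part of Polars; it assumes each interval is a `(start, end)` pair with `start <= end`, and treats intervals that merely touch as non-overlapping.

```python
def count_overlapping_pairs(intervals):
    """Count pairs of (start, end) intervals that overlap in time."""
    ordered = sorted(intervals)  # sort by start time
    overlaps = 0
    for i, (_, end_i) in enumerate(ordered):
        for start_j, _ in ordered[i + 1:]:
            if start_j < end_i:
                # Interval j starts before interval i has finished.
                overlaps += 1
            else:
                # Later intervals start even later, so none can overlap i.
                break
    return overlaps


serial = [(0, 10), (10, 20), (20, 30)]    # strictly back to back
parallel = [(0, 10), (5, 15), (8, 9)]     # all three overlap each other
serial_count = count_overlapping_pairs(serial)
parallel_count = count_overlapping_pairs(parallel)
```

With Polars, `_, timings = query.profile()` returns a timing frame, and `zip(timings["start"], timings["end"])` could be passed to this helper. A count of zero would be consistent with the serial execution described above, though nested nodes in the profile may need to be filtered out first.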
Installed versions