Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: CSE don't scan share if predicate pushdown predicates don't match #15328

Merged
merged 6 commits into from
Mar 27, 2024

Conversation

ritchie46
Copy link
Member

@ritchie46 ritchie46 commented Mar 27, 2024

In this case we will favor memory reduction over scan sharing.

/// 1. This will ensure that all equal caches communicate the amount of columns
/// they need to project.
/// 2.
/// - This will ensure we apply predicate in the subtrees below the caches.
/// - If the predicate above the cache is the same for all matching caches that filter will be applied
///  as well.
///
/// # Example
/// Consider this tree, where `SUB-TREE` is duplicate and can be cached.
///
///
///                         Tree
///                         |
///                         |
///    |--------------------|-------------------|
///    |                                        |
///    SUB-TREE                                 SUB-TREE
///
/// STEPS:
/// - 1 CSE will run and will insert cache nodes
///
///                         Tree
///                         |
///                         |
///    |--------------------|-------------------|
///    |                                        |
///    | CACHE 0                                | CACHE 0
///    |                                        |
///    SUB-TREE                                 SUB-TREE
///
/// - 2 predicate and projection pushdown will run and will insert optional FILTER and PROJECTION above the caches
///
///                         Tree
///                         |
///                         |
///    |--------------------|-------------------|
///    | FILTER (optional)                      | FILTER (optional)
///    | PROJ (optional)                        | PROJ (optional)
///    |                                        |
///    | CACHE 0                                | CACHE 0
///    |                                        |
///    SUB-TREE                                 SUB-TREE
///
/// # Projection optimization
/// The union of the projection is determined and the projection will be pushed down.
///
///                         Tree
///                         |
///                         |
///    |--------------------|-------------------|
///    | FILTER (optional)                      | FILTER (optional)
///    | CACHE 0                                | CACHE 0
///    |                                        |
///    SUB-TREE                                 SUB-TREE
///    UNION PROJ (optional)                    UNION PROJ (optional)
///
/// # Filter optimization
/// Depending on the predicates the predicate pushdown optimization will run.
/// Possible cases:
/// - NO FILTERS: run predicate pd from the cache nodes -> finish
/// - Above the filters the caches are the same -> run predicate pd from the filter node -> finish
/// - There is a cache without predicates above the cache node -> run predicate form the cache nodes -> finish
/// - The predicates above the cache nodes are all different -> remove the cache nodes -> finish

@github-actions github-actions bot added enhancement New feature or an improvement of an existing feature python Related to Python Polars rust Related to Rust Polars labels Mar 27, 2024
@ritchie46 ritchie46 changed the title feat: CSE don't scan share if predicate pushdown fitlers don't match feat: CSE don't scan share if predicate pushdown predicates don't match Mar 27, 2024
Copy link

codecov bot commented Mar 27, 2024

Codecov Report

Attention: Patch coverage is 72.55814% with 59 lines in your changes are missing coverage. Please review.

Project coverage is 81.35%. Comparing base (03c5f73) to head (85ba9ce).
Report is 6 commits behind head on main.

Files Patch % Lines
crates/polars-plan/src/dsl/expr.rs 0.00% 47 Missing ⚠️
...rs-plan/src/logical_plan/optimizer/cache_states.rs 92.20% 12 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #15328      +/-   ##
==========================================
- Coverage   81.36%   81.35%   -0.01%     
==========================================
  Files        1364     1365       +1     
  Lines      176612   176861     +249     
  Branches     2525     2531       +6     
==========================================
+ Hits       143694   143879     +185     
- Misses      32434    32497      +63     
- Partials      484      485       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ritchie46 ritchie46 merged commit af81de0 into main Mar 27, 2024
18 checks passed
@ritchie46 ritchie46 deleted the cache branch March 27, 2024 20:33
@c-peters c-peters added the accepted Ready for implementation label Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted Ready for implementation enhancement New feature or an improvement of an existing feature python Related to Python Polars rust Related to Rust Polars
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants