perf: Avoid redundant copying of arrays in scan->filter->join #762
Which issue does this PR close?
Closes #757
Rationale for this change
Improve performance of joins by removing some redundant copying of arrays for join inputs.
What changes are included in this PR?
This PR adds a new `CometFilterExec`, which is a copy of DataFusion's `FilterExec` with one small change to ensure that input arrays are never emitted without copying.
In `planner.rs` there is a check for join inputs that decides whether to wrap the input in a `CopyExec`. Because `can_reuse_input_batch` returns `false` for `CometFilterExec`, we are no longer creating the `CopyExec`.
How are these changes tested?
Existing tests
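
For readers unfamiliar with the join-input check described under "What changes are included in this PR?", the following is a minimal, self-contained sketch of the idea. It is a toy model, not the code in `planner.rs`: the `Plan` enum and `prepare_join_input` are hypothetical stand-ins, and the assumption that a plain `FilterExec` may pass its input arrays through unchanged is taken from the description above rather than verified here.

```rust
// Toy model only: `Plan` and `prepare_join_input` are hypothetical
// stand-ins for illustration, not the Comet or DataFusion API.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan,
    FilterExec(Box<Plan>),
    CometFilterExec(Box<Plan>),
    CopyExec(Box<Plan>),
}

/// Returns true when the operator may emit arrays whose buffers are
/// later reused by its input, so a downstream join cannot hold on to them.
fn can_reuse_input_batch(plan: &Plan) -> bool {
    match plan {
        // Assumption: the scan reuses its buffers between batches.
        Plan::Scan => true,
        // Assumption: a plain filter can emit input arrays without copying.
        Plan::FilterExec(_) => true,
        // CometFilterExec always copies, so its output is safe to keep.
        Plan::CometFilterExec(_) => false,
        // CopyExec exists precisely to produce owned copies.
        Plan::CopyExec(_) => false,
    }
}

/// Wraps a join input in CopyExec only when its batches might be reused.
fn prepare_join_input(plan: Plan) -> Plan {
    if can_reuse_input_batch(&plan) {
        Plan::CopyExec(Box::new(plan))
    } else {
        plan
    }
}
```

In this model, a join input of `FilterExec(Scan)` is wrapped in `CopyExec`, while `CometFilterExec(Scan)` is passed to the join as-is, which is where the extra copy is avoided.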