Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-28169][SQL] Convert scan predicate condition to CNF #28805
[SPARK-28169][SQL] Convert scan predicate condition to CNF #28805
Changes from 27 commits
3356bac
346a1b4
250c7b3
15d62be
39e85ad
d8f7c9e
8856453
697a3a9
7e8319e
3734866
b253af3
478a7a8
603660b
69f1763
2f576fa
e71c45c
94609c8
326fb49
9322ae6
4a2adcd
0e2579d
270324e
219f200
f21cf43
35b5813
3df019a
1b8466e
e2777c9
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, the method name
conjunctiveNormalFormAndGroupExpsByQualifier
is too long and theAnd
is weird.How about changing to
CNFWithGroupExpressionsByQualifier
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit format:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about changing to
CNFWithGroupExpressionsByReference
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to group by reference here? Can you explain the rationale a bit more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have discuss this problem in #28733 (comment)
The demo in that comment can show why for partition pruning we need to use reference why not qualifier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't see the rationale. What if we don't do the group by and simply apply the CNF conversion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Such a case
if we don't group by reference, the condition
(dt = 2 and id < 100 and id > 20 ) or dt = 3
will be converted tobut we know that only
(dt = 3 or dt = 2)
can be predicated as partition pruning, we can combineid < 100
andid > 20
by grouByReference, and return asIn other word , since in final strategies of partition pruning, we partition predicate filter by judge if it's references is subset of partCols, if we combine condition group by reference, Here we result with
or
expression andor
can't be split bysplitConjunctivePredicates
, it won't impact final push down result.It's ok to just simply apply CNF rule, but group by references can avoid generate unnecessary expression to control the length of generated final exprs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, in brief: