-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pushdown more when visit Project Node. #1114
Conversation
Could you post |
// which come from the node, as opposed to an enclosing scope. | ||
Set<Symbol> childOutputSet = ImmutableSet.copyOf(node.getOutputSymbols()); | ||
Map<Symbol, Long> dependencies = SymbolsExtractor.extractAll(expression).stream() | ||
.filter(childOutputSet::contains) | ||
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting())); | ||
|
||
return dependencies.entrySet().stream() | ||
.allMatch(entry -> entry.getValue() == 1 || node.getAssignments().get(entry.getKey()) instanceof Literal); | ||
.allMatch(entry -> entry.getValue() <= 4 || node.getAssignments().get(entry.getKey()) instanceof Literal); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changing it to 4
might cause planning to explode exponentially (in fact even 2 might cause it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it won't, because 4
here means that the biggest explosion is limited to 4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will cause some explosion in the number of terms. E.g., for a query like:
WITH
t1 (v) AS (VALUES 1),
t2 AS( select if(v = 0, v, v) v from t1 ),
t3 AS( select if(v = 0, v, v) v from t2 ),
t4 AS( select if(v = 0, v, v) v from t3 ),
t5 AS( select if(v = 0, v, v) v from t4 ),
t6 AS( select if(v = 0, v, v) v from t5 ),
t7 AS( select if(v = 0, v, v) v from t6 ),
t8 AS( select if(v = 0, v, v) v from t7 ),
t9 AS( select if(v = 0, v, v) v from t8 ),
t10 AS( select if(v = 0, v, v) v from t9 ),
t11 AS( select if(v = 0, v, v) v from t10 ),
t12 AS( select if(v = 0, v, v) v from t11 ),
t13 AS( select if(v = 0, v, v) v from t12 ),
t14 AS( select if(v = 0, v, v) v from t13 ),
t15 AS( select if(v = 0, v, v) v from t14 ),
t16 AS( select if(v = 0, v, v) v from t15 )
select *
from t16
where v = 0
One of the filters will be:
((CASE
WHEN ((CASE WHEN ("expr_62" = 0) THEN "expr_62" ELSE "expr_62" END) = 0)
THEN
(CASE WHEN ("expr_62" = 0) THEN "expr_62" ELSE "expr_62" END)
ELSE
(CASE WHEN ("expr_62" = 0) THEN "expr_62" ELSE "expr_62" END) END
) = 0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I think this explosion is controlled by <= 4
. Or we can change it smaller, but in my case, == 1
will make OR
's pushdown fail.
It's just a workaround I think, not a final solution of OR
's pushdown.
It seems that such OR expressions should be really rewritten to |
Sure.
And when it set to 4, the plan will be:
|
OK. Thanks! |
Closed via #1515 |
Due to commit 6dcd67
Inline only simple expressions in predicate pushdown
,the SQL with pattern like
will scan full table data rather than the data in such 2 partitions.
One of the reason is the function
isInliningCandidate
thinksreferences to complex expressions that appear only once (entry.getValue() == 1)
can go through the predicate pushdown on Project node, butpartition_key
appears 2 times in above case.I think the above pattern is common, and
== 1
is too aggressive, so maybe<= 4(8/16)
is more reasonable.