planner/cascades: add transformation rule EliminateSingleMaxMin #14274
Sorry for addressing the comments late. PTAL @zz-jason @francis0407 @lzmhhh123
Update. PTAL @francis0407 @lzmhhh123
// Since now there would be at most one row returned, the remained agg operator is not expensive anymore.
newAggExpr.SetChildren(childGroup)
newAggExpr.AddAppliedRule(r)
return []*memo.GroupExpr{newAggExpr}, true, false, nil
I think we should consider more carefully whether to erase the old plan piece. Consider this situation:
create table t(a bigint, b bigint); -- no index
select max(a) from t;
- The old plan may be: TableScan -> Aggregate(max(a))
- The new plan with TopN is: TableScan -> TopN(a, 1) -> Aggregate(max(a))
Since both TableScans are full-range table scans, it's hard to tell whether the new plan is better than the old one. Maybe we should keep the old one in this situation.
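To make the concern above concrete, here is a minimal, self-contained sketch (not TiDB code) that models both plan shapes over an unindexed column. The helpers `scan`, `top1Desc` and `maxAgg` are hypothetical stand-ins for TableScan, TopN(a, 1) and Aggregate(max(a)), and NULL handling is left out for brevity. Both plans read every row, so the TopN only shrinks the input of the final aggregate.

```go
package main

import "fmt"

// scan models a full-range TableScan on an unindexed column.
// It returns all values plus the number of rows it had to read.
func scan(col []int64) ([]int64, int) {
	return col, len(col)
}

// top1Desc models TopN(a desc, 1): it keeps only the largest value.
func top1Desc(in []int64) []int64 {
	if len(in) == 0 {
		return nil
	}
	best := in[0]
	for _, v := range in[1:] {
		if v > best {
			best = v
		}
	}
	return []int64{best}
}

// maxAgg models Aggregate(max(a)); ok is false for empty input.
func maxAgg(in []int64) (result int64, ok bool) {
	for i, v := range in {
		if i == 0 || v > result {
			result, ok = v, true
		}
	}
	return result, ok
}

func main() {
	col := []int64{3, 9, 4}

	// Old plan: TableScan -> Aggregate(max(a))
	rowsOld, readOld := scan(col)
	oldMax, _ := maxAgg(rowsOld)

	// New plan: TableScan -> TopN(a, 1) -> Aggregate(max(a))
	rowsNew, readNew := scan(col)
	newMax, _ := maxAgg(top1Desc(rowsNew))

	fmt.Println("old:", oldMax, "rows read:", readOld)
	fmt.Println("new:", newMax, "rows read:", readNew)
}
```

In this toy model the two plans return the same value and read the same number of rows, which is why erasing the old group expression unconditionally looks questionable when no index is available.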
@zz-jason @francis0407 @lzmhhh123 Update. PTAL
@lzmhhh123 @francis0407 @zz-jason Update. PTAL
}

// Only one max() or min() in the AggFuncs slice, and
// the other aggregate functions in the AggFuncs slice should be FirstRow().
Can we really eliminate max/min in this case? After the transformation, there is a selection between the source and the agg, so the first row becomes the first row whose max/min argument is not null. Is that expected?
It's unexpected. In MySQL 5.7:
mysql> select * from t;
+------+------+
| a    | b    |
+------+------+
| NULL |    1 |
|    1 |    2 |
+------+------+
2 rows in set (0.00 sec)
mysql> select max(a), b from t;
+--------+------+
| max(a) | b    |
+--------+------+
|      1 |    1 |
+--------+------+
1 row in set (0.00 sec)
I'll fix it. Thanks for your review.
It seems it's not easy to include the first_row() aggFunc in this transformation rule. Alternatively, we can add a check in the match function: we can't use this rule when both conditions hold at the same time, that is, there is a first_row() and the column for the max/min aggFunc can be NULL. But I think that adds too many restrictions and increases complexity. What's your opinion? @francis0407 @lamxTyler @zz-jason
@lamxTyler good catch!
I think we can revert this new feature. It’s hard to do that in the current framework. Just transform max/min like what we did before.
cc @zz-jason
I think we can rewrite the Match() condition to this:
- only one max or min, we can apply this transformation
- only one max or min, and all the max or min arguments are not null, we can apply this transformation as well.
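One possible reading of the two conditions above, sketched as a standalone predicate. Everything here is an assumption made for illustration: the `aggFunc` type, its `argNotNull` field and `matchSketch` are hypothetical and not part of the cascades planner. The sketch also assumes that the second condition refers to a max/min accompanied by first_row() functions, as in the comment under review.

```go
package main

import "fmt"

// aggFunc is a hypothetical, simplified aggregate descriptor.
type aggFunc struct {
	name       string // "max", "min", "firstrow", ...
	argNotNull bool   // assumed to be known; meaningful for max/min only
}

func isMaxMin(f aggFunc) bool { return f.name == "max" || f.name == "min" }

// matchSketch reports whether the rule could apply under this reading:
//   - a single max/min with no other aggregate always matches;
//   - a single max/min plus first_row() functions matches only when
//     the max/min argument cannot be NULL.
func matchSketch(funcs []aggFunc) bool {
	maxMinCount, allNotNull := 0, true
	for _, f := range funcs {
		switch {
		case isMaxMin(f):
			maxMinCount++
			allNotNull = allNotNull && f.argNotNull
		case f.name != "firstrow":
			return false // any other aggregate disables the rule
		}
	}
	if maxMinCount != 1 {
		return false
	}
	return len(funcs) == 1 || allNotNull
}

func main() {
	fmt.Println(matchSketch([]aggFunc{{"max", false}}))
	fmt.Println(matchSketch([]aggFunc{{"max", false}, {"firstrow", false}}))
	fmt.Println(matchSketch([]aggFunc{{"min", true}, {"firstrow", false}}))
}
```

The second case is the one from the MySQL session earlier in the thread: a nullable max argument next to a first_row(), where the inserted selection would change which row first_row() sees.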
BTW, how does Calcite handle this problem?
Another problem is that currently we cannot check whether the arguments are not null. mysql.HasNotNullFlag does not work here, since operators like OuterJoin may generate null values on not-null columns.
We can add a TODO here and wait until the not-null property is maintained.
Update. PTAL! @francis0407 @winoros
LGTM
LGTM
/run-all-tests
What problem does this PR solve?
This PR adds the transformation rule EliminateSingleMaxMin, which is part of global max/min elimination in the cascades planner.
Part of #13709
What is changed and how it works?
The logic is the same as planner/core/rule_max_min_eliminate.go/maxMinEliminator.eliminateSingleMaxMin.
Check List
Tests