-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrap potentially conditional expressions with IF in PreAggregateCaseAggregations #20652
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx! Could you add a test case?
...ino-main/src/main/java/io/trino/sql/planner/iterative/rule/PreAggregateCaseAggregations.java
Outdated
Show resolved
Hide resolved
@sopel39 Thanks for taking a look! I just realized there can be other failing cases, e.g.,
This example is slightly different in that the CAST appears as the direct argument of the aggregation. I think we need to do something similar to limit the applicability here. But what's tricky is we don't know what type of Expression it is at this point (it may just be a symbol like |
I think the problem might be in
and some other checks @weiatwork do you think you will be able to fix this? |
it could be that |
I think regardless the type, the problem is due to the over-generous qualification for CAST in expression like this:
It shouldn't be allowed in the first place, as pre-aggregating on a CAST can also have side effect. I added this additional restriction, as well as updating the UT. Please take a look. |
BW: the query is correctly rewritten. It correctly fails with |
Yes, VARCHAR 'd' cannot be converted to INT, but the conversion shouldn't have been attempted in the first place: it is conditional based on the context provided by the specific case statement. The optimization's pre-aggregation makes the conversion unconditional, thus causing the failure. |
@martint what does spec says about it? I would guess we could wrap pre-aggregate expression in |
In a CASE statement, the conditions are expected to be evaluated eagerly, while the branches only get evaluated for the first condition that matches. So, in the example shown in the description above, this should never fail: |
@weiatwork would you like to work on a fix? For expressions that can fail we need to wrap them in expression similar to |
@sopel39 The wrapping approach sounds good. I've identified two scenarios that are subject to this treatment:
|
@weiatwork there is also division by zero etc.
I think these end-up being same plan really even before input to pre-aggregate rule |
The conditional expressions that could potentially fail (incl. division by zero) are covered by You're right the two use cases produce the same plan. I just wanted to exemplify what query patterns can be subject to such failures. |
@sopel39 Do you have any other concerns regarding the fix? |
No. I think it should do the job |
Can we have this merged? @sopel39 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a query test under io.trino.sql.query that replicates the issue before the fix is applied?
...ino-main/src/main/java/io/trino/sql/planner/iterative/rule/PreAggregateCaseAggregations.java
Outdated
Show resolved
Hide resolved
...ino-main/src/main/java/io/trino/sql/planner/iterative/rule/PreAggregateCaseAggregations.java
Outdated
Show resolved
Hide resolved
@@ -347,9 +349,11 @@ private Optional<CaseAggregation> extractCaseAggregation(Symbol aggregationSymbo | |||
Symbol projectionSymbol = Symbol.from(aggregation.getArguments().get(0)); | |||
Expression projection = projectNode.getAssignments().get(projectionSymbol); | |||
Expression unwrappedProjection; | |||
boolean wrappedByCast = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martint do you recall if we have a visitor checking whether expression can throw an exception?
Maybe only then we could wrap expression in IF
...main/src/test/java/io/trino/sql/planner/iterative/rule/TestPreAggregateCaseAggregations.java
Outdated
Show resolved
Hide resolved
...ino-main/src/main/java/io/trino/sql/planner/iterative/rule/PreAggregateCaseAggregations.java
Outdated
Show resolved
Hide resolved
...ino-main/src/main/java/io/trino/sql/planner/iterative/rule/PreAggregateCaseAggregations.java
Outdated
Show resolved
Hide resolved
...ino-main/src/main/java/io/trino/sql/planner/iterative/rule/PreAggregateCaseAggregations.java
Show resolved
Hide resolved
...ino-main/src/main/java/io/trino/sql/planner/iterative/rule/PreAggregateCaseAggregations.java
Outdated
Show resolved
Hide resolved
@@ -410,13 +468,22 @@ private Optional<CaseAggregation> extractCaseAggregation(Symbol aggregationSymbo | |||
toSqlType(aggregationType))); | |||
} | |||
|
|||
Expression whenOperand = caseExpression.getWhenClauses().getFirst().getOperand(); | |||
Expression whenResult = caseExpression.getWhenClauses().getFirst().getResult(); | |||
if (wrappedByCast || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we maybe extend it to simple aritmetic (e.g. mul, addition etc). @martint wdyt? Would it be better to whitelist here or blacklist?
@sopel39 In my latest revision, there is still one failing case - the CAST-wrapped CASE: #20652 (comment) In my previous approach, this case was covered, because of the "early wrapping" in I'm thinking of creating a mini-rule, something like
|
core/trino-main/src/test/java/io/trino/sql/query/TestPreAggregateCaseAggregations.java
Outdated
Show resolved
Hide resolved
...ino-main/src/main/java/io/trino/sql/planner/iterative/rule/PreAggregateCaseAggregations.java
Outdated
Show resolved
Hide resolved
...main/src/test/java/io/trino/sql/planner/iterative/rule/TestPreAggregateCaseAggregations.java
Show resolved
Hide resolved
...main/src/test/java/io/trino/sql/planner/iterative/rule/TestPreAggregateCaseAggregations.java
Show resolved
Hide resolved
looks much easier, small nits |
...ino-main/src/main/java/io/trino/sql/planner/iterative/rule/PreAggregateCaseAggregations.java
Outdated
Show resolved
Hide resolved
...ino-main/src/main/java/io/trino/sql/planner/iterative/rule/PreAggregateCaseAggregations.java
Show resolved
Hide resolved
thx! |
Description
There can be query failures with
PreAggregateCaseAggregations
.Repro:
The problem is due to overgeneralized application of the optimization rule. The above example shows a type compatibility issue, but other issues such as division by zero can also happen.
Note: I have not modified the test cases in
TestPreAggregateCaseAggregations
since it has many uses ofCAST
. I can update it once this fix is deemed feasible (not sure whether an allowlist or a denylist makes more sense).Related: #12548
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
(X) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: