
Invalid aggregate SQL query with HAVING can be executed without error (SQLancer-TLP) #12013

Closed
2010YOUY01 opened this issue Aug 15, 2024 · 3 comments · Fixed by #12046
Labels: bug (Something isn't working)

@2010YOUY01 (Contributor)

Describe the bug

The following two queries cannot be executed in DuckDB/Postgres, but are executed without error in DataFusion.

See the reproducer in datafusion-cli:

DataFusion CLI v41.0.0
> create table t1(v1 int);
0 row(s) fetched.
Elapsed 0.069 seconds.

> insert into t1 values (1),(2),(3);
+-------+
| count |
+-------+
| 3     |
+-------+
1 row(s) fetched.
Elapsed 0.049 seconds.

> select * from t1 having max(v1)=3;
+------------+
| max(t1.v1) |
+------------+
| 3          |
+------------+
1 row(s) fetched.
Elapsed 0.037 seconds.

> select * from t1 having max(v1)=1;
+------------+
| max(t1.v1) |
+------------+
+------------+
0 row(s) fetched.
Elapsed 0.006 seconds.

It looks like * is expanded to the aggregate function expression, i.e. max(v1), inside the HAVING clause.
I don't think such semantics are meaningful, so I'm reporting it as a potential bug.
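
For reference, Postgres rejects the same query at planning time, since a plain column cannot appear in the select list of an aggregate query without being grouped. The exact wording varies by version, but it is roughly:

postgres=# select * from t1 having max(v1)=3;
ERROR:  column "t1.v1" must appear in the GROUP BY clause or be used in an aggregate function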

To Reproduce

No response

Expected behavior

No response

Additional context

Found by SQLancer #11030

@2010YOUY01 added the bug label on Aug 15, 2024
@goldmedal (Contributor)

take

@jayzhan211 (Contributor)

I guess this issue is similar to #11897 #11748

@goldmedal (Contributor)

> I guess this issue is similar to #11897 #11748

I didn't dig into them, but they may be related 🤔

After some research, I believe this is a bug introduced by #11681. When planning an aggregation, we perform some validations at:

// finally, we have some validation that the re-written projection can be resolved
// from the aggregate output columns
check_columns_satisfy_exprs(
    &column_exprs_post_aggr,
    &select_exprs_post_aggr,
    "Projection references non-aggregate values",
)?;

// Rewrite the HAVING expression to use the columns produced by the
// aggregation.
let having_expr_post_aggr = if let Some(having_expr) = having_expr_opt {
    let having_expr_post_aggr =
        rebase_expr(having_expr, &aggr_projection_exprs, input)?;

    check_columns_satisfy_exprs(
        &column_exprs_post_aggr,
        &[having_expr_post_aggr.clone()],
        "HAVING clause references non-aggregate values",
    )?;

However, after moving wildcard expansion to the analyzer, we can no longer determine the projection expressions at this point, so these validations cannot catch the invalid query and an invalid plan is produced. Essentially, it causes another bug.

Consider the following case:

create table t1(v1 int, v2 int);
insert into t1 values (1,2),(2,4),(3,6);
select * from t1 group by v1, v2 having max(v1)=3;
----
+----+----+------------+
| v1 | v2 | max(t1.v1) |
+----+----+------------+
| 3  | 6  | 3          |
+----+----+------------+

The aggregate column used by the HAVING clause shouldn't be included in the output. This issue relates to how wildcards are expanded based on a filter plan.
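
For comparison, the expected output for this query, assuming * expands only to the table's own columns (which is what DuckDB and Postgres return), would be:

+----+----+
| v1 | v2 |
+----+----+
| 3  | 6  |
+----+----+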

I attempted a rough fix for this issue but encountered another problem with error messages. Consider the SQL provided by @2010YOUY01:

select * from t1 having max(v1)=3;
---
Error: expand_wildcard_rule
caused by
Schema error: No field named t1.v1. Valid fields are "max(t1.v1)".

In the previous version (41.0.0), the message would have been:

> select * from t1 having max(v1)=3;
Error during planning: Projection references non-aggregate values: Expression t1.v1 could not be resolved from available columns: max(t1.v1)

I think we can expand the wildcard when planning the aggregation, for validation purposes, if the group-by keys are empty. When the group-by keys are empty, a wildcard projection can never be valid together with a HAVING clause, so the check can report a better error message.

However, if the group-by keys aren't empty and we don't expand the wildcard, it becomes difficult to provide a correct error message: the SQL will instead fail when the ExpandWildcardRule analyzer rule runs. For example:

create table t1(v1 int, v2 int);
insert into t1 values (1, 2),(2,3),(3,4);
select * from t1 group by v1 having max(v1)=3
----
Error: expand_wildcard_rule
caused by
Schema error: No field named t1.v2. Valid fields are t1.v1, "max(t1.v1)".

In the previous version (41.0.0), the error message would have been:

> select * from t2 group by v1 having max(v2) = 2;
Error during planning: Projection references non-aggregate values: Expression t2.v2 could not be resolved from available columns: t2.v1, max(t2.v2)

To avoid duplicate expansion, I'd prefer not to expand the wildcard when the group-by keys aren't empty. However, this results in an unclear error message for the user 🤔.

I'll draft a PR to explain this more clearly.

cc @jayzhan211
