Invalid aggregate SQL query with HAVING can be executed without error (SQLancer-TLP)
#12013
Comments
take
I didn't dig into this issue, but it may be related 🤔
After some research, I believe this is a bug introduced by #11681. When planning an aggregation, we perform some validations at datafusion/datafusion/sql/src/select.rs, lines 752 to 770 in cb1e3f0.
However, after moving the wildcard expansion to the analyzer, we can no longer determine the projection expressions at this point. As a result, the validations no longer catch these cases, and an invalid plan is produced. Essentially, it causes another bug. Consider the following case:
The aggregation field for the
I attempted a rough fix for this issue but ran into another problem with the error messages. Consider the SQL provided by @2010YOUY01:
In the previous version (41.0.0), the message would have been:
I think we can expand the wildcard for validation when planning the aggregation, provided the group-by keys are empty. I can ensure that if the group-by keys are empty, the SQL isn't valid for the
However, if the group-by keys aren't empty and we don't expand the wildcard, it becomes difficult to provide a correct error message. The SQL will fail when invoking
In the previous version (41.0.0), the error message would have been:
To avoid duplicate expansion, I would prefer not to expand the wildcard when the group-by keys aren't empty. However, this results in an unclear error message for the user 🤔. I'll draft a PR to explain this more clearly.
cc @jayzhan211
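The validation problem described in this comment can be illustrated with a small, self-contained sketch. This is not DataFusion's code: the classes, the function names, the `expand_when_no_group_by` flag, and the error wording are all hypothetical, and the query in the comments is only an assumed example of the pattern from the report (a wildcard projection combined with an aggregate in HAVING). The sketch shows why a check of the form "every projected column must be a group-by key or sit inside an aggregate" passes silently while the wildcard is still unexpanded, and rejects the query once the wildcard is expanded for validation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Aggregate:
    func: str
    arg: Column


@dataclass(frozen=True)
class Wildcard:
    """Stands for an unexpanded `*` in the projection."""


def expand_wildcard(exprs, schema):
    """Replace each Wildcard with one Column per field of the (hypothetical) schema."""
    expanded = []
    for expr in exprs:
        if isinstance(expr, Wildcard):
            expanded.extend(Column(name) for name in schema)
        else:
            expanded.append(expr)
    return expanded


def validate_aggregate_projection(select_exprs, group_by, schema, expand_when_no_group_by):
    """Reject plain columns that are neither group-by keys nor inside an aggregate.

    An unexpanded Wildcard is not a Column, so it slips through the check.
    """
    if expand_when_no_group_by and not group_by:
        # The idea from the comment: expand only when there are no group-by keys.
        select_exprs = expand_wildcard(select_exprs, schema)
    bad = [e for e in select_exprs if isinstance(e, Column) and e not in group_by]
    if bad:
        names = ", ".join(c.name for c in bad)
        # Hypothetical wording, not the message DataFusion emits.
        raise ValueError(
            f"column(s) {names} must appear in GROUP BY or be used in an aggregate function"
        )
    return select_exprs


# Assumed example query: SELECT * FROM t HAVING max(v1) > 0, with t(v1, v2).
schema = ["v1", "v2"]

# Wildcard left unexpanded: the check sees no plain columns and accepts the query.
validate_aggregate_projection([Wildcard()], [], schema, expand_when_no_group_by=False)

# Wildcard expanded for validation: v1 and v2 are plain columns, so the query is rejected.
try:
    validate_aggregate_projection([Wildcard()], [], schema, expand_when_no_group_by=True)
except ValueError as err:
    print(err)
```

The sketch only expands when the group-by list is empty, mirroring the trade-off in the comment: expansion happens once for validation in the simple case, while the non-empty group-by case is left to a later stage, at the cost of a less precise error message.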
Describe the bug
The following 2 queries cannot be executed in DuckDB/Postgres, but execute without error in DataFusion.
See reproducer in datafusion-cli
It looks like * is expanded to the aggregate function expression, i.e. max(v1), inside the HAVING clause. I don't think such semantics are meaningful, so I am reporting it as a potential bug.
To Reproduce
No response
Expected behavior
No response
Additional context
Found by SQLancer #11030