fix(sql): support set operations wrapping subqueries #8414

Merged
merged 1 commit into from
Mar 7, 2024


Member

@jcrist jcrist commented Feb 21, 2024

Previously, a set operation (e.g. a UNION) would generate invalid SQL if its subqueries repeated components of a SELECT (e.g. included an ORDER BY clause).

Fixes #8561.
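
For illustration, here is a minimal sketch of the failure mode, using the standard-library `sqlite3` module as a stand-in backend. The table `t`, its column `a`, and the exact wrapping syntax are assumptions made for the example, not necessarily the SQL that ibis emits.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INTEGER)")
con.executemany("INSERT INTO t VALUES (?)", [(2,), (1,)])

# The first operand carries its own ORDER BY directly before the UNION ALL.
# The parser rejects this.
bare = "SELECT a FROM t ORDER BY a UNION ALL SELECT a FROM t ORDER BY a"
try:
    con.execute(bare)
    bare_ok = True
except sqlite3.OperationalError:
    bare_ok = False

# Wrapping each operand in its own subquery keeps the ORDER BY clauses
# and yields valid SQL.
wrapped = (
    "SELECT * FROM (SELECT a FROM t ORDER BY a) "
    "UNION ALL "
    "SELECT * FROM (SELECT a FROM t ORDER BY a)"
)
rows = con.execute(wrapped).fetchall()
```

The row order of the combined result is still up to the backend, since the ORDER BY clauses only apply inside the operands.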

Member Author

jcrist commented Feb 21, 2024

The tests are failing for two reasons:

  • Some snapshot tests need updating. I can update those if we think this fix is valid/useful.
  • mssql doesn't support ORDER BY clauses (lacking a limit or offset) in subqueries. I can mark these tests as failing for mssql.

A different but still kinda valid fix would be to strip ORDER BY clauses without limit/offset from subqueries in set operations, since the backend is free to ignore the ordering of these subqueries. I'm mainly trying to avoid these queries failing at execution time because we generated invalid SQL. (Never mind, I think we'd need the fix here even if we did this.)

Member

cpcloud commented Feb 22, 2024

This seems like a useful fix to me. How did you come across it? Ah, #8561.

@cpcloud cpcloud added this to the 9.0 milestone Feb 22, 2024
@cpcloud cpcloud added the bug Incorrect behavior inside of ibis label Feb 22, 2024
@jcrist jcrist force-pushed the set-ops-subqueries branch from 8d0e972 to 3736686 Compare March 6, 2024 20:21
@jcrist jcrist added the ci-run-cloud Run BigQuery, Snowflake, Databricks, and Athena backend tests label Mar 6, 2024
@ibis-docs-bot ibis-docs-bot bot removed the ci-run-cloud Run BigQuery, Snowflake, Databricks, and Athena backend tests label Mar 6, 2024
@jcrist jcrist force-pushed the set-ops-subqueries branch from 3736686 to f6a7e0d Compare March 6, 2024 21:22
@jcrist jcrist requested a review from cpcloud March 7, 2024 01:07
Member

@cpcloud cpcloud left a comment

SQL is so annoying sometimes.

@cpcloud cpcloud merged commit aab0c13 into ibis-project:main Mar 7, 2024
74 checks passed
gforsyth pushed a commit that referenced this pull request Mar 7, 2024
This PR gets benchmarks passing again. Nesting levels were increased by
#8414.

Unfortunately, SQLGlot's SQL generation algorithm (but not its parsing
algorithm) is recursive.

This means that it cannot handle large nesting levels of select
statements.

Pre-SQLGlot, Ibis used to handle much larger nesting levels. This
functionality was lost in the sqlglot refactor.

Ultimately, someone needs to address this upstream in sqlglot by
converting the generation algorithm to an iterative
one.

I believe we should gain a little bit back after #8572 is merged, since
fewer select statements will be generated.
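
The recursion problem described in the commit message above can be sketched with a toy model. This is not SQLGlot's code: the `Select` class and both render functions are hypothetical, and only show why a recursive generator fails on deep nesting where an iterative one does not.

```python
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass
class Select:
    # None means "select from the base table t"; otherwise a nested select.
    source: Optional["Select"]


def build(depth: int) -> Select:
    """Build a select nested `depth` levels deep."""
    node = Select(None)
    for _ in range(depth):
        node = Select(node)
    return node


def render_recursive(node: Select) -> str:
    # One Python stack frame per nesting level.
    if node.source is None:
        return "SELECT * FROM t"
    return f"SELECT * FROM ({render_recursive(node.source)})"


def render_iterative(node: Select) -> str:
    # Walk down with a loop, so stack usage is constant.
    depth = 0
    while node.source is not None:
        depth += 1
        node = node.source
    return "SELECT * FROM (" * depth + "SELECT * FROM t" + ")" * depth


# Deeper than the interpreter's recursion limit.
deep = build(sys.getrecursionlimit() + 100)
try:
    render_recursive(deep)
    recursive_ok = True
except RecursionError:
    recursive_ok = False

deep_sql = render_iterative(deep)
```

Both functions produce the same text for shallow queries. Only the recursive one is bounded by Python's recursion limit.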
Labels
bug Incorrect behavior inside of ibis
Development

Successfully merging this pull request may close these issues.

bug: Parser Error: syntax error at or near "UNION"
2 participants