-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-49558][SQL] Add SQL pipe syntax for LIMIT/OFFSET and ORDER/SORT/CLUSTER/DISTRIBUTE BY #48413
base: master
Are you sure you want to change the base?
Conversation
cc @cloud-fan @gengliangwang this is the PR to support LIMIT/OFFSET + sorting. There are a few more changes in the |
sql/core/src/test/resources/sql-tests/results/pipe-operators.sql.out
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
Outdated
Show resolved
Hide resolved
sql/core/src/test/resources/sql-tests/results/pipe-operators.sql.out
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @cloud-fan for your review!
sql/core/src/test/resources/sql-tests/results/pipe-operators.sql.out
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
Outdated
Show resolved
Hide resolved
sql/core/src/test/resources/sql-tests/results/pipe-operators.sql.out
Outdated
Show resolved
Hide resolved
throw QueryParsingErrors.combinationQueryResultClausesUnsupportedError(ctx) | ||
val allClauses = "ORDER BY/SORT BY/DISTRIBUTE BY/CLUSTER BY" | ||
if (forPipeOperators) { | ||
throw QueryParsingErrors.clausesWithPipeOperatorsUnsupportedError(ctx, allClauses) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should always call combinationQueryResultClausesUnsupportedError
here, as this is not related to pipe. We don't support this combination in both pipe and classic SQL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, this is done.
val limitClause = "LIMIT" | ||
if (forPipeOperators && clause.nonEmpty && clause != offsetClause) { | ||
throw QueryParsingErrors.clausesWithPipeOperatorsUnsupportedError( | ||
ctx, s"the $clause and $limitClause clauses") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ctx, s"the $clause and $limitClause clauses") | |
ctx, s"the co-existence of $clause and $limitClause clauses") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, this is done.
val withOffset = withWindow.optional(offset) { | ||
if (forPipeOperators && clause.nonEmpty) { | ||
throw QueryParsingErrors.clausesWithPipeOperatorsUnsupportedError( | ||
ctx, s"the $clause and $offsetClause clauses") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ctx, s"the $clause and $offsetClause clauses") | |
ctx, s"the co-existence of $clause and $offsetClause clauses") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, this is done.
@@ -4926,9 +4926,14 @@ | |||
"Catalog <catalogName> does not support <operation>." | |||
] | |||
}, | |||
"CLAUSE_WITH_PIPE_OPERATORS" : { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe @srielau can also comment here, but it looks to me that we have two new errors:
- WINDOW in the SQL pipe operator. We don't support window definition in SQL pipe yet.
- More than one QUERY_RESULT_CLAUSES in SQL pipe. SQL pipe is designed to specify one operator at a time, so we don't allow any combination, such as ORDER BY col LIMIT 1, or LIMIT 1 OFFSET 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- appears to be a 0A000 (unsupported feature)
- Isn't that a syntax error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, 2 is a syntax error for the pipe statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, split this into two new errors per suggestion.
respond to code review comments respond to code review comments respond to code review comments
5205e12
to
d55dce0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @cloud-fan for your reviews!
@@ -4926,9 +4926,14 @@ | |||
"Catalog <catalogName> does not support <operation>." | |||
] | |||
}, | |||
"CLAUSE_WITH_PIPE_OPERATORS" : { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, split this into two new errors per suggestion.
|
||
// OFFSET | ||
// - OFFSET 0 is the same as omitting the OFFSET clause | ||
val offsetClause = "OFFSET" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: shall we put all the constant clauses into an object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, it is good to keep magic strings out of the code.
@@ -5006,6 +5011,11 @@ | |||
"Multiple bucket TRANSFORMs." | |||
] | |||
}, | |||
"MULTIPLE_QUERY_RESULT_CLAUSES_WITH_PIPE_OPERATORS" : { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we turn it into a top-level error class? This is syntax error, not unsupported feature, and should have a different SQL state.
@@ -5006,6 +5011,11 @@ | |||
"Multiple bucket TRANSFORMs." | |||
] | |||
}, | |||
"MULTIPLE_QUERY_RESULT_CLAUSES_WITH_PIPE_OPERATORS" : { | |||
"message" : [ | |||
"Syntax error: the SQL pipe operator syntax using |> does not support <clauses>. Please separate the multiple result clauses into separate pipe operators and then retry the query again." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about
<clause1> and <clause2> can not coexist in SQL pipe operator using '|>'. Please separate ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is to make the error parameters simple, so that people can consume it programatically.
What changes were proposed in this pull request?
This PR adds SQL pipe syntax support for LIMIT/OFFSET and ORDER/SORT/CLUSTER/DISTRIBUTE BY.
For example:
Why are the changes needed?
The SQL pipe operator syntax will let users compose queries in a more flexible fashion.
Does this PR introduce any user-facing change?
Yes, see above.
How was this patch tested?
This PR adds a few unit test cases, but mostly relies on golden file test coverage. I did this to make sure the answers are correct as this feature is implemented and also so we can look at the analyzer output plans to ensure they look right as well.
Was this patch authored or co-authored using generative AI tooling?
No