[SPARK-45033][SQL] Support maps by parameterized sql()
#42752
Conversation
…ark-proto` uber jar to test the `connect` module" This reverts commit df63adf.
@cloud-fan Could you review this PR, please?
```
@@ -96,7 +96,11 @@ case class PosParameterizedQuery(child: LogicalPlan, args: Array[Expression])
 */
object BindParameters extends Rule[LogicalPlan] with QueryErrorsBase {
  private def checkArgs(args: Iterable[(String, Expression)]): Unit = {
    args.find(!_._2.isInstanceOf[Literal]).foreach { case (name, expr) =>
    def isNotAllowed(expr: Expression): Boolean = expr.exists {
      case _: Literal | _: CreateArray | _: MapFromArrays | _: CreateMap => false
```
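The intent of the new check can be sketched with a toy Python model (the class names mirror Spark's Catalyst expressions, but this is an illustrative sketch, not Spark code): only literals and the array/map constructor expressions are allowed anywhere in an argument's expression tree, and an `exists`-style traversal rejects anything else, however deeply it is nested.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Toy stand-ins for Catalyst expression nodes (hypothetical model classes).
@dataclass
class Expr:
    children: List["Expr"] = field(default_factory=list)

class Literal(Expr): pass
class CreateArray(Expr): pass
class CreateMap(Expr): pass
class MapFromArrays(Expr): pass
class Rand(Expr): pass  # a non-constant expression: not allowed as a parameter

def exists(expr: Expr, pred: Callable[[Expr], bool]) -> bool:
    """Test the node itself and all descendants, like TreeNode#exists."""
    return pred(expr) or any(exists(c, pred) for c in expr.children)

def is_not_allowed(expr: Expr) -> bool:
    # Literals and constructors are fine; any other node taints the whole tree.
    return exists(expr, lambda e: not isinstance(
        e, (Literal, CreateArray, CreateMap, MapFromArrays)))

# A mix of maps, arrays and literals is accepted...
ok = CreateMap([Literal(), CreateArray([Literal(), Literal()])])
# ...but a prohibited expression nested deep inside is rejected.
bad = CreateMap([Literal(), CreateArray([Rand()])])
print(is_not_allowed(ok), is_not_allowed(bad))  # False True
```

The deeply nested `Rand()` case corresponds to the negative test discussed below: the check must reject prohibited expressions at any depth, not only at the top level.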
We should recursively check the children of `CreateArray`, etc.
`_.exists` checks the children recursively, doesn't it? So we allow a mix of arrays, maps and literals.
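For reference, `TreeNode#exists` applies the predicate to the node itself and then recurses into its children, so wrapping the whitelist check in `exists` makes it recursive automatically. A minimal Python analogue (hypothetical `Node` class, not Spark's API):

```python
class Node:
    def __init__(self, value, *children):
        self.value = value
        self.children = list(children)

    def exists(self, pred) -> bool:
        # True if the predicate holds for this node or any descendant,
        # mirroring the recursive semantics of Catalyst's TreeNode#exists.
        return pred(self) or any(c.exists(pred) for c in self.children)

tree = Node("map", Node("array", Node("literal")), Node("literal"))
print(tree.exists(lambda n: n.value == "literal"))  # True: found two levels down
print(tree.exists(lambda n: n.value == "struct"))   # False: absent anywhere
```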
Let me add a negative test with a deeply nested prohibited expression.
Ah, I see! It's `TreeNode#exists`.
Can we add `CreateStruct`? It seems odd to leave only it out.
OK. How about supporting `MapFromEntries` as well? And maybe `CreateNamedStruct`?
`CreateNamedStruct` is the actual expression behind `CreateStruct`. `MapFromEntries` also looks fine to me.
Merging to master. Thank you, @cloud-fan and @HyukjinKwon, for the review.
…`sql()`

### What changes were proposed in this pull request?
In the PR, I propose to extend the proto protocol and add two more fields to `SqlCommand`: `arguments` and `pos_arguments`. This should allow constructing arrays/maps/structs via `map()`, `array()`, `struct()`, `map_from_arrays()` and `map_from_entries()`, and passing them to `sql()` as arguments.

### Why are the changes needed?
To fix the issue demonstrated by the example:
```scala
spark.sql("select element_at(?, 1)", Array(array(lit(1)))).collect()

java.lang.UnsupportedOperationException: literal unresolved_function {
[info]   function_name: "array"
[info]   arguments {
[info]     literal {
[info]       integer: 1
[info]     }
[info]   }
[info] }
[info] not supported (yet).
```
Note: The PR #42752 supported the constructors, and this PR extends the feature to the Scala connect client.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
By new checks:
```
$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *.ClientE2ETestSuite"
```

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #42931 from MaxGekk/fix-parameterized-sql-connect-map.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
…ed `sql()`

### What changes were proposed in this pull request?
In the PR, I propose to update some error formats and comments regarding `sql()` parameters: maps, arrays and structs may now be used as `sql()` parameters. The new behaviour was added by #42752.

### Why are the changes needed?
To inform users about recent changes introduced by SPARK-45033.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
By running the affected test suite:
```
$ build/sbt "core/testOnly *SparkThrowableSuite"
```

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #42957 from MaxGekk/clean-ClientE2ETestSuite.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
…`sql()`

### What changes were proposed in this pull request?
In the PR, I propose to add a few more examples for the `sql()` method in the PySpark API with array and map parameters.

### Why are the changes needed?
To inform users about recent changes introduced by #42752 and #42470, and to check that the changes actually work.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
By running the new examples:
```
$ python/run-tests --parallelism=1 --testnames 'pyspark.sql.session SparkSession.sql'
```

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #42996 from MaxGekk/map-sql-parameterized-python-connect.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
What changes were proposed in this pull request?
In the PR, I propose to allow a few more column expressions as parameters of `sql()` to construct maps/arrays: `CreateArray`, `MapFromArrays`, `CreateMap` and `MapFromEntries`. We need to allow such expressions because Spark SQL doesn't support constructing literals of the map type.
Why are the changes needed?
To improve the user experience with Spark SQL, and to support the remaining built-in types.
Does this PR introduce any user-facing change?
No. It extends the existing API.
How was this patch tested?
By running new tests:
Was this patch authored or co-authored using generative AI tooling?
No.