[SPARK-43922] Add named parameter support in parser for function calls #41429

Closed
wants to merge 39 commits into from

Conversation

Contributor

@learningchess2003 learningchess2003 commented Jun 1, 2023

What changes were proposed in this pull request?

We add two new grammar rules, namedArgumentExpression and functionArgument, to enable this feature. We also change AstBuilder so that, while parsing, it can detect whether each argument passed is a named argument or a positional one.

Here is the link for the design document:
https://docs.google.com/document/d/1uOTX0MICxqu8fNanIsiyB8FV68CceGGpa8BJLP2u9o4/edit

Why are the changes needed?

This is part of a larger project to implement named parameter support for user defined functions, built-in functions, and table valued functions.

Does this PR introduce any user-facing change?

Yes, the user would be able to call functions with argument lists that contain named arguments.

How was this patch tested?

We add tests in the PlanParserSuite that will verify that the plan parsed is as intended.

@github-actions github-actions bot added the SQL label Jun 1, 2023
Contributor

@anchovYu anchovYu left a comment

Thanks for implementing it! Apart from the comments Daniel has posted, I have one nit.

if (e.namedArgumentExpression != null) {
  val key = e.namedArgumentExpression.key.strictIdentifier
  val value = e.namedArgumentExpression.value
  NamedArgumentExpression(key.getText, expression(value))
Contributor

+1

Contributor

@dtenedor dtenedor left a comment

The implementation generally LGTM; I tried to come up with some testing ideas.


override def dataType: DataType = value.dataType

override def toString: String = s"""$key => $value"""
Contributor

Suggested change
- override def toString: String = s"""$key => $value"""
+ override def toString: String = s"$key => $value"
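
The suggested change does not alter behavior: the string contains no quotes, newlines, or escape sequences, so the triple-quoted and single-quoted interpolators yield the same text. A minimal sketch of that equivalence, using a hypothetical stand-in class (NamedArg here is an assumption for illustration, not the Catalyst class from this PR):

```scala
// Hypothetical stand-in for illustration; not the Catalyst class from the PR.
object InterpolationSketch {
  final case class NamedArg(key: String, value: String) {
    // Triple-quoted interpolation, as originally written.
    def tripleQuoted: String = s"""$key => $value"""
    // Single-quoted interpolation, as suggested in the review.
    def singleQuoted: String = s"$key => $value"
  }
}
```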

@learningchess2003
Contributor Author

@dtenedor Thanks for the comments! I've added the tests.

@@ -921,7 +930,7 @@ primaryExpression
| LEFT_PAREN namedExpression (COMMA namedExpression)+ RIGHT_PAREN #rowConstructor
| LEFT_PAREN query RIGHT_PAREN #subqueryExpression
| IDENTIFIER_KW LEFT_PAREN expression RIGHT_PAREN #identifierClause
- | functionName LEFT_PAREN (setQuantifier? argument+=expression (COMMA argument+=expression)*)? RIGHT_PAREN
+ | functionName LEFT_PAREN (setQuantifier? argument+=functionArgument (COMMA argument+=functionArgument)*)? RIGHT_PAREN
Contributor

Can you fix the formatting of these three lines so that they break before the 94th column, where the labels above (e.g. #identifierClause) start? This could make the parser logic easier to read.

@@ -1544,8 +1544,17 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
if (name.length > 1) {
throw QueryParsingErrors.invalidTableValuedFunctionNameError(name, ctx)
}
val args = func.functionArgument.asScala.map { e =>
if (e.namedArgumentExpression != null) {
Contributor

You can simplify using Option and map:

Option(e.namedArgumentExpression).map { n =>
  NamedArgumentExpression(n.key.getText, expression(n.value))
}.getOrElse {
  expression(e)
}

Same on L2188 below.
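
The Option/map pattern suggested above can be tried in isolation. The sketch below assumes simplified stand-ins for the ANTLR-generated contexts (NamedArgCtx, FunctionArgCtx, and the Arg hierarchy are all hypothetical names, not Spark classes); it only shows how wrapping a possibly-null field in Option replaces the explicit null check.

```scala
// Hypothetical stand-ins for the ANTLR-generated contexts; all names are assumptions.
object OptionSketch {
  final case class NamedArgCtx(key: String, value: String)

  // namedArgumentExpression is null for positional arguments, mirroring how
  // a parser context leaves an unmatched alternative unset.
  final case class FunctionArgCtx(namedArgumentExpression: NamedArgCtx, text: String)

  sealed trait Arg
  final case class Named(key: String, value: String) extends Arg
  final case class Positional(value: String) extends Arg

  def toArg(e: FunctionArgCtx): Arg =
    // Option(null) is None, so no explicit null comparison is needed.
    Option(e.namedArgumentExpression)
      .map[Arg](n => Named(n.key, n.value))
      .getOrElse(Positional(e.text))
}
```

For example, OptionSketch.toArg(OptionSketch.FunctionArgCtx(null, "1")) falls through to the positional branch.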

if (e.namedArgumentExpression != null) {
val key = e.namedArgumentExpression.key.getText
val value = e.namedArgumentExpression.value
NamedArgumentExpression(key, expression(value))
Contributor

Can you control this behavior with a SQLConf? You can create a new entry like ALLOW_NAMED_FUNCTION_ARGUMENTS and return an error here if the config is not set to true:

https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
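
As a rough illustration of gating the feature behind a flag, the sketch below uses a plain Map-backed Conf class rather than Spark's SQLConf. The Conf class, the default of false, and the error wording are all assumptions made for this example; only the key string spark.sql.allowNamedFunctionArguments comes from this thread.

```scala
// Hypothetical, simplified stand-in for SQLConf; not Spark's actual API.
object ConfGateSketch {
  // Key string as mentioned elsewhere in this conversation.
  val AllowNamedFunctionArguments = "spark.sql.allowNamedFunctionArguments"

  final class Conf(settings: Map[String, String]) {
    // A default of false is assumed for this sketch only.
    def allowNamedFunctionArguments: Boolean =
      settings.get(AllowNamedFunctionArguments).exists(_.toBoolean)
  }

  // Returns the rendered named argument, or an error message if the flag is off.
  def buildNamedArg(conf: Conf, key: String, value: String): Either[String, String] =
    if (conf.allowNamedFunctionArguments) Right(s"$key => $value")
    else Left(s"Named argument '$key' used but $AllowNamedFunctionArguments is not true")
}
```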

learningchess2003 and others added 3 commits June 12, 2023 16:14
…essions/NamedArgumentExpression.scala

Co-authored-by: Daniel Tenedorio <daniel.tenedorio@databricks.com>
…essions/NamedArgumentExpression.scala

Co-authored-by: Daniel Tenedorio <daniel.tenedorio@databricks.com>
@learningchess2003
Contributor Author

@dtenedor The configuration has been added. The other comments have been addressed.

NamedArgumentExpression(n.key.getText, expression(n.value))
} else {
throw new ParseException(
errorClass = "Named arguments not enabled.",
Contributor

This is not quite how the error classes work :) instead you need to:

  1. Add a new entry in error-classes.json [1]
  2. Add a corresponding function in QueryCompilationErrors.scala [2]
  3. Call that function here like throw QueryCompilationErrors.namedArgumentsNotEnabled(n.key.getText)
  4. Exercise this behavior in a unit test to make sure we really get this error message if the configuration is not enabled.

[1] https://github.com/apache/spark/blob/master/core/src/main/resources/error/error-classes.json#L1326

[2] https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
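
The general shape of the steps above can be sketched without Spark. Everything in the block below is a simplified, hypothetical model: the registry map stands in for error-classes.json, SketchException stands in for Spark's exception types, and the error class name and message template are invented for illustration.

```scala
// Hypothetical, simplified model of the error-class pattern; not Spark's real classes.
object ErrorClassSketch {
  // Step 1: a registry of error classes and message templates, standing in
  // for an entry in error-classes.json. The name and wording are invented.
  val errorClasses: Map[String, String] = Map(
    "NAMED_ARGUMENTS_NOT_ENABLED" ->
      "Cannot use named argument <argName>; the feature is disabled."
  )

  final case class SketchException(errorClass: String, message: String)
    extends Exception(message)

  // Step 2: one helper per error, standing in for a function in
  // QueryCompilationErrors.scala.
  def namedArgumentsNotEnabled(argName: String): SketchException = {
    val errorClass = "NAMED_ARGUMENTS_NOT_ENABLED"
    val message = errorClasses(errorClass).replace("<argName>", argName)
    SketchException(errorClass, message)
  }

  // Step 3: the call site throws the helper's result when the feature is off.
  def checkEnabled(enabled: Boolean, argName: String): Unit =
    if (!enabled) throw namedArgumentsNotEnabled(argName)
}
```

Step 4 then amounts to a test that calls checkEnabled(false, ...) and asserts on the error class of the thrown exception rather than on a free-form message string.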

@github-actions github-actions bot added the CORE label Jun 13, 2023
@@ -862,6 +862,15 @@ expression
: booleanExpression
;

namedArgumentExpression
: key=identifier FAT_ARROW value=expression
Member

An identifier can be either quoted or unquoted.
Shall we restrict it to IDENTIFIER (unquoted) only?

Contributor Author

@gengliangwang After looking at the design doc, would it be fine to keep it like this?

Co-authored-by: Maxim Gekk <max.gekk@gmail.com>
@learningchess2003
Contributor Author

@MaxGekk Sorry about the confusion. I realized your suggestion for quoting the spark.sql.allowNamedFunctionArguments config was accurate. That was my mistake. I committed your suggestion. Let me know what you think!

*/
case class NamedArgumentExpression(key: String, value: Expression)
extends UnaryExpression with Unevaluable {
override def nullable: Boolean = value.nullable
Contributor

nit: we can remove this line as UnaryExpression already provides an implementation of nullable
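
To illustrate why the override is redundant, here is a minimal sketch with simplified traits loosely modelled on Catalyst. Expr, UnaryExpr, Literal, and NamedArg are hypothetical names for this example, and the real class hierarchy has many more members.

```scala
// Hypothetical, simplified traits loosely modelled on Catalyst; not the real API.
object NullableSketch {
  trait Expr { def nullable: Boolean }

  // The unary base trait derives nullability from its single child.
  trait UnaryExpr extends Expr {
    def child: Expr
    override def nullable: Boolean = child.nullable
  }

  final case class Literal(value: Any) extends Expr {
    override def nullable: Boolean = value == null
  }

  // No explicit nullable override is needed; it is inherited from UnaryExpr.
  final case class NamedArg(key: String, value: Expr) extends UnaryExpr {
    override def child: Expr = value
  }
}
```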

@@ -1527,6 +1527,18 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
}
}

private def extractExpression(expr: FunctionArgumentContext, funcName: String) : Expression = {
Contributor

@cloud-fan cloud-fan Jun 27, 2023

how about extractFuncArgument?

@@ -875,6 +875,18 @@ class QueryExecutionErrorsSuite
sqlState = "XX000")
}

test("INTERNAL_ERROR: Calling eval on Unevaluable NamedArgumentExpression") {
Contributor

I don't think we need to test it. Unevaluable is not added by this PR and has been there for a long time.

@learningchess2003
Contributor Author

@cloud-fan Thanks for the review! Addressed the comments.

Member

@MaxGekk MaxGekk left a comment

@learningchess2003 Are there any end-to-end tests? If so, could you point them out? If not, could you add some to sql/core/src/test/resources/sql-tests/inputs?

@learningchess2003
Contributor Author

@MaxGekk Sounds good. I've added the end-to-end tests. In the next PR after this, most of the error messages will be replaced with the intended results. Right now, nothing is really supported yet.

@github-actions github-actions bot removed the CORE label Jun 29, 2023
@github-actions github-actions bot added the CORE label Jun 29, 2023
@learningchess2003
Contributor Author

I messed up my Git structure with the fork, so I'm migrating to another PR, #41796. @MaxGekk, just to let you know.

6 participants