Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-48529][SQL] Introduction of Labels in SQL Scripting #47146

Closed
wants to merge 32 commits into from

Conversation

miland-db
Copy link
Contributor

What changes were proposed in this pull request?

Previous PR1 and PR2 introduced parser and interpreter changes for SQL Scripting. This PR is a follow-up to introduce the concept of labels for SQL Scripting language and proposes the following changes:

  • Changes grammar to support labels at start and end of the compound statements.
  • Updates visitor functions for compound nodes in the syntax tree in AstBuilder to check if labels are present and valid.

More details can be found in Jira item for this task and its parent (where the design doc is uploaded as well).

Why are the changes needed?

The intent is to add support for various SQL scripting concepts like loops, leave & iterate statements.

Does this PR introduce any user-facing change?

No.
This PR is among first PRs in series of PRs that will introduce changes to sql() API to add support for SQL scripting, but for now, the API remains unchanged.
In the future, the API will remain the same as well, but it will have new possibility to execute SQL scripts.

How was this patch tested?

There are tests for newly introduced parser changes:

SqlScriptingParserSuite - unit tests for execution nodes.

Was this patch authored or co-authored using generative AI tooling?

No.

@miland-db miland-db changed the title [WIP][SPARK-48529][SQL] Introduction of Labels in SQL Scripting [SPARK-48529][SQL] Introduction of Labels in SQL Scripting Jul 9, 2024
# Conflicts:
#	sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala
#	sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingInterpreter.scala
#	sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNodeSuite.scala
#	sql/core/src/test/scala/org/apache/spark/sql/scripting/SqlScriptingInterpreterSuite.scala
@github-actions github-actions bot removed the CONNECT label Jul 9, 2024
@miland-db miland-db requested a review from cloud-fan July 9, 2024 21:35
case class CompoundBody(collection: Seq[CompoundPlanStatement]) extends CompoundPlanStatement
case class CompoundBody(
collection: Seq[CompoundPlanStatement],
label: Option[String]) extends CompoundPlanStatement
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when can it be None? Seems we always specify an UUID if label is not given.

Copy link
Contributor Author

@miland-db miland-db Jul 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this comment I gave 2 examples when it can be None. We can also add UUID there, but I don't think it makes sense to do it because surrounding loop will have label which is logically correct, and if user adds BEGIN and END block then the label will be appointed to that block automatically because it will pass through visitBeginEndCompoundBlock(ctx: BeginEndCompoundBlockContext).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add a parameter doc for it in the classdoc above? We need to explain when Spark should generate the label if not specified by users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@miland-db
Copy link
Contributor Author

In the follow up PR that is introducing some kind of scope for variables and error handlers etc. we will add check for unique label names in the compound.

@miland-db miland-db requested a review from cloud-fan July 10, 2024 19:59
@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 2e3a6fb Jul 11, 2024
biruktesf-db pushed a commit to biruktesf-db/spark that referenced this pull request Jul 11, 2024
### What changes were proposed in this pull request?
Previous [PR1](apache#46665) and [PR2](apache#46665) introduced parser and interpreter changes for SQL Scripting. This PR is a follow-up to introduce the concept of labels for SQL Scripting language and proposes the following changes:

- Changes grammar to support labels at start and end of the compound statements.
- Updates visitor functions for compound nodes in the syntax tree in AstBuilder to check if labels are present and valid.

More details can be found in [Jira item](https://issues.apache.org/jira/browse/SPARK-48529) for this task and its parent (where the design doc is uploaded as well).

### Why are the changes needed?
The intent is to add support for various SQL scripting concepts like loops, leave & iterate statements.

### Does this PR introduce any user-facing change?
No.
This PR is among first PRs in series of PRs that will introduce changes to sql() API to add support for SQL scripting, but for now, the API remains unchanged.
In the future, the API will remain the same as well, but it will have new possibility to execute SQL scripts.

### How was this patch tested?
There are tests for newly introduced parser changes:

SqlScriptingParserSuite - unit tests for execution nodes.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#47146 from miland-db/sql_batch_labels.

Lead-authored-by: David Milicevic <david.milicevic@databricks.com>
Co-authored-by: Milan Dankovic <milan.dankovic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
jingz-db pushed a commit to jingz-db/spark that referenced this pull request Jul 22, 2024
### What changes were proposed in this pull request?
Previous [PR1](apache#46665) and [PR2](apache#46665) introduced parser and interpreter changes for SQL Scripting. This PR is a follow-up to introduce the concept of labels for SQL Scripting language and proposes the following changes:

- Changes grammar to support labels at start and end of the compound statements.
- Updates visitor functions for compound nodes in the syntax tree in AstBuilder to check if labels are present and valid.

More details can be found in [Jira item](https://issues.apache.org/jira/browse/SPARK-48529) for this task and its parent (where the design doc is uploaded as well).

### Why are the changes needed?
The intent is to add support for various SQL scripting concepts like loops, leave & iterate statements.

### Does this PR introduce any user-facing change?
No.
This PR is among first PRs in series of PRs that will introduce changes to sql() API to add support for SQL scripting, but for now, the API remains unchanged.
In the future, the API will remain the same as well, but it will have new possibility to execute SQL scripts.

### How was this patch tested?
There are tests for newly introduced parser changes:

SqlScriptingParserSuite - unit tests for execution nodes.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#47146 from miland-db/sql_batch_labels.

Lead-authored-by: David Milicevic <david.milicevic@databricks.com>
Co-authored-by: Milan Dankovic <milan.dankovic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
attilapiros pushed a commit to attilapiros/spark that referenced this pull request Oct 4, 2024
### What changes were proposed in this pull request?
Previous [PR1](apache#46665) and [PR2](apache#46665) introduced parser and interpreter changes for SQL Scripting. This PR is a follow-up to introduce the concept of labels for SQL Scripting language and proposes the following changes:

- Changes grammar to support labels at start and end of the compound statements.
- Updates visitor functions for compound nodes in the syntax tree in AstBuilder to check if labels are present and valid.

More details can be found in [Jira item](https://issues.apache.org/jira/browse/SPARK-48529) for this task and its parent (where the design doc is uploaded as well).

### Why are the changes needed?
The intent is to add support for various SQL scripting concepts like loops, leave & iterate statements.

### Does this PR introduce any user-facing change?
No.
This PR is among first PRs in series of PRs that will introduce changes to sql() API to add support for SQL scripting, but for now, the API remains unchanged.
In the future, the API will remain the same as well, but it will have new possibility to execute SQL scripts.

### How was this patch tested?
There are tests for newly introduced parser changes:

SqlScriptingParserSuite - unit tests for execution nodes.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#47146 from miland-db/sql_batch_labels.

Lead-authored-by: David Milicevic <david.milicevic@databricks.com>
Co-authored-by: Milan Dankovic <milan.dankovic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants