
[SPARK-43885][SQL] DataSource V2: Handle MERGE commands for delta-based sources #41448

Closed · wants to merge 2 commits into master from aokolnychyi:spark-43885

Conversation

aokolnychyi (Contributor):

What changes were proposed in this pull request?

This PR adds RewriteMergeIntoTable, similar to RewriteUpdateTable and RewriteDeleteFromTable, to handle MERGE commands for delta-based sources. Support for group-based sources will come in a follow-up PR.

Implementation notes:

  • RewriteMergeIntoTable is an analyzer rule that acts similarly to the existing rules for deletes and updates.
  • MergeRows is a new logical node that holds a set of instructions to apply to the joined target and source datasets (see the sketch after this list).
    • Instruction is a parent trait for all instructions.
    • Keep means a joined row is part of the result of this MERGE operation.
    • Split means a joined row is part of the result but must be split into two rows (update into delete and insert).
  • MergeRowsExec is a new physical node that is responsible for merging rows and producing a delta of rows. It also performs the MERGE cardinality check if needed.
  • NO_BROADCAST_AND_REPLICATION is a new internal join hint to prohibit broadcasting and replicating the target table to perform the cardinality check in MERGE operations, as required by the SQL standard.
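A minimal sketch of the instruction model above, assuming simplified signatures; the actual classes in the PR may differ:

```scala
import org.apache.spark.sql.catalyst.expressions.Expression

// Parent trait for all MERGE instructions; `condition` is evaluated against
// each joined row to decide whether the instruction applies.
sealed trait Instruction {
  def condition: Expression
}

// Keep: the joined row contributes exactly one row to the MERGE result.
case class Keep(condition: Expression, output: Seq[Expression]) extends Instruction

// Split: an update is emitted as two rows, a delete followed by an insert.
case class Split(
    condition: Expression,
    deleteOutput: Seq[Expression],
    insertOutput: Seq[Expression]) extends Instruction
```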

Why are the changes needed?

These changes are needed per SPIP SPARK-35801.

Does this PR introduce any user-facing change?

This PR adds a new SQL config to enable/disable the MERGE cardinality check required by the SQL standard.

How was this patch tested?

This PR comes with tests. There are more tests in AlignMergeAssignmentsSuite, which was merged earlier.

"The ON search condition of the MERGE statement matched a single row from the target table with multiple rows of the source table.",
"This could result in the target row being operated on more than once with an update or delete operation and is not allowed."
],
"sqlState" : "23000"
aokolnychyi (Contributor Author) commented on Jun 4, 2023:

I used class 23 as it is a constraint violation, but I wasn't sure about the subclass. It is not defined in the SQL standard, so I used 000, meaning no subclass. I am not sure how Spark assigns subclasses in these cases.

Here is an example of this error in SAP docs:
https://dcx.sap.com/sqla170/en/html/80ca9fd06ce21014bc30ac05c444ee4d.html

Here is the original JIRA for this error in Hive:
https://issues.apache.org/jira/browse/HIVE-14949

Member:

Yeah, it sounds tricky. As you know, we currently use 23505 for two error cases, DUPLICATED_MAP_KEY and DUPLICATE_KEY. Just a question: if there is no other reference, do you want to use 23509 like SAP instead?

aokolnychyi (Contributor Author) commented on Jun 7, 2023:

If I remember correctly, the SQL standard reserves only a few subclasses and all 5XX subclasses are custom so we are probably free to pick either one. In that case, let's use 23509 like in SAP. I'll also check more systems.

Member:

Sorry, I missed that we have README.md, where 23509 is already listed as a different error from DB2:

|23509 |23 |Constraint Violation |509 |The owner of the package has constrained its use to environments which do not include that of the application process.|DB2 |N |DB2 |

dongjoon-hyun (Member) commented on Jun 8, 2023:

Since we cannot choose 23509 because of the conflict between DB2 and SAP, let's add a new one as Spark's own error code. According to the above README.md, we can claim the 'K**' subclass range.

So, let's use 23K01 here and add a new line to README.md like the following.

|23K01 |23 |Merge Cardinality Violation |K01 |your full description   |Spark   |N   |Spark     

aokolnychyi (Contributor Author) commented on Jun 8, 2023:

That's my thinking as well. Let's use 23K01. I did not find a SQLSTATE we can borrow from other systems.
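For reference, the README.md row registering the new state would look like the following (the description column is assumed to mirror the error message):

|23K01 |23 |Constraint Violation |K01 |The ON search condition of the MERGE statement matched a single row from the target table with multiple rows of the source table.|Spark |N |Spark |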

```scala
private final val ROW_FROM_TARGET = "__row_from_target"

override def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
  case m @ MergeIntoTable(aliasedTable, source, cond, matchedActions, notMatchedActions,
```
aokolnychyi (Contributor Author):

This is a special case where there is only one NOT MATCHED action; see the sketch below.
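A hedged sketch of the rewrite for this case, assuming simplified shapes; `toOutputExprs` and the exact plan wiring are hypothetical:

```scala
import org.apache.spark.sql.catalyst.plans.LeftAnti
import org.apache.spark.sql.catalyst.plans.logical.{Join, JoinHint, Project}

// Rows that exist only in the source are selected with a left anti join and
// projected through the single INSERT action, so no MergeRows node is needed.
val joinPlan = Join(source, aliasedTable, LeftAnti, Some(cond), JoinHint.NONE)
val insertRows = Project(toOutputExprs(insertAction.assignments), joinPlan) // toOutputExprs is hypothetical
// insertRows is then appended to the target table
```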

```scala
  m
}
```

```scala
case m @ MergeIntoTable(aliasedTable, source, cond, matchedActions, notMatchedActions,
```
aokolnychyi (Contributor Author) commented on Jun 4, 2023:

This is a special case where there are only NOT MATCHED actions (having just one such action is handled above).

```scala
}

// build a rewrite plan for sources that support row deltas
private def buildWriteDeltaPlan(
```
aokolnychyi (Contributor Author):

Similar to buildWriteDeltaPlan in RewriteUpdateTable.

```scala
  (readRelation, cond)
}

val checkCardinality = shouldCheckCardinality(matchedActions)
```
aokolnychyi (Contributor Author):

More details on the cardinality check are in MergeRowsExec; a sketch follows.
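A hedged sketch of the check, under the assumption that the join keeps rows for the same target row consecutive (which is what the NO_BROADCAST_AND_REPLICATION hint helps guarantee); the real MergeRowsExec code differs:

```scala
import org.apache.spark.sql.catalyst.InternalRow

// Seeing the same target row ID twice in a row means the ON condition matched
// one target row against multiple source rows, violating the SQL standard.
private var lastMatchedRowId: Long = Long.MinValue

private def checkCardinality(row: InternalRow, rowIdOrdinal: Int): Unit = {
  val currentRowId = row.getLong(rowIdOrdinal)
  if (currentRowId == lastMatchedRowId) {
    throw new IllegalStateException(
      "The ON search condition of the MERGE statement matched a single row " +
        "from the target table with multiple rows of the source table.")
  }
  lastMatchedRowId = currentRowId
}
```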

```scala
@@ -167,4 +183,36 @@ trait RewriteRowLevelCommand extends Rule[LogicalPlan] {
  private def findColOrdinal(plan: LogicalPlan, name: String): Int = {
    plan.output.indexWhere(attr => conf.resolver(attr.name, name))
  }

  protected def buildOriginalRowIdValues(
```
aokolnychyi (Contributor Author):

Copied from RewriteUpdateTable to reuse in both rules.
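A hedged sketch of what this helper is for, with an assumed signature and alias prefix (not the exact Spark code):

```scala
import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, NamedExpression}

// Preserve the pre-update row ID values under stable aliases so the delta
// writer can locate the target rows that an UPDATE or DELETE refers to.
def buildOriginalRowIdValues(rowIdAttrs: Seq[Attribute]): Seq[NamedExpression] = {
  rowIdAttrs.map { attr =>
    Alias(attr, s"__original_${attr.name}")() // alias prefix is hypothetical
  }
}
```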

```scala
    instructions: Seq[InstructionExec]): InternalRow = {

  for (instruction <- instructions) {
    if (instruction.condition.eval(row)) {
```
aokolnychyi (Contributor Author) commented on Jun 4, 2023:

Not using `find` to avoid extra calls, since this runs for every row; using a simple for loop instead. Not using `Option` for the same reason. A sketch of the pattern follows.
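A hedged completion of the fragment above (the method name and the null convention are assumptions): a plain loop avoids the Option and closure allocations that `find` would incur per row.

```scala
import org.apache.spark.sql.catalyst.InternalRow

private def applyInstructions(
    row: InternalRow,
    instructions: Seq[InstructionExec]): InternalRow = {
  // evaluate instructions in order; the first matching condition wins
  for (instruction <- instructions) {
    if (instruction.condition.eval(row)) {
      return instruction.apply(row) // hypothetical projection method
    }
  }
  null // no instruction matched; the caller drops such rows
}
```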

Member:

Regarding https://github.com/databricks/scala-style-guide#traversal-and-zipwithindex: `while` is preferred there, but I'm not sure whether that is still suitable for the current Scala compiler.

aokolnychyi (Contributor Author) commented on Jun 9, 2023:

Let me dig into the generated bytecode while adding a benchmark as part of SPARK-44013.
I'll cc you on that PR.

Member:

OK, thanks, @aokolnychyi

dongjoon-hyun (Member) left a comment:

```scala
@@ -92,6 +93,67 @@ class JoinSuite extends QueryTest with SharedSparkSession with AdaptiveSparkPlan
    operators.head
  }

  test("NO_BROADCAST_AND_REPLICATION hint is respected in cross joins") {
```
aokolnychyi (Contributor Author):

These three tests cover scenarios where it is not safe to broadcast or replicate the target table while performing the cardinality check. The newly added internal hint handles this; see the sketch below. There are MERGE tests for this as well.
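A hedged sketch of how the internal hint is attached (assumed construction; the hint has no SQL or DataFrame alias, so it can only be set programmatically on a logical Join):

```scala
import org.apache.spark.sql.catalyst.plans.Cross
import org.apache.spark.sql.catalyst.plans.logical.{HintInfo, Join, JoinHint}

// Prohibit broadcasting/replicating the left (target) side of the join.
val noBroadcast = Some(HintInfo(strategy = Some(NO_BROADCAST_AND_REPLICATION)))
val join = Join(target, source, Cross, None, JoinHint(leftHint = noBroadcast, rightHint = None))
// The planner must then neither broadcast the target side nor replicate it
// when it falls back to a cartesian product.
```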

aokolnychyi (Contributor Author):

The test failures don't seem related. I'll need to take a closer look at what happened in the `sql - other tests` job, though.

@aokolnychyi aokolnychyi force-pushed the spark-43885 branch 2 times, most recently from 51b23cc to 1a2211d Compare June 5, 2023 23:53
aokolnychyi (Contributor Author):

The test job seems to just hang; I've seen the same in other PRs. No relevant test failures.

```
2023-06-06T08:21:31.5118813Z [info] Test run test.org.apache.spark.sql.JavaRowSuite finished: 0 failed, 0 ignored, 2 total, 0.002s
2023-06-06T12:05:36.0713172Z Session terminated, killing shell... ...killed.
2023-06-06T12:05:36.1056661Z ##[error]The operation was canceled.
```

This change is ready for a detailed review round, @dongjoon-hyun @viirya @huaxingao @cloud-fan @sunchao.

```scala
@@ -4280,6 +4280,20 @@ object SQLConf {
      .checkValue(_ >= 0, "The threshold of cached local relations must not be negative")
      .createWithDefault(64 * 1024 * 1024)

  val MERGE_CARDINALITY_CHECK_ENABLED =
    buildConf("spark.sql.merge.cardinalityCheck.enabled")
```
Member:

In general, if there are no other confs in this namespace, spark.sql.merge.*, we had better avoid introducing a new namespace, i.e. prefer the second of the following:

```scala
buildConf("spark.sql.merge.cardinalityCheck.enabled")
buildConf("spark.sql.mergeCardinalityCheck.enabled")
```

"operations to ensure the integrity and accuracy of data. The ON search condition in " +
"MERGE must match a single row from the target table with at most one incoming row. " +
"If this assumption is violated, Spark must throw an exception. Otherwise, this may " +
"lead to data corruption as Spark could operate on the same target row more than once. " +
Member:

If this is a correctness issue, I'd not add this config.

"lead to data corruption as Spark could operate on the same target row more than once. " +
"The cardinality check can be disabled to avoid the computational overhead, " +
"but doing so is highly discouraged and can corrupt the underlying table. In most cases, " +
"the overhead should be negligible.")
Member:

Given this analysis, could you remove this config from this PR (if you don't mind)?

aokolnychyi (Contributor Author):

I was following what Hive did, as it offers the hive.merge.cardinality.check flag. The way we do the cardinality check in Spark should be cheaper than in Hive, so it is less critical. Snowflake also seems to have a similar parameter that can be disabled.

That said, I am OK with removing it for now to be safe and considering adding it in the future, if necessary. I'll make the change.

```scala
/**
 * An internal hint to prohibit broadcasting and replicating one side of a join. It is used
 * by some rules where broadcasting or replicating a particular side of the join is not permitted,
 * such as the MERGE cardinality check.
 */
case object NO_BROADCAST_AND_REPLICATION extends JoinStrategyHint {
```
Member:

Shall we spin off NO_BROADCAST_AND_REPLICATION as an independent PR? The feature itself looks standalone, although this PR uses it.

aokolnychyi (Contributor Author):

Let me do that. I initially included it here to show how it is used. Will create a separate PR in a bit.

aokolnychyi (Contributor Author):

Created PR #41499.

dongjoon-hyun (Member) left a comment:

I finished my first round of review. Mostly, it looks reasonable. Here are my two main comments:

  • It would be better to avoid the correctness issue by not providing the new configuration.
  • It would be great if we can spin off the new HINT PR from this one.

dongjoon-hyun pushed a commit that referenced this pull request Jun 8, 2023
…ne side of join

### What changes were proposed in this pull request?

This PR adds a new internal join hint to disable broadcasting and replicating one side of join.

### Why are the changes needed?

These changes are needed to disable broadcasting and replicating one side of join when it is not permitted, such as the cardinality check in MERGE operations in PR #41448.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

This PR comes with tests. More tests are in #41448.

Closes #41499 from aokolnychyi/spark-44000.

Authored-by: aokolnychyi <aokolnychyi@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun (Member):

Hi, @aokolnychyi. #41499 is merged. Could you rebase this PR onto the master branch?

"The ON search condition of the MERGE statement matched a single row from the target table with multiple rows of the source table.",
"This could result in the target row being operated on more than once with an update or delete operation and is not allowed."
],
"sqlState" : "23K01"
Member:

Thank you for the updates. We can update README.md later. (#41448 (comment))

aokolnychyi (Contributor Author):

Done.

dongjoon-hyun (Member) left a comment:

+1, LGTM from my side.

dongjoon-hyun (Member):

Also, cc @cloud-fan, @viirya, @huaxingao, @sunchao once more.

dongjoon-hyun (Member) commented on Jun 8, 2023:

Could you fix this test failure by adding the new error code to README.md, @aokolnychyi?

```
[info] SparkThrowableSuite:
[info] - No duplicate error classes (30 milliseconds)
[info] - Error classes are correctly formatted (29 milliseconds)
[info] - SQLSTATE invariants *** FAILED *** (25 milliseconds)
```

The github-actions bot added the DOCS label on Jun 9, 2023.
aokolnychyi (Contributor Author):

Fixed and tested SparkThrowableSuite locally.

aokolnychyi (Contributor Author):

Also created SPARK-44013 to add a benchmark. It will be used to measure the impact of adding codegen later.

aokolnychyi (Contributor Author):

Tests failed because of the `sql - other tests` job:

```
2023-06-09T08:18:09.3443155Z Session terminated, killing shell...
2023-06-09T08:18:09.5476503Z ##[error]The operation was canceled.
```

Triggered again.

dongjoon-hyun (Member) left a comment:

+1, LGTM again with the updated README.md. In the master branch, we are trying to stabilize the `sql - others` pipeline (933dfd9); it's still flaky.

Thank you for triggering it multiple times and for your patience. I'll merge this for Apache Spark 3.5.0.

aokolnychyi (Contributor Author):

Thanks for reviewing, @dongjoon-hyun @pan3793!

dongjoon-hyun pushed a commit that referenced this pull request Jun 14, 2023
…ed sources

### What changes were proposed in this pull request?

This PR adds support for group-based data sources in `RewriteMergeIntoTable`. This PR builds on top of PR #41448 and earlier PRs that added `RewriteDeleteFromTable`.

### Why are the changes needed?

These changes are needed per SPIP SPARK-35801.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

This PR comes with tests. There are more tests in `AlignMergeAssignmentsSuite`, which was merged earlier.

Closes #41577 from aokolnychyi/spark-43963.

Authored-by: aokolnychyi <aokolnychyi@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
```scala
import org.apache.spark.sql.catalyst.util.truncatedString
import org.apache.spark.sql.types.DataType

case class MergeRows(
```
cloud-fan (Contributor):

Can we add a classdoc to explain what this new logical plan does? The name MergeRows seems a bit confusing as it does not merge anything. Looking at the physical implementation, it turns one input row into one or two output rows if a condition is met.

aokolnychyi (Contributor Author) commented on Jun 22, 2023:

Sure, let me add it. I called it MergeRows because it takes a plan where the target and source rows are joined and then applies MATCHED / NOT MATCHED / NOT MATCHED BY SOURCE instructions to derive an output row. I call that merging because the output row is derived using values from both the target and source relations, so matched rows are, in a sense, merged into one; also, this node is only used in MERGE commands.

I am open to other names. Any suggestions, @cloud-fan?
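A hedged sketch of the requested classdoc (wording assumed, not the text that was ultimately committed):

```scala
/**
 * A logical node that reads the join of the target and source relations,
 * evaluates the MATCHED / NOT MATCHED / NOT MATCHED BY SOURCE instructions
 * against each joined row, and emits zero, one, or two output rows per
 * input row.
 */
```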

cloud-fan added a commit that referenced this pull request Aug 15, 2023
### What changes were proposed in this pull request?

This is a followup of #41448. As an optimizer rule, the produced plan should be resolved, and resolved expressions should be able to report a data type. The `Instruction` expression fails to report a data type and may break external optimizer rules. This PR makes it return a dummy NullType.

### Why are the changes needed?

to not break external optimizer rules.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests

Closes #42482 from cloud-fan/merge.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>