Allow returning an EmptyHashedRelation when a broadcast result is empty [databricks] #4256

abellina · 2021-12-01T21:40:02Z

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>

Closes #4134.

This PR allows the broadcast exchange to produce an EmptyHashedRelation (or an empty array, in the case of the identity broadcast) so that AQE's EliminateJoinToEmptyRelation rule can optimize the plan. This massively changes how we run q16, but it lets us match what the CPU does.
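The idea can be illustrated with a toy model in Python. This is only a sketch under stated assumptions, not the plugin's Scala code: `broadcast_result`, `join_can_be_eliminated`, the string mode labels, and the `EmptyHashedRelation` stand-in class are all hypothetical names, and the join check is deliberately reduced to the inner and left-semi cases.

```python
from typing import List, Tuple, Union

Row = Tuple[object, ...]


class EmptyHashedRelation:
    """Hypothetical stand-in for Spark's EmptyHashedRelation marker object."""


EMPTY_HASHED_RELATION = EmptyHashedRelation()


def broadcast_result(mode: str, rows: List[Row]) -> Union[EmptyHashedRelation, List[Row]]:
    """Pick the value to broadcast for a build side that produced `rows`.

    `mode` is a simplified label ("hashed" or "identity"), not a real
    Spark BroadcastMode instance.
    """
    if mode not in ("hashed", "identity"):
        raise ValueError(f"unknown broadcast mode: {mode}")
    if rows:
        # Non-empty build side: broadcast the data as usual (modeled as the rows).
        return rows
    # Empty build side: return a marker the optimizer can recognize.
    return [] if mode == "identity" else EMPTY_HASHED_RELATION


def join_can_be_eliminated(join_type: str, build_side: object) -> bool:
    """Simplified check (assumption): an inner or left-semi join against an
    empty hashed relation cannot produce rows, so the join can be dropped."""
    return isinstance(build_side, EmptyHashedRelation) and join_type in ("inner", "left_semi")
```

For example, `join_can_be_eliminated("inner", broadcast_result("hashed", []))` is true, while the same call with a non-empty build side is false, so the join stays in the plan.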

In terms of performance, I ran this at 3TB and Q16 now takes ~12 seconds, which is slightly faster than the CPU and 3.5x faster than what we had before on the GPU. I don't see regressions in other queries.

This PR enables isFoldableNonLitAllowed for UnaryExprMeta so that expressions like cast(null as bigint) can be handled. These casts show up in an empty projection that results from the AQE rule removing the join. ConstantFolding does not re-execute as part of AQE, so they are left in the plan. Both tests I have added generate plans that contain these casts under AQE.

abellina · 2021-12-01T22:55:35Z

OK, EliminateJoinToEmptyRelation changed in Spark 3.2 to EliminateUnnecessaryJoin, and it now seems to look at getRowCount from the Statistics object in the ShuffleExchangeExec. This caught me off guard. I need to spend time on Spark 3.2 to understand whether this is going to cause problems and how they are handling it now.

jlowe · 2021-12-01T22:41:46Z

sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuBroadcastHelper.scala

+   * @param broadcastPlan - the SparkPlan to use to obtain the schema for the broadcast
+   *                      batch


Passing an entire plan just to get the schema is very heavyweight. This should simply take a schema parameter.

This should be fixed now.

abellina · 2021-12-06T23:34:03Z

build

abellina · 2021-12-07T20:30:19Z

Note that f62167c adds isFoldableNonLitAllowed to UnaryExprMeta. What @revans2 explained (if I understand correctly) is that an AQE rule that removes a join is likely running after constant folding, so we are left with some unary expressions that didn't get folded.

That said I am not entirely sure yet of the order of things, so I am not 100% there yet.

abellina · 2021-12-08T05:49:25Z

> That said I am not entirely sure yet of the order of things, so I am not 100% there yet.

I've been reading more about this and I think it makes sense now. Yes, ConstantFolding isn't in the path after AQE removes the join, because the ConstantFolding optimization happens in the basic logical plan optimizer (which runs early on), not in the AQEOptimizer. As far as I understand, once the plan is wrapped in an AdaptiveSparkPlanExec via InsertAdaptiveSparkPlan (from org.apache.spark.sql.execution.QueryExecution.preparations), the optimizer that includes ConstantFolding is not executed again. Instead, the optimizers used are the AQE optimizer (which only worries about Propagate Empty Relations and Dynamic Join Selection) and the canonicalizer (CleanExpressions). @andygrove does this make sense?

Given the optimization, we are looking at the logicalLink of the plan wrapped in the adaptive exec and then producing a new projection, which in my case included a cast(null as bigint) (that is, something that should have been folded to a literal). Support for this should now be available on the GPU, but we don't have test cases for it, so it is disabled by default.
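The ordering issue can be shown with a small toy model in Python. It is a sketch only: `Literal`, `Cast`, `constant_folding`, `run_rules`, and the two rule lists are hypothetical stand-ins rather than Spark classes, and the AQE rule list is left empty because the join-removal rules work on plans, which are not modeled here. The point is only that an expression introduced after the logical optimizer has run is never visited by a folding rule.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Union


@dataclass(frozen=True)
class Literal:
    value: Optional[int]
    data_type: str


@dataclass(frozen=True)
class Cast:
    child: "Expr"
    data_type: str


Expr = Union[Literal, Cast]
Rule = Callable[[Expr], Expr]


def constant_folding(expr: Expr) -> Expr:
    """Toy folding rule: a cast of a literal becomes a literal of the target type."""
    if isinstance(expr, Cast):
        child = constant_folding(expr.child)
        if isinstance(child, Literal):
            return Literal(child.value, expr.data_type)
        return Cast(child, expr.data_type)
    return expr


def run_rules(expr: Expr, rules: List[Rule]) -> Expr:
    """Apply each rule once, in order."""
    for rule in rules:
        expr = rule(expr)
    return expr


# Assumption: the early logical optimizer includes folding, the AQE re-optimizer does not.
LOGICAL_OPTIMIZER_RULES: List[Rule] = [constant_folding]
AQE_OPTIMIZER_RULES: List[Rule] = []

# Expression that only appears after AQE has removed the join.
late_expr = Cast(Literal(None, "null"), "bigint")
folded = run_rules(late_expr, LOGICAL_OPTIMIZER_RULES)
unfolded = run_rules(late_expr, AQE_OPTIMIZER_RULES)
```

Here `folded` is a `Literal` of type bigint, while `unfolded` is still the original `Cast`, which is the kind of foldable non-literal expression that isFoldableNonLitAllowed lets through.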

abellina · 2021-12-08T05:49:47Z

build

abellina · 2021-12-08T15:25:09Z

build

abellina · 2021-12-08T15:26:57Z

build

abellina · 2021-12-08T16:19:49Z

This is broken in Databricks:

  override def broadcastModeTransform(mode: BroadcastMode, rows: Array[InternalRow]): Any =
    mode.transform(rows, TaskContext.get.taskMemoryManager())

I was expecting to be able to transform the BroadcastMode, as Spark does (https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala#L148), but it seems we need the task context.

I am looking into it further.

abellina · 2021-12-09T04:57:43Z

build

abellina · 2021-12-09T04:58:13Z

Thanks @jlowe, I believe I have addressed the comments

abellina · 2021-12-09T04:59:28Z

@revans2 this is ready for another look when you get a chance.

andygrove · 2021-12-09T15:12:44Z

> I've been reading more about this and I think it makes sense now. Yes, ConstantFolding isn't in the path after AQE removes the join, because the ConstantFolding optimization happens in the basic logical plan optimizer (which runs early on), not in the AQEOptimizer. As far as I understand, once the plan is wrapped in an AdaptiveSparkPlanExec via InsertAdaptiveSparkPlan (from org.apache.spark.sql.execution.QueryExecution.preparations), the optimizer that includes ConstantFolding is not executed again. Instead, the optimizers used are the AQE optimizer (which only worries about Propagate Empty Relations and Dynamic Join Selection) and the canonicalizer (CleanExpressions). @andygrove does this make sense?

Yes, that is correct. I ran some tests to confirm this.

abellina · 2021-12-09T16:51:25Z

build

Allow returning an EmptyHashedRelation when a broadcast result is empty

26a8d37

Signed-off-by: Alessandro Bellina <abellina@nvidia.com>

abellina added the performance label Dec 1, 2021

jlowe reviewed Dec 1, 2021

sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuBroadcastHelper.scala

revans2 previously approved these changes Dec 1, 2021

jlowe reviewed Dec 1, 2021

Address review comments

6508550

abellina dismissed revans2’s stale review via 6508550 December 6, 2021 19:46

abellina force-pushed the perf/empty_hash_relation branch 2 times, most recently from 860e35c to 3f31a67 Compare December 6, 2021 19:51

abellina and others added 6 commits December 6, 2021 13:51

Revert change in integration tests

3f31a67

Remove unwanted change

b9684c2

Minor cleanup

0dffe9f

identity.length == 0 to identity.isEmpty

5df6437

Co-authored-by: Jason Lowe <jlowe@nvidia.com>

Apply more suggested cleanup

d048334

Remove incRefCount

e021a00

abellina changed the title from "Allow returning an EmptyHashedRelation when a broadcast result is empty" to "Allow returning an EmptyHashedRelation when a broadcast result is empty [databricks]" Dec 6, 2021

revans2 reviewed Dec 7, 2021

integration_tests/src/main/python/join_test.py

...in/src/main/311+-nondb/scala/com/nvidia/spark/rapids/shims/v2/GpuBroadcastHashJoinExec.scala

abellina added 8 commits December 7, 2021 10:46

EmptyHashedRelation was introduced in 3.1.x, so this fixes the shims

01bd86f

Cache build schema outside of mapPartitions

282f402

Fix bug with the broadcast helper + make a new test in join_test

30beaf1

Adds a test that forces a broadcast for the EmptyHashedRelation scenario

6c0d593

Fix typo

a3fc8ab

Upmerge to 22.02

4bba02f

Fix typo

2e83fa8

Adding isFoldableNonLitAllowed to UnaryExprMeta

f62167c

abellina added 3 commits December 8, 2021 08:53

Fix Spark 3.0.x build

ee80225

Also need to fix 30Xdb

71b91f3

Move isEmptyRelation override to Spark31xdb

5ac0143

Upmerge

012bdf0

abellina added 3 commits December 8, 2021 13:14

Disable in databricks and do some clenaup

a41f97d

Parametrize databricks so we dont request AQE when that is an invalid config

7fd4a20

Cleanup

6f9b75f

jlowe reviewed Dec 8, 2021

Extra comment in RapidsMeta, take care of other review comments

a0704e2

abellina marked this pull request as ready for review December 9, 2021 04:57

Fix import spacing

5c527eb

revans2 previously approved these changes Dec 9, 2021

jlowe reviewed Dec 9, 2021

integration_tests/src/main/python/join_test.py

sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuBroadcastHelper.scala

abellina dismissed revans2’s stale review via 438ff4c December 9, 2021 16:50

Call the non-capturing assert

438ff4c

jlowe approved these changes Dec 9, 2021

Apply suggestion in GpuBroadcastHelper

34f2b59

abellina merged commit 1588f6a into NVIDIA:branch-22.02 Dec 9, 2021

abellina deleted the perf/empty_hash_relation branch December 9, 2021 19:57

jlowe mentioned this pull request Apr 28, 2022

[FEA] Update joins to optimize for the case where the relation table is empty or null #1462

Closed
