
[SPARK-34637] [SQL] Support DPP + AQE when the broadcast exchange can be reused #31756

Closed
wants to merge 6 commits into from

Conversation

JkSelf
Contributor

@JkSelf JkSelf commented Mar 5, 2021

What changes were proposed in this pull request?

We have supported DPP in AQE when the join is a broadcast hash join, by planning DPP before applying the AQE rules in SPARK-34168, but that approach has some limitations: it only applies DPP when the small table side is executed first, so that the big table side can reuse the broadcast exchange of the small table side. This PR addresses the above limitations and applies DPP whenever the broadcast exchange can be reused.

Why are the changes needed?

Resolve the limitations when both DPP and AQE are enabled.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added new unit tests.

@JkSelf
Contributor Author

JkSelf commented Mar 5, 2021

@cloud-fan Please help review this when you have time. Thanks for your help.

@github-actions github-actions bot added the SQL label Mar 5, 2021
@SparkQA

SparkQA commented Mar 5, 2021

Test build #135798 has finished for PR 31756 at commit 547ac92.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -1409,4 +1409,17 @@ class DynamicPartitionPruningSuiteAEOff extends DynamicPartitionPruningSuiteBase
with DisableAdaptiveExecutionSuite

class DynamicPartitionPruningSuiteAEOn extends DynamicPartitionPruningSuiteBase
-  with EnableAdaptiveExecutionSuite
+  with EnableAdaptiveExecutionSuite {
+  test("simple inner join triggers DPP with mock-up tables test") {
Contributor

What does this test?

Contributor Author

Only for debugging; I will remove this test later.

@cloud-fan
Contributor

Can you briefly introduce your approach?

@JkSelf
Contributor Author

JkSelf commented Mar 10, 2021

@cloud-fan
This approach mainly contains three steps.

  1. Find the reusable broadcast exchange. If one exists, apply the DPP filter.
  2. In order to reuse the exchange stored in AdaptiveExecutionContext#stageCache, we wrap the AdaptiveSparkPlanExec plan in SubqueryAdaptiveBroadcastExec.
  3. SubqueryAdaptiveBroadcastExec#executeCollect reuses the exchange at runtime by calling AdaptiveSparkPlanExec#getFinalPhysicalPlan()
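The three steps above can be sketched roughly as follows. This is an illustrative outline only: the signatures are simplified and do not match Spark's actual internal classes parameter-for-parameter.

```scala
// Illustrative sketch (simplified signatures, not Spark's exact code): the
// whole AdaptiveSparkPlanExec is wrapped so that the DPP subquery goes
// through AQE's shared stage cache and can pick up the broadcast stage
// created by the main query.
case class SubqueryAdaptiveBroadcastExec(
    name: String,
    index: Int,
    buildKeys: Seq[Expression],
    child: AdaptiveSparkPlanExec) extends BaseSubqueryExec {

  // Step 3: resolving the final physical plan at runtime is what enables
  // reuse -- by the time the subquery runs, the broadcast stage is usually
  // already in AdaptiveExecutionContext#stageCache, so it is picked up
  // instead of being re-executed.
  def executeCollect(): Array[InternalRow] =
    child.getFinalPhysicalPlan().executeCollect()
}
```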

SubqueryBroadcastExec(name, index, buildKeys, reuseQueryStage)

val canReuseExchange = conf.exchangeReuseEnabled && buildKeys.nonEmpty &&
plan.find {
Contributor

PlanAdaptiveDynamicPruningFilters is a stage optimization rule, and its input plan is only a small piece of the plan tree (for one stage). I think we should pass the entire plan as a parameter of this rule when creating it in AdaptiveSparkPlanExec.
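The suggestion could look roughly like this (a hedged sketch, not the exact Spark code; the parameter name and elided transformation are illustrative):

```scala
// Sketch of the suggestion: construct the rule with the root of the entire
// query plan, so it can look for a reusable broadcast exchange outside the
// single stage it is currently optimizing.
case class PlanAdaptiveDynamicPruningFilters(
    rootPlan: SparkPlan) extends Rule[SparkPlan] {

  override def apply(plan: SparkPlan): SparkPlan = {
    // When planning a DPP filter inside `plan` (one stage), search
    // `rootPlan` -- not just `plan` -- for a broadcast exchange that
    // produces the same result and can therefore be reused.
    plan // transformation elided for illustration
  }
}
```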

@SparkQA

SparkQA commented Mar 30, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41278/

@SparkQA

SparkQA commented Mar 30, 2021

Test build #136696 has finished for PR 31756 at commit ae6fe64.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@github-actions

Test build #751342356 for PR 31756 at commit 1ed23c5.

@github-actions github-actions bot added the BUILD label Apr 15, 2021
@SparkQA

SparkQA commented Apr 15, 2021

Test build #137409 has finished for PR 31756 at commit 1ed23c5.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@github-actions

Test build #751377372 for PR 31756 at commit 3bc4baf.

@SparkQA

SparkQA commented Apr 15, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41986/

@SparkQA

SparkQA commented Apr 15, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41986/

@SparkQA

SparkQA commented Apr 15, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41985/

@SparkQA

SparkQA commented Apr 15, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41985/

@@ -310,6 +310,11 @@ case class AdaptiveSparkPlanExec(
rdd
}

override def doExecuteBroadcast[T](): broadcast.Broadcast[T] = {
val broadcastPlan = getFinalPhysicalPlan()
broadcastPlan.doExecuteBroadcast()
Contributor

nit: getFinalPhysicalPlan().doExecuteBroadcast()

Contributor Author

Updated.


/**
* A rule to insert dynamic pruning predicates in order to reuse the results of broadcast.
*/
case class PlanAdaptiveDynamicPruningFilters(
-    stageCache: TrieMap[SparkPlan, QueryStageExec]) extends Rule[SparkPlan] {
+    originalPlan: SparkPlan) extends Rule[SparkPlan] {
Contributor

rootPlan

Contributor Author

Updated.

case _ => false
}.isDefined

if(canReuseExchange) {
Contributor

nit: if (canReuseExchange)

Contributor Author

Updated.

@@ -41,15 +40,26 @@ case class PlanAdaptiveDynamicPruningFilters(
adaptivePlan: AdaptiveSparkPlanExec), exprId, _)) =>
val packedKeys = BindReferences.bindReferences(
Contributor

we can move this into if (canReuseExchange)

@cloud-fan
Contributor

In general LGTM, can we add some tests?

@github-actions

Test build #751980907 for PR 31756 at commit 657c61b.

@SparkQA

SparkQA commented Apr 15, 2021

Test build #137410 has finished for PR 31756 at commit 3bc4baf.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

// +- HashAggregate
// +- Filter
// +- FileScan
// +- SubqueryBroadcast
Contributor

Subqueries have different symbols in the tree string format. Please try to explain some plans locally and update this comment.

// +- FileScan
// +- SubqueryBroadcast
// +- AdaptiveSparkPlan
// +- BroadcastQueryStage
Contributor

there is no other place to reuse this broadcast, right?

Contributor Author

This broadcast is only reused on the build side.

// +- FileScan
// +- SubqueryBroadcast
// +- AdaptiveSparkPlan
// +- BroadcastQueryStage
Contributor

@tgravescs tgravescs Apr 28, 2021

How is there a broadcast before the FileScan? What is being broadcast?

Contributor Author

This broadcast is in the DPP subquery of the FileScan. It will broadcast the results of the build side and then prune the dataset.

Contributor

Can you make it more clear that this is the subquery in the file scan node, not the child of it?

@tgravescs
Contributor

@JkSelf will you have time to look at the questions and comments?

@JkSelf
Contributor Author

JkSelf commented May 6, 2021

@tgravescs Sorry for the delayed response.

@SparkQA

SparkQA commented May 6, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42724/

@SparkQA

SparkQA commented May 6, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42724/

@tgravescs
Contributor

I made this comment in one of the threads but it got collapsed, so I'm making it again:
Could you add more description to the PR as to what you are doing and how it solves the problem?

@SparkQA

SparkQA commented May 6, 2021

Test build #138203 has finished for PR 31756 at commit 6b07c84.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JkSelf
Contributor Author

JkSelf commented May 7, 2021

@tgravescs
This PR mainly solves the limitations of PR #31258. With the DPP + AQE support added in PR #31258, DPP only applies when the broadcast exchange on the build side is executed first, so that the probe side can reuse the build side's exchange in the DPP subquery; otherwise DPP is not supported under AQE.

This approach mainly contains two steps.

  1. In the PlanAdaptiveDynamicPruningFilters rule, judge whether the broadcast exchange can be reused; if so, insert the DPP subquery filter on the probe side.
  2. Create an AdaptiveSparkPlanExec with the broadcast exchange, so that we can reuse the existing reuse logic for the broadcast exchange in the AdaptiveSparkPlanExec plan.
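The two steps, lined up with the snippets quoted in this thread, might be sketched as follows. This is illustrative only: the pattern arities are approximate and simplified from the actual Spark 3.x classes, and `rootPlan` is the whole-query plan passed into the rule.

```scala
// Step 1: decide whether a matching broadcast exchange exists anywhere in
// the root plan, i.e. whether its result can be reused by the DPP subquery.
val canReuseExchange = conf.exchangeReuseEnabled && buildKeys.nonEmpty &&
  rootPlan.find {
    // A broadcast hash join whose build side produces the same result as
    // the exchange we want to reuse (pattern simplified for illustration).
    case BroadcastHashJoinExec(_, _, _, BuildLeft, _, left, _, _) =>
      left.sameResult(exchange)
    case BroadcastHashJoinExec(_, _, _, BuildRight, _, _, right, _) =>
      right.sameResult(exchange)
    case _ => false
  }.isDefined

// Step 2: wrap the broadcast exchange in its own AdaptiveSparkPlanExec so
// the existing stage-cache reuse machinery applies, then plan the DPP
// subquery filter on the probe side.
if (canReuseExchange) {
  exchange.setLogicalLink(adaptivePlan.executedPlan.logicalLink.get)
  val newAdaptivePlan = adaptivePlan.copy(inputPlan = exchange)
  val values = SubqueryBroadcastExec(name, index, buildKeys, newAdaptivePlan)
  DynamicPruningExpression(InSubqueryExec(value, values, exprId))
}
```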

if (canReuseExchange) {
exchange.setLogicalLink(adaptivePlan.executedPlan.logicalLink.get)
val newAdaptivePlan = AdaptiveSparkPlanExec(
exchange, adaptivePlan.context, adaptivePlan.preprocessingRules, true)
Contributor

ditto: adaptivePlan.copy(inputPlan = exchange)

Contributor Author

updated.

@@ -1463,6 +1474,37 @@ abstract class DynamicPartitionPruningSuiteBase
}
}
}

test("SPARK-34637: test DPP side broadcast query stage is created firstly") {
Contributor

nit: SPARK-34637: DPP ... remove the test

Contributor Author

updated.

test("SPARK-34637: test DPP side broadcast query stage is created firstly") {
withSQLConf(SQLConf.DYNAMIC_PARTITION_PRUNING_REUSE_BROADCAST_ONLY.key -> "true") {
val df = sql(
""" WITH view1 as (
Contributor

view1 -> v?

Contributor Author

updated.

// +- HashAggregate
// +- Filter
// +- FileScan
// Dynamicpruning Subquery
Contributor

Did you try to explain some queries locally? If you did, you should see how subqueries are displayed. For example, select 1, (select 2):

Project [1 AS 1#7, scalar-subquery#6 [] AS scalarsubquery()#9]
:  +- Project [2 AS 2#8]
:     +- OneRowRelation
+- OneRowRelation

Contributor Author

Yes. In this case, it looks like the following:
[screenshot of the explained plan tree]

@SparkQA

SparkQA commented May 11, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42890/

@SparkQA

SparkQA commented May 11, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42890/

@SparkQA

SparkQA commented May 11, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42899/

@SparkQA

SparkQA commented May 11, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42899/

@SparkQA

SparkQA commented May 11, 2021

Test build #138367 has finished for PR 31756 at commit 4ccd4b8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 11, 2021

Test build #138376 has finished for PR 31756 at commit 701f1c3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class PlanAdaptiveDynamicPruningFilters(

@cloud-fan
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in b6d57b6 May 13, 2021
domybest11 pushed a commit to domybest11/spark that referenced this pull request Jun 15, 2022
Closes apache#31756 from JkSelf/supportDPP2.

Authored-by: jiake <ke.a.jia@intel.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>