
[SPARK-32820][SQL] Remove redundant shuffle exchanges inserted by EnsureRequirements #29677

Closed
wants to merge 10 commits

Conversation

@sarutak (Member) commented Sep 8, 2020

What changes were proposed in this pull request?

This PR changes EnsureRequirements so that it removes redundant ShuffleExchanges.
Normally, redundant repartition operations are removed by the CollapseRepartition rule, but EnsureRequirements can insert another HashPartitioning or RangePartitioning immediately after a repartition, leaving adjacent ShuffleExchanges in the physical plan.
Even if their outputPartitionings are different, such adjacent ShuffleExchanges are redundant.

An example:

val ordered = spark.range(1, 100).repartitionByRange(10, $"id".desc).orderBy($"id")
ordered.explain(true)

...

== Physical Plan ==
*(2) Sort [id#0L ASC NULLS FIRST], true, 0
+- Exchange rangepartitioning(id#0L ASC NULLS FIRST, 200), true, [id=#15]
   +- Exchange rangepartitioning(id#0L DESC NULLS LAST, 10), false, [id=#14]
      +- *(1) Range (1, 100, step=1, splits=12)

In this case, the lower Exchange for rangepartitioning is redundant.

Another example:

spark.conf.set("spark.sql.autoBroadcastJoinThreshold", 0)
val left = Seq(1,2,3).toDF.repartition(10)
val right = Seq(1,2,3).toDF
val joined = left.join(right, left("value") + 1 === right("value"))
joined.explain(true)

...

== Physical Plan ==
*(3) SortMergeJoin [(value#11 + 1)], [value#1], Inner
:- *(1) Sort [(value#11 + 1) ASC NULLS FIRST], false, 0
:  +- Exchange hashpartitioning((value#11 + 1), 200), true, [id=#41]
:     +- Exchange RoundRobinPartitioning(10), false, [id=#37]
:        +- LocalTableScan [value#11]
+- *(2) Sort [value#1 ASC NULLS FIRST], false, 0
   +- Exchange hashpartitioning(value#1, 200), true, [id=#42]
      +- LocalTableScan [value#1]

In this case, the lower Exchange for RoundRobinPartitioning on the left side is not necessary, given the Exchange for hashpartitioning that EnsureRequirements inserts above it.

Why are the changes needed?

To remove unnecessary shuffles.

Does this PR introduce any user-facing change?

Yes. After this change, such redundant ShuffleExchanges will be removed:

== Physical Plan ==
*(2) Sort [id#0L ASC NULLS FIRST], true, 0
+- Exchange rangepartitioning(id#0L ASC NULLS FIRST, 200), true, [id=#14]
   +- *(1) Range (1, 100, step=1, splits=12)

== Physical Plan ==
*(3) SortMergeJoin [(value#6 + 1)], [value#1], Inner
:- *(1) Sort [(value#6 + 1) ASC NULLS FIRST], false, 0
:  +- Exchange hashpartitioning((value#6 + 1), 200), true, [id=#16]
:     +- LocalTableScan [value#6]
+- *(2) Sort [value#1 ASC NULLS FIRST], false, 0
   +- Exchange hashpartitioning(value#1, 200), true, [id=#17]
      +- LocalTableScan [value#1]
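
For illustration, one way to confirm the removal is to count the ShuffleExchangeExec nodes in the executed plan. This is a hedged sketch, not part of the PR: `ordered` is the DataFrame from the first example above, and the expected counts come from the plans shown.

import org.apache.spark.sql.execution.exchange.ShuffleExchangeExec

// Count the shuffle exchanges in the executed plan.
val numShuffles = ordered.queryExecution.executedPlan.collect {
  case s: ShuffleExchangeExec => s
}.size
// Before this PR the first example contains 2 exchanges; after it, 1.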

How was this patch tested?

New tests.

@maropu (Member) commented Sep 8, 2020

cc: @c21 @imback82

@@ -52,7 +52,8 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] {
      case (child, distribution) =>
        val numPartitions = distribution.requiredNumPartitions
          .getOrElse(conf.numShufflePartitions)
        ShuffleExchangeExec(distribution.createPartitioning(numPartitions), child)
        val newChild = if (child.isInstanceOf[ShuffleExchangeExec]) child.children.head else child
Review comment (Member):
Could you add some comments about why we can remove it?
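
For context, a hedged reconstruction of the changed region shown in the hunk above. The hunk is truncated, so the final line passing newChild to ShuffleExchangeExec is an assumption, not verbatim PR code:

case (child, distribution) =>
  val numPartitions = distribution.requiredNumPartitions
    .getOrElse(conf.numShufflePartitions)
  // If the child is itself a ShuffleExchangeExec, the exchange about to be
  // inserted would overwrite its partitioning anyway, so plant the new
  // exchange on the grandchild instead.
  val newChild = if (child.isInstanceOf[ShuffleExchangeExec]) child.children.head else child
  ShuffleExchangeExec(distribution.createPartitioning(numPartitions), newChild)  // assumed final line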

@SparkQA commented Sep 8, 2020

Test build #128397 has finished for PR 29677 at commit 1effe75.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Sep 8, 2020

Test build #128407 has finished for PR 29677 at commit 0da9e82.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@c21 (Contributor) left a comment:

Thanks @sarutak for making this change. I have a question about whether this optimization should be done on the user side or on the system side.

EnsureRequirements will add a shuffle/sort when it's necessary, but it will not remove a shuffle/sort added explicitly by users (DISTRIBUTE BY/SORT BY in SQL, repartitionByRange/orderBy in the DataFrame API, etc.). Users can choose to remove these repartitionByRange/orderBy calls in the query themselves to save the shuffle/sort, as they are not necessary. E.g. we can have a more complicated case if the user doesn't do the right thing: spark.range(1, 100).repartitionByRange(10, $"id".desc).repartitionByRange(10, $"id").orderBy($"id"). Should we also handle these cases?

I vaguely remember that removing redundant shuffle exchanges/sorts explicitly added by users in a query was treated as a won't-fix, but I cannot find the old PR now. cc @cloud-fan. Thanks.

@sarutak (Member, Author) commented Sep 9, 2020

@c21 Thanks for the comment.

Users can choose to remove these repartitionByRange/orderBy calls in the query themselves to save the shuffle/sort, as they are not necessary.

Yes, users can choose to do that, but it requires them to understand how Spark and Spark SQL work internally, and to have some distributed-computing knowledge beforehand.
Also, if a data processing logic or query is very complex, it will be difficult for users to judge which repartition operations can be removed.
Shouldn't Spark hide such complexity from users?

E.g. we can have a more complicated case if the user doesn't do the right thing: spark.range(1, 100).repartitionByRange(10, $"id".desc).repartitionByRange(10, $"id").orderBy($"id"). Should we also handle these cases?

Actually, this case is already handled by the CollapseRepartition rule.
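
For illustration, a minimal sketch of the idea behind a CollapseRepartition-style rule (simplified; not the real Spark source): when one repartition-like operator sits directly on top of another, only the upper one determines the output, so the lower one can be dropped.

import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, RepartitionOperation}
import org.apache.spark.sql.catalyst.rules.Rule

// Simplified sketch of a CollapseRepartition-style rule.
object CollapseRepartitionSketch extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan.transformUp {
    // Keep the upper repartition and reparent it onto the lower one's child.
    case upper: RepartitionOperation if upper.child.isInstanceOf[RepartitionOperation] =>
      upper.withNewChildren(upper.child.children)
  }
}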

@SparkQA commented Sep 11, 2020

Test build #128550 has finished for PR 29677 at commit 5680d48.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sarutak (Member, Author) commented Sep 11, 2020

retest this please.

@SparkQA commented Sep 11, 2020

Test build #128557 has finished for PR 29677 at commit 5680d48.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sarutak (Member, Author) commented Sep 25, 2020

@c21 @imback82 @maropu @HyukjinKwon
Any other feedback for this change?

@HyukjinKwon (Member) commented:
retest this please

        ShuffleExchangeExec(distribution.createPartitioning(numPartitions), child)
        // Like optimizer.CollapseRepartition removes adjacent repartition operations,
        // adjacent repartitions performed by shuffle can also be removed.
        val newChild = if (child.isInstanceOf[ShuffleExchangeExec]) child.children.head else child
Review comment (Member):
This reminds me of #26946. cc @cloud-fan, @maryannxue and @stczwd FYI

Review comment (Member):
To avoid the case @HyukjinKwon pointed out above, it seems we need to check whether the outputPartitioning is the same, to narrow down the scope of this optimization.

Btw, in my opinion, to avoid complicating the EnsureRequirements rule further, it would be better to remove these kinds of redundant shuffles in a new rule after EnsureRequirements, like #27096.

@sarutak (Member, Author) commented Sep 29, 2020

it seems we need to check whether the outputPartitioning is the same, to narrow down the scope of this optimization.

Do you mean we should check whether the outputPartitioning of the ShuffleExchangeExec to be inserted matches that of the existing ShuffleExchangeExec?
If so, that condition should already be satisfied.

Just removing the existing ShuffleExchange and inserting a new ShuffleExchange whose outputPartitioning satisfies the required Distribution works, doesn't it?

Btw, in my opinion, to avoid complicating the EnsureRequirements rule further, it would be better to remove these kinds of redundant shuffles in a new rule after EnsureRequirements, like #27096.

Having a new rule sounds better.

Review comment (Member):

Do you mean we should check whether the outputPartitioning of the ShuffleExchangeExec to be inserted matches that of the existing ShuffleExchangeExec?

I didn't mean that; have you checked #26946? This PR currently removes shuffles incorrectly:

scala> spark.range(1).selectExpr("id as a").write.saveAsTable("test")
scala> sql("SELECT /*+ REPARTITION(5) */ * FROM test ORDER BY a").explain()
== Physical Plan ==
*(2) Sort [a#5L ASC NULLS FIRST], true, 0
+- Exchange rangepartitioning(a#5L ASC NULLS FIRST, 200), true, [id=#53]
   +- Exchange RoundRobinPartitioning(5), false, [id=#52] <--- !!! Removed? !!!
      +- *(1) ColumnarToRow
         +- FileScan parquet default.test[a#5L] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/Users/maropu/Repositories/spark/spark-master/spark-warehouse/test], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<a:bigint>

@sarutak (Member, Author) commented Sep 29, 2020

I have considered such a case, but if a shuffle/partitioning is performed immediately on top of another shuffle/partitioning, is the child shuffle/partitioning meaningful?
In the example above, the result is the same whether RoundRobinPartitioning is removed or not, right?
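
A quick, hedged way to check that claim (illustrative only; reuses the test table from the example above):

// Both queries ORDER BY a, so their collected outputs should be identical
// whether or not the RoundRobinPartitioning shuffle runs.
val withHint    = spark.sql("SELECT /*+ REPARTITION(5) */ * FROM test ORDER BY a").collect()
val withoutHint = spark.sql("SELECT * FROM test ORDER BY a").collect()
assert(withHint.sameElements(withoutHint))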

I checked #26946 and I understand that the root cause of that issue was a bug in how hints are handled in the parser, so the approach using an optimization was wrong and incomplete for that issue.

The solutions are similar between this PR and that one, but the issues are different.
This PR just focuses on removing redundant shuffles.

I might be misunderstanding your point and that PR, so please correct me if my understanding is wrong.

Review comment (Member):

Ah, I got your point and it makes sense. Could you update the PR description? The current one only describes the same-output-partitioning cases.

@sarutak (Member, Author) commented:

Yeah, I've updated. Thanks.

@SparkQA commented Sep 28, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33786/

@SparkQA commented Sep 28, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33786/

@SparkQA commented Sep 28, 2020

Test build #129171 has finished for PR 29677 at commit 5680d48.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 1, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33914/

@SparkQA commented Oct 1, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33914/

@SparkQA commented Oct 1, 2020

Test build #129298 has finished for PR 29677 at commit e699cb6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 4, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34003/

@SparkQA commented Oct 4, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34003/

@sarutak (Member, Author) commented Oct 4, 2020

retest this please.

@SparkQA commented Oct 5, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34004/

@SparkQA commented Oct 5, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34004/

@SparkQA commented Oct 5, 2020

Test build #129396 has finished for PR 29677 at commit d6e8cbe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class DisableUnnecessaryBucketedScan(conf: SQLConf) extends Rule[SparkPlan]
  • abstract class JdbcConnectionProvider

Comment on lines +28 to +43
case class PruneShuffle() extends Rule[SparkPlan] {

  override def apply(plan: SparkPlan): SparkPlan = plan.transform {
    case op @ ShuffleExchangeExec(_, child: ShuffleExchangeExec, _) =>
      op.withNewChildren(Seq(pruneShuffle(child)))
    case other => other
  }

  private def pruneShuffle(plan: SparkPlan): SparkPlan = {
    plan match {
      case shuffle: ShuffleExchangeExec =>
        pruneShuffle(shuffle.child)
      case other => other
    }
  }
}
Review comment (Contributor):

I'd propose a more concise way:

Suggested change:

case class PruneShuffle() extends Rule[SparkPlan] {
  override def apply(plan: SparkPlan): SparkPlan = plan.transform {
    case op @ ShuffleExchangeExec(_, child: ShuffleExchangeExec, _) =>
      op.withNewChildren(Seq(pruneShuffle(child)))
    case other => other
  }

  private def pruneShuffle(plan: SparkPlan): SparkPlan = {
    plan match {
      case shuffle: ShuffleExchangeExec =>
        pruneShuffle(shuffle.child)
      case other => other
    }
  }
}

becomes:

case class PruneShuffle() extends Rule[SparkPlan] {
  override def apply(plan: SparkPlan): SparkPlan = plan transformUp {
    case op @ ShuffleExchangeExec(_, ShuffleExchangeExec(_, grandchild, _), _) =>
      op.withNewChildren(grandchild :: Nil)
  }
}

@sarutak (Member, Author) commented:

Hmm, in the current implementation at most two ShuffleExchangeExecs can be consecutive. But if longer chains become possible in the future, transformUp is not efficient.

Review comment (Contributor):

TBH, performance-wise we are talking about a millisecond-level optimization here. I would value readability over micro-optimizations.

@sarutak (Member, Author) commented:

Similar to what is discussed here, I'd like to avoid unnecessary transformations. See the sketch below.
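
To make the trade-off concrete, a hedged single-pass variant that unwraps an arbitrarily long chain of shuffles inside one match (a sketch under the same ShuffleExchangeExec assumptions, not the PR's final code):

import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.execution.SparkPlan
import org.apache.spark.sql.execution.exchange.ShuffleExchangeExec

// Sketch: collapse a whole chain of consecutive shuffles in one match,
// instead of relying on repeated transformUp matches.
case class PruneShuffleChain() extends Rule[SparkPlan] {
  override def apply(plan: SparkPlan): SparkPlan = plan.transform {
    case op @ ShuffleExchangeExec(_, child: ShuffleExchangeExec, _) =>
      // Only the topmost exchange determines the output partitioning,
      // so skip every shuffle that sits directly below `op`.
      var innermost: SparkPlan = child
      while (innermost.isInstanceOf[ShuffleExchangeExec]) {
        innermost = innermost.children.head
      }
      op.withNewChildren(Seq(innermost))
  }
}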

@SparkQA commented Oct 5, 2020

Test build #129397 has finished for PR 29677 at commit d6e8cbe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class DisableUnnecessaryBucketedScan(conf: SQLConf) extends Rule[SparkPlan]
  • abstract class JdbcConnectionProvider

@sarutak (Member, Author) commented Oct 28, 2020

cc: @cloud-fan

@SparkQA commented Oct 28, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34949/

@SparkQA commented Oct 28, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34949/

@SparkQA commented Oct 28, 2020

Test build #130347 has finished for PR 29677 at commit 95454a2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@github-actions bot commented Feb 6, 2021

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

github-actions bot added the Stale label on Feb 6, 2021
github-actions bot closed this on Feb 7, 2021