Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-44000][SQL] Add hint to disable broadcasting and replicating one side of join #41499

Closed
wants to merge 1 commit into from

Conversation

aokolnychyi
Copy link
Contributor

What changes were proposed in this pull request?

This PR adds a new internal join hint to disable broadcasting and replicating one side of join.

Why are the changes needed?

These changes are needed to disable broadcasting and replicating one side of join when it is not permitted, such as the cardinality check in MERGE operations in PR #41448.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

This PR comes with tests. More tests are in #41448.

@aokolnychyi
Copy link
Contributor Author

cc @dongjoon-hyun

@dongjoon-hyun
Copy link
Member

Thank you, @aokolnychyi !

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM (only minor comment). Thank you again, @aokolnychyi .

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, cc @sunchao , @viirya , @huaxingao , too

@aokolnychyi
Copy link
Contributor Author

@dongjoon-hyun, I'll submit a follow-up PR with the comments.

@@ -354,27 +367,29 @@ abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
}

def createCartesianProduct() = {
if (joinType.isInstanceOf[InnerLike]) {
if (joinType.isInstanceOf[InnerLike] && !hintToNotBroadcastAndReplicate(hint)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is CartesianProduct related to broadcast?

Copy link
Contributor Author

@aokolnychyi aokolnychyi Jun 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is related to replication. If I am correct, it would zip each partition on one side with each partition on the other side so the same target record would appear in multiple places.

Copy link
Member

@viirya viirya Jun 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this is for replicating?

EDIT: saw your reply after adding this comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new hint prohibits both broadcasting and replication so it applies to 3 cases:

  • Broadcast hash
  • Cartesian product
  • BNLJ build side

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to update this to the pr description too.

@dongjoon-hyun
Copy link
Member

Thank you, @viirya and @aokolnychyi . Now, pending CIs.

@dongjoon-hyun
Copy link
Member

It seems that sql test pipeline is killed by receiving shutdown signal somehow.

...
2023-06-07T18:39:01.1867247Z ##[error]The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
2023-06-07T18:39:01.5173829Z Session terminated, killing shell...
2023-06-07T18:39:01.7418237Z ##[error]The operation was canceled.

@dongjoon-hyun
Copy link
Member

When you have a chance, could you re-trigger the failed test pipeline?

@aokolnychyi
Copy link
Contributor Author

Triggered again: https://github.com/aokolnychyi/spark/actions/runs/5202190900

@dongjoon-hyun
Copy link
Member

Thank you so much!

@dongjoon-hyun
Copy link
Member

Unfortunately, master branch still has the issue on sql - other tests module and the last test of this PR seems to be running over 4h 40minutes as of now.

Screenshot 2023-06-07 at 7 58 58 PM

Screenshot 2023-06-07 at 7 58 34 PM

This PR passed catalyst and all hive modules. I verified JoinSuite (and new test cases) manually.

$ build/sbt "sql/testOnly *.JoinSuite"
...
[info] JoinSuite:
20:13:26.224 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[info] - equi-join is hash-join (69 milliseconds)
[info] - NO_BROADCAST_AND_REPLICATION hint is respected in cross joins (19 milliseconds)
...
[info] Run completed in 25 seconds, 333 milliseconds.
[info] Total number of tests run: 50
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 50, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 60 s, completed Jun 7, 2023, 8:13:50 PM

@dongjoon-hyun
Copy link
Member

Thank you, @aokolnychyi and @viirya .

@aokolnychyi
Copy link
Contributor Author

Thank you, @dongjoon-hyun @viirya! I've created #41509 to add comments.

czxm pushed a commit to czxm/spark that referenced this pull request Jun 12, 2023
…ne side of join

### What changes were proposed in this pull request?

This PR adds a new internal join hint to disable broadcasting and replicating one side of join.

### Why are the changes needed?

These changes are needed to disable broadcasting and replicating one side of join when it is not permitted, such as the cardinality check in MERGE operations in PR apache#41448.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

This PR comes with tests. More tests are in apache#41448.

Closes apache#41499 from aokolnychyi/spark-44000.

Authored-by: aokolnychyi <aokolnychyi@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Comment on lines +344 to +352
def getBroadcastNestedLoopJoinBuildSide(hint: JoinHint): Option[BuildSide] = {
if (hintToNotBroadcastAndReplicateLeft(hint)) {
Some(BuildRight)
} else if (hintToNotBroadcastAndReplicateRight(hint)) {
Some(BuildLeft)
} else {
None
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is saying that if we hint to NOT broadcast left, we can broadcast right, and vice-versa. Shouldn't we check the hint on the other side as well?
i.e.

if (hintToNotBroadcastAndReplicateLeft(hint) && !hintToNotBroadcastAndReplicateRight(hint)) {
  Some(BuildRight)
} else if (hintToNotBroadcastAndReplicateRight(hint) && !hintToNotBroadcastAndReplicateLeft(hint)) {
  Some(BuildLeft)
} else {
  None
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@aokolnychyi aokolnychyi Jun 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, the new hint can be only set on one side and never on both. BNLJ is considered as the default join strategy and having no broadcast and replicate hints on both sides would mean there is no applicable fallback join strategy to use. If we were to adapt the method above, we can't keep the existing default logic that picks the broadcast side based on size (that could cause a correctness problem). What about adding validation the new hint is set only on one side?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about adding validation the new hint is set only on one side?
Sounds good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I will work on it. Created SPARK-44148.

viirya pushed a commit to viirya/spark-1 that referenced this pull request Oct 19, 2023
…ne side of join

### What changes were proposed in this pull request?

This PR adds a new internal join hint to disable broadcasting and replicating one side of join.

### Why are the changes needed?

These changes are needed to disable broadcasting and replicating one side of join when it is not permitted, such as the cardinality check in MERGE operations in PR apache#41448.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

This PR comes with tests. More tests are in apache#41448.

Closes apache#41499 from aokolnychyi/spark-44000.

Authored-by: aokolnychyi <aokolnychyi@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit d88633a)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants