Skip to content

Commit

Permalink
Update docs to be more explicit.
Browse files Browse the repository at this point in the history
  • Loading branch information
concretevitamin committed Jul 29, 2014
1 parent 573e644 commit 729a8e2
Showing 1 changed file with 4 additions and 7 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -52,8 +52,10 @@ private[sql] abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
* the two join sides. When planning a [[execution.BroadcastHashJoin]], if one side has an
* estimated physical size smaller than the user-settable threshold
* `spark.sql.auto.convert.join.size`, the planner would mark it as the ''build'' relation and
* mark the other relation as the ''stream'' side. If both estimates exceed the threshold,
* they will instead be used to decide the build side in a [[execution.ShuffledHashJoin]].
* mark the other relation as the ''stream'' side. The build table will be ''broadcasted'' to
* all of the executors involved in the join, as a [[org.apache.spark.broadcast.Broadcast]]
* object. If both estimates exceed the threshold, they will instead be used to decide the build
* side in a [[execution.ShuffledHashJoin]].
*/
object HashJoin extends Strategy with PredicateHelper {
private[this] def broadcastHashJoin(
Expand Down Expand Up @@ -144,11 +146,6 @@ private[sql] abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
}
}

/**
* This strategy applies a simple optimization based on the estimates of the physical sizes of
* the two join sides: the planner would mark the relation with the smaller estimated physical
* size as the ''build'' (broadcast) relation and mark the other as the ''stream'' relation.
*/
object BroadcastNestedLoopJoin extends Strategy {
def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
case logical.Join(left, right, joinType, condition) =>
Expand Down

0 comments on commit 729a8e2

Please sign in to comment.