update comments
carsonwang committed Jan 15, 2019
1 parent 4a2311c commit 2c55985
Showing 2 changed files with 10 additions and 7 deletions.
@@ -33,7 +33,10 @@ import org.apache.spark.util.ThreadUtils

 /**
  * In adaptive execution mode, an execution plan is divided into multiple QueryStages. Each
- * QueryStage is a sub-tree that runs in a single stage.
+ * QueryStage is a sub-tree that runs in a single stage. Before executing current stage, we will
+ * first submit all its child stages, wait for their completions and collect their statistics.
+ * Based on the collected data, we can potentially optimize the execution plan in current stage,
+ * change the number of reducer and do other optimizations.
  */
 abstract class QueryStage extends UnaryExecNode {
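
The flow described in the updated comment (submit child stages, wait for them, collect statistics, then pick the number of reducers) can be illustrated with a minimal standalone sketch. It is not the code from this commit: `StageStats`, `runChildStages`, `partitionStartIndices`, and the `targetBytes` threshold are hypothetical names introduced here, and the greedy packing rule is only one plausible way to use the collected sizes.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object StageSketch {
  // Hypothetical stand-in for the statistics a finished child stage reports:
  // the number of bytes it wrote to each shuffle output partition.
  final case class StageStats(bytesByPartition: Array[Long])

  // Hypothetical: submit every child stage, block until all have completed,
  // and return their statistics in submission order.
  def runChildStages(children: Seq[() => StageStats]): Seq[StageStats] = {
    val submitted = children.map(child => Future(child()))
    Await.result(Future.sequence(submitted), 1.minute)
  }

  // Assumed policy: merge contiguous shuffle partitions greedily so that each
  // post-shuffle partition stays at or below targetBytes where possible.
  // Returns the index of the first pre-shuffle partition read by each reducer.
  def partitionStartIndices(stats: Seq[StageStats], targetBytes: Long): Array[Int] = {
    require(stats.nonEmpty, "at least one child stage is required")
    val numPartitions = stats.head.bytesByPartition.length
    require(stats.forall(_.bytesByPartition.length == numPartitions),
      "all child stages must use the same number of shuffle partitions")
    val starts = ArrayBuffer(0)
    var current = 0L
    for (i <- 0 until numPartitions) {
      // Total input for partition i across all child stages.
      val size = stats.map(_.bytesByPartition(i)).sum
      if (i > 0 && current + size > targetBytes) {
        starts += i // begin a new post-shuffle partition at i
        current = 0L
      }
      current += size
    }
    starts.toArray
  }
}
```

With two child stages that each write four partitions, the sketch would reduce the stage from four reducers to however many start indices the packing yields, which is the "change the number of reducer" step the comment refers to.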

@@ -25,13 +25,13 @@ import org.apache.spark.sql.catalyst.plans.physical.{HashPartitioning, Partition
import org.apache.spark.sql.execution._

 /**
- * QueryStageInput is the leaf node of a QueryStage and serves as its input. It is responsible for
- * changing the output partition based on the need of its QueryStage. It gets the ShuffledRowRDD
+ * QueryStageInput is the leaf node of a QueryStage and serves as its input. A QueryStage knows
+ * its child stages by collecting all the QueryStageInputs. For a ShuffleQueryStageInput, it
+ * controls how to read the ShuffledRowRDD generated by its child stage. It gets the ShuffledRowRDD
  * from its child stage and creates a new ShuffledRowRDD with different partitions by specifying
- * an optional array of partition start indices. For example, a ShuffledQueryStage can be reused
- * by two different QueryStages. One QueryStageInput can let the first task read partition 0 to 3,
- * while in another stage, the QueryStageInput can let the first task read partition 0 to 1.
- * A QueryStage knows its child stages by collecting all the QueryStageInputs.
+ * an array of partition start indices. For example, a ShuffledQueryStage can be reused by two
+ * different QueryStages. One QueryStageInput can let the first task read partition 0 to 3, while
+ * in another stage, the QueryStageInput can let the first task read partition 0 to 1.
  */
 abstract class QueryStageInput extends LeafExecNode {
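
To make the "partition start indices" example in the comment concrete, here is a small sketch of how such an array could map to the shuffle partitions each task reads. `readRanges` and `numPreShufflePartitions` are hypothetical names, not part of this commit, and the sketch assumes "partition 0 to 3" means partitions 0, 1, 2, and 3 inclusive.

```scala
object PartitionRanges {
  // Hypothetical helper: task i reads the pre-shuffle partitions from
  // startIndices(i) up to (but excluding) the next start index; the last
  // task reads through the final pre-shuffle partition.
  def readRanges(startIndices: Array[Int], numPreShufflePartitions: Int): Seq[Range] =
    startIndices.indices.map { i =>
      val end =
        if (i + 1 < startIndices.length) startIndices(i + 1)
        else numPreShufflePartitions
      startIndices(i) until end
    }
}

// Two stages reusing the same 8-partition shuffle output (assumed sizes):
// PartitionRanges.readRanges(Array(0, 4), 8)       // first task reads 0, 1, 2, 3
// PartitionRanges.readRanges(Array(0, 2, 4, 6), 8) // first task reads 0, 1
```

Because the start indices live in the QueryStageInput rather than in the child stage, both readers can share one shuffle output while grouping its partitions differently.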
