Avoid the prepareExecuteStage#QueryStage method is executed multi-times when call executeCollect, executeToIterator and executeTake action multi-times (apache#70)

* Avoid executing QueryStage#prepareExecuteStage multiple times when the executeCollect, executeToIterator and executeTake actions are called multiple times

* Only add the check in the prepareExecuteStage method, to avoid duplicating the check in other methods

* small fix
JkSelf authored and carsonwang committed Nov 23, 2018
1 parent 1ab87f9 commit 011c2d3
Showing 1 changed file with 8 additions and 1 deletion.
@@ -85,12 +85,18 @@ abstract class QueryStage extends UnaryExecNode {
       Future.sequence(shuffleStageFutures)(implicitly, QueryStage.executionContext), Duration.Inf)
   }

+  private var prepared = false
+
   /**
    * Before executing the plan in this query stage, we execute all child stages, optimize the plan
    * in this stage and determine the reducer number based on the child stages' statistics. Finally
    * we do a codegen for this query stage and update the UI with the new plan.
    */
-  def prepareExecuteStage(): Unit = {
+  def prepareExecuteStage(): Unit = synchronized {
+    // Ensure the prepareExecuteStage method only be executed once.
+    if (prepared) {
+      return
+    }
     // 1. Execute childStages
     executeChildStages()

@@ -152,6 +158,7 @@ abstract class QueryStage extends UnaryExecNode {
         queryExecution.toString,
         SparkPlanInfo.fromSparkPlan(queryExecution.executedPlan)))
     }
+    prepared = true
   }

   // Caches the created ShuffleRowRDD so we can reuse that.
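
The guard this commit adds can be looked at in isolation. The following is a minimal sketch, not the actual QueryStage code: `OnceOnlyStage`, `runs`, `runCount` and `OnceOnlyStageDemo` are hypothetical names introduced for illustration, and the counter increment stands in for the real work (executing child stages, re-optimizing, codegen). It also uses an `if (!prepared)` block instead of the early `return` in the commit, which is an equivalent formulation chosen here only to avoid a `return` inside the `synchronized` block.

```scala
// Hypothetical stand-in for QueryStage; only the once-only guard is modeled.
class OnceOnlyStage {
  // Both fields are read and written only while holding this object's monitor.
  private var prepared = false
  // Counts how often the guarded body actually ran (for demonstration only).
  private var runs = 0

  def prepareExecuteStage(): Unit = synchronized {
    if (!prepared) {
      // Placeholder for the expensive preparation work.
      runs += 1
      // As in the commit, the flag is set only after the work has completed,
      // so a call that throws part-way leaves the stage unprepared.
      prepared = true
    }
  }

  def runCount: Int = synchronized { runs }
}

object OnceOnlyStageDemo {
  def main(args: Array[String]): Unit = {
    val stage = new OnceOnlyStage
    // Several callers race to prepare the same stage, much as repeated
    // executeCollect / executeTake calls would.
    val threads = (1 to 4).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = stage.prepareExecuteStage()
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    println(stage.runCount) // prints 1
  }
}
```

Because the flag is checked and set under the same monitor, concurrent callers block until the first call finishes and then return without repeating the work.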
