[YARN][SPARK-4929] Bug fix: fix the yarn-client code to support HA
Currently, yarn-client exits immediately when an HA failover happens, no matter how many times the AM is supposed to retry.
The reason may be that the default final status only considered the sys.exit case, so yarn-client HA cannot benefit from it.
We should therefore distinguish the default final status between client and cluster mode, because the SUCCEEDED status may cause HA to fail in client mode, and UNDEFINED may cause an error to be reported in cluster mode when sys.exit is used.

Author: huangzhaowei <carlmartinmax@gmail.com>

Closes #3771 from SaintBacchus/YarnHA and squashes the following commits:

c02bfcc [huangzhaowei] Improve the comment of the function 'getDefaultFinalStatus'
0e69924 [huangzhaowei] Bug fix: fix the yarn-client code to support HA
SaintBacchus authored and tgravescs committed Jan 7, 2015
1 parent e21acc1 commit 5fde661
Showing 1 changed file with 15 additions and 1 deletion.
@@ -60,7 +60,7 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments,
@volatile private var exitCode = 0
@volatile private var unregistered = false
@volatile private var finished = false
- @volatile private var finalStatus = FinalApplicationStatus.SUCCEEDED
+ @volatile private var finalStatus = getDefaultFinalStatus
@volatile private var finalMsg: String = ""
@volatile private var userClassThread: Thread = _

@@ -152,6 +152,20 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments,
exitCode
}

+ /**
+  * Set the default final application status for client mode to UNDEFINED to handle
+  * if YARN HA restarts the application so that it properly retries. Set the final
+  * status to SUCCEEDED in cluster mode to handle if the user calls System.exit
+  * from the application code.
+  */
+ final def getDefaultFinalStatus() = {
+   if (isDriver) {
+     FinalApplicationStatus.SUCCEEDED
+   } else {
+     FinalApplicationStatus.UNDEFINED
+   }
+ }
+
/**
* unregister is used to completely unregister the application from the ResourceManager.
* This means the ResourceManager will not retry the application attempt on your behalf if
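The mode-dependent default introduced by this change can be sketched in isolation as follows. This is a minimal illustration, not the ApplicationMaster code itself: the `FinalApplicationStatus` enumeration below is a stand-in for the YARN class of the same name so the snippet compiles without Hadoop on the classpath, and `DefaultFinalStatusSketch` and `defaultFinalStatus` are hypothetical names. It assumes, as the diff's comment implies, that `isDriver` is true in cluster mode and false in client mode.

```scala
// Stand-in for org.apache.hadoop.yarn.api.records.FinalApplicationStatus,
// declared here only so the sketch is self-contained (assumption).
object FinalApplicationStatus extends Enumeration {
  val UNDEFINED, SUCCEEDED, FAILED, KILLED = Value
}

object DefaultFinalStatusSketch {
  // Hypothetical helper mirroring getDefaultFinalStatus from the diff.
  // isDriver is passed in explicitly; true is assumed to mean cluster mode.
  def defaultFinalStatus(isDriver: Boolean): FinalApplicationStatus.Value =
    if (isDriver) {
      // Cluster mode: user code may call System.exit, so default to SUCCEEDED.
      FinalApplicationStatus.SUCCEEDED
    } else {
      // Client mode: stay UNDEFINED so an interrupted attempt is not
      // recorded as a success and can still be retried.
      FinalApplicationStatus.UNDEFINED
    }

  def main(args: Array[String]): Unit = {
    println(s"cluster mode default: ${defaultFinalStatus(isDriver = true)}")
    println(s"client mode default: ${defaultFinalStatus(isDriver = false)}")
  }
}
```

Passing the mode in as a parameter keeps the sketch testable on its own; in the commit the method reads the `isDriver` field of the ApplicationMaster instead.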
