[SPARK-13054] Always post TaskEnd event for tasks #10951
Conversation
Conflicts: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
Test build #50216 has finished for PR 10951 at commit
Test build #50219 has finished for PR 10951 at commit
// Need to handle tasks coming in late (speculative and jobs killed)
// post a task end event so accounting for things manually tracking tasks works.
// This really should be something other than success since the other speculative task
// finished first.
I think there's a better way to always post this event. I have some changes in #10958 to do this in a cleaner way: https://github.com/apache/spark/pull/10958/files#r51510044. I believe the semantics there are the same as the ones here.
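The idea under discussion can be sketched in miniature. Below is a minimal, self-contained model — not the actual DAGScheduler code, all names are illustrative — of the behavior both patches aim for: post a TaskEnd event for every completed task, even when the stage has already finished.

```scala
import scala.collection.mutable

// Minimal model (NOT actual Spark code) of the fix being discussed: post a
// TaskEnd event for every completed task, even when the stage has already
// finished. All names here are illustrative.
case class TaskEnd(stageId: Int, taskId: Long, reason: String)

class MiniScheduler {
  val activeStages = mutable.Set[Int]()
  val postedEvents = mutable.Buffer[TaskEnd]()

  def handleTaskCompletion(stageId: Int, taskId: Long, reason: String): Unit = {
    // Always post the event first, so listeners that count running tasks
    // (UI, dynamic allocation) stay consistent even for late arrivals.
    postedEvents += TaskEnd(stageId, taskId, reason)
    if (!activeStages.contains(stageId)) {
      return // stage already finished: skip bookkeeping, event still posted
    }
    // ... normal per-stage success/failure bookkeeping would go here ...
  }
}

val s = new MiniScheduler()
s.activeStages += 1
s.handleTaskCompletion(1, 10L, "Success")
s.activeStages -= 1                       // stage 1 completes
s.handleTaskCompletion(1, 11L, "Success") // late speculative duplicate
println(s.postedEvents.size)              // both tasks got a TaskEnd event
```

The key ordering choice is that the event is posted before the early return, so a speculative task that loses the race still produces a TaskEnd.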
I think that is fine for now since it's combining failed vs. successful tasks. I do think it's a bit weird that Spark marks all speculative tasks as Success even when both obviously don't commit. That is part of the other JIRA I was going to file, though, and if needed it can be split back apart at that point.
It does seem a bit odd to throw SPARK-13054 in with the other changes in the same PR, though.
by the way you should put SPARK-13054 in the title of this patch if you plan to do that here.
updating now, need to test and then will post updated version.
@tgravescs Have you verified that the changes here actually fix the issues you observed with dynamic allocation? I thought we needed to make sure the events are posted in the right order, in addition to making sure we always post a task end event for each task that ran; otherwise we'll still run into SPARK-11334.
hey @andrewor14. Yes, these changes fix the issue. It's really easy to reproduce if you want to test it out yourself; just check the instructions in the JIRA. Dynamic allocation is pretty broken right now with speculation on, at least if you want it to give executors back. Here I changed what was necessary (kind of the minimum) to make it work, and all the tests I ran passed. I ran many times because it's all timing dependent on the order in which the events come in and whether speculative tasks finish before the original, etc. Honestly the speculation code needs some cleanup and rework. I am going to file another JIRA for that, though. In my opinion, you can try to make things happen in the right order as much as possible, but this is a distributed system and you have to be able to handle things coming out of order.
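The out-of-order point above can be illustrated with a tiny sketch (not Spark code): a handler in a distributed system can tolerate duplicated or re-ordered completion messages by being idempotent, for example with a set of already-seen task ids.

```scala
import scala.collection.mutable

// Illustrative only: process each task's completion at most once, no matter
// how many times or in what order the messages arrive.
val seen = mutable.Set[Long]()
var processed = 0

def onCompletion(taskId: Long): Unit = {
  if (seen.add(taskId)) { // Set.add returns false if the id was already present
    processed += 1        // real code would do the accounting work here
  }
}

onCompletion(7L)
onCompletion(7L) // duplicate delivery: ignored
onCompletion(8L)
println(processed)
```

This is the general defensive pattern, not what this PR implements; the PR relies on the existing DAGScheduler handling, as discussed further below in the thread.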
Test build #51058 has finished for PR 10951 at commit
Jenkins, test this please
Test build #51099 has started for PR 10951 at commit
jenkins, test this please
Test build #51112 has finished for PR 10951 at commit
I have no idea why these tests are failing since they shouldn't be related to this change; I'll try to run them locally today.
Actually I see now that a couple of the failures are due to throwing the exception for commit denied, so I'll look at those tests more closely.
Since some of the unit tests are having issues with my change to throw a commit denied exception rather than ignore when needsTaskCommit=false, I'm removing that and we can handle it under SPARK-13343. Just sending out the TaskEnd event fixes the issue with the accounting being wrong.
Jenkins, test this please
Test build #51381 has finished for PR 10951 at commit
@andrewor14 this should be basically the same change you had made, with the addition of the test now.
LGTM. Let's quickly pass this by @kayousterhout before merging.
@kayousterhout any comments?
It's been 2 weeks on this; unless there are other comments I'd like to commit this. @andrewor14, any objections? Not sure this will cherry-pick back into 1.6 cleanly, but if not I'll put another pull request up for that.
Sorry for being insanely slow to look at this. I'm concerned about this code because of this line of code: we call taskEnded (which results in the code you modified getting called) when an executor is lost, for all of the tasks on that executor. I think it's possible, as a result, to get multiple task-end events for a particular task in theory (if messages get re-ordered), so I think this could result in multiple SparkListenerTaskEnd events for the same task. I didn't look at this super thoroughly, so let me know if you think this is a non-issue.
thanks for the feedback, I'll check into that.
So I don't think this PR changes that.
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala#L793
https://github.com/apache/spark/pull/10951/files#diff-6a9ff7fb74fd490a50462d45db2d5e11L1149
The only time it now sends the taskEnd that it didn't before is:
To add some more detail: the executorLost function only calls taskEnded (which is task failed) if the task was in the list of successful tasks. handleSuccessfulTask calls taskEnded and then adds the task to the list of successful tasks. Since taskEnded ends up sending the CompletionEvent, it is possible for the events to show up in either order. But DAGScheduler.handleTaskCompletion before this PR already handled both of those and sent SparkListenerTaskEnd in both cases. This PR doesn't change that behavior for those two events. The messages are success and resubmitted (which is failed), and DAGScheduler.handleTaskCompletion would send the SparkListenerTaskEnd event for both, with this PR and before this PR. If there is anything else you think I missed, let me know.
@@ -134,6 +134,7 @@ class DAGSchedulerSuite extends SparkFunSuite with LocalSparkContext with Timeouts {
   val successfulStages = new HashSet[Int]
   val failedStages = new ArrayBuffer[Int]
   val stageByOrderOfExecution = new ArrayBuffer[Int]
+  var endedTasks = new HashSet[Long]
can be a val, it's a mutable HashSet.
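The reviewer's point, in miniature: a `val` holding a mutable collection can still be mutated; only reassigning the reference is forbidden, so `var` buys nothing here.

```scala
import scala.collection.mutable.HashSet

// A val reference to a mutable HashSet can still be mutated in place.
val endedTasks = new HashSet[Long]
endedTasks += 42L                   // fine: mutating the collection itself
// endedTasks = new HashSet[Long]   // would not compile: reassignment of a val
println(endedTasks.contains(42L))
```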
some minor comments, otherwise lgtm
I'm merging this into master since
Test build #53086 has finished for PR 10951 at commit
I am using dynamic container allocation and speculation and am seeing issues with the active task accounting. The Executor UI still shows active tasks on an executor, but the job/stage is all completed. I think it's also affecting dynamic allocation being able to release containers because it thinks there are still tasks.

There are multiple issues with this:
- If the task end for tasks (in this case probably because of speculation) comes in after the stage is finished, then DAGScheduler.handleTaskCompletion will skip the task completion event

Author: Thomas Graves <tgraves@prevailsail.corp.gq1.yahoo.com>
Author: Thomas Graves <tgraves@staydecay.corp.gq1.yahoo.com>
Author: Tom Graves <tgraves@yahoo-inc.com>

Closes apache#10951 from tgravescs/SPARK-11701.