job killing in Spark #935
Conversation
@@ -812,6 +813,38 @@ class SparkContext(
    result
  }

  def submitJob[T, U, R](
Certainly not unique to this method, but this is another example of something that is publicly accessible in a SparkContext that probably shouldn't be. Is there any plan to provide a more restricted SparkContext API (for use, e.g., in the shell), or are we going to continue to assume that if we don't tell people about something, then they won't misuse it?
Actually, why wouldn't we want this to be accessible? It is at the same level as runJob.
That's true, and we certainly want custom RDD developers and the like to be able to access runJob etc. My question, though, is whether we really want all users of, e.g., spark-shell to be able to call runJob, submitJob, etc. directly. It's something of a question of whether spark-shell is targeted at developers (who want to be able to do anything in the repl that they could in standalone code), or whether it is targeted at end users who just want/need to be able to do the kinds of higher-level things with RDDs that we typically show in examples. Maybe the issue isn't so much that, within SparkContext itself, there needs to be another access level for methods as it is that we could use another kind of interactive tool/UI (something like IPython Notebooks, perhaps...) that doesn't have full access to all the SparkContext methods that many users shouldn't really be messing with directly.
* without a JobSubmitted event.
* Submit a job to the job scheduler and get a JobWaiter object back. The JobWaiter object
* can be used to block until the job finishes executing or can be used to kill the job.
* If the given RDD does not contain any partitions, the function returns None.
I don't understand this last sentence.
Yea it's an old comment. Need to change that.
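For context, the doc comment above describes a handle that can either block until the job finishes or kill it. A minimal, self-contained sketch of that idea (the names and structure below are illustrative assumptions for this sketch, not Spark's actual JobWaiter):

    import java.util.concurrent.CountDownLatch

    // Toy stand-in for a JobWaiter-style handle: the work runs on a background
    // thread, awaitResult() blocks until it is done, and kill() sets a flag.
    // Here the flag is only checked before the work starts; the rest of this PR
    // is about checking such a flag continuously while a task runs.
    class ToyJobWaiter[R](compute: () => R) {
      @volatile private var killed = false
      @volatile private var result: Option[R] = None
      private val done = new CountDownLatch(1)

      private val worker = new Thread(() => {
        try {
          if (!killed) result = Some(compute())
        } finally {
          done.countDown()
        }
      })
      worker.start()

      /** Block until the job finishes; None if no result was produced. */
      def awaitResult(): Option[R] = { done.await(); result }

      /** Request that the job stop. */
      def kill(): Unit = { killed = true }
    }

    object ToyJobWaiterExample extends App {
      val waiter = new ToyJobWaiter(() => (1 to 1000000).map(_.toLong).sum)
      println(waiter.awaitResult())   // Some(500000500000)
      // Another thread could call waiter.kill() to request cancellation instead.
    }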
Merged build triggered.
Merged build started.
Merged build finished.
One or more automated tests failed
    } else {
      // Don't apply map-side combiner.
      // A sanity check to make sure mergeCombiners is not defined.
      assert(mergeCombiners == null)
      val values = new ShuffledRDD[K, V, (K, V)](self, partitioner).setSerializer(serializerClass)
      values.mapPartitions(aggregator.combineValuesByKey, preservesPartitioning = true)
        .interruptible()
    }
  }
Which means that you need this in PairRDDFunctionsSuite:
- assert(deps.size === 2) // ShuffledRDD, ParallelCollection
+ assert(deps.size === 3) // ShuffledRDD, ParallelCollection, InterruptibleRDD
Yea I know. I'm intentionally not changing the tests until I am happy with my design.
Heh, okay. But so far, that's the only test you've broken.
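As a rough illustration of where the extra dependency comes from: interruptible() wraps the mapped RDD in one more pass-through RDD, which adds a OneToOneDependency to the lineage. A sketch of such a wrapper is below; it is illustrative only, not the PR's actual InterruptibleRDD, just the minimal shape that explains the changed count:

    import scala.reflect.ClassTag

    import org.apache.spark.{Partition, TaskContext}
    import org.apache.spark.rdd.RDD

    // Pass-through wrapper: same data and partitions as its parent, but one more
    // RDD (and one more dependency) in the lineage, which is why deps.size goes
    // from 2 to 3 in the test above. The real wrapper would also make the
    // returned iterator check the task's kill flag.
    class SketchInterruptibleRDD[T: ClassTag](prev: RDD[T]) extends RDD[T](prev) {

      override protected def getPartitions: Array[Partition] = firstParent[T].partitions

      override def compute(split: Partition, context: TaskContext): Iterator[T] =
        firstParent[T].iterator(split, context)   // the real version wraps this iterator
    }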
Reynold, did you run the spark-perf tests on this? Would be good to check that it adds no overhead.
Just ran some perf tests. Didn't notice any performance degradation with this turned on; timings were posted for a run without job killing and for two runs with job killing.
Moving this one to https://github.com/apache/incubator-spark/pull/29
This subsumes pull request #665 by @harsha2010.
Different from #665, this PR does not use interrupts to kill tasks, because interrupts rely on the user code handling them properly. One big counterexample: the Hadoop client will mark a data node as dead if it sees an interrupt during a read. Instead, this pull request adds a new volatile variable, "interrupted", to the TaskContext class and relies on the upper-level code to check that flag. This can be done by simply wrapping an existing iterator with an InterruptibleIterator.
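A minimal, self-contained sketch of that approach, with toy stand-ins for TaskContext and InterruptibleIterator (names and details are illustrative, not the actual Spark classes): the scheduler flips a volatile flag, and a thin iterator wrapper checks it on every record, so the task stops promptly without any thread interrupts.

    // Toy model of the flag-plus-wrapper approach described above.
    class ToyTaskContext {
      @volatile var interrupted: Boolean = false   // set by the scheduler on a kill request
    }

    class ToyInterruptibleIterator[T](context: ToyTaskContext, delegate: Iterator[T])
      extends Iterator[T] {

      override def hasNext: Boolean = {
        // Checked once per record; no Thread.interrupt() is involved, so libraries
        // like the Hadoop client never see an interrupt mid-read.
        if (context.interrupted) throw new RuntimeException("task killed")
        delegate.hasNext
      }

      override def next(): T = delegate.next()
    }

    object ToyInterruptibleIteratorExample extends App {
      val ctx = new ToyTaskContext
      val it = new ToyInterruptibleIterator(ctx, Iterator.from(1))
      println(it.take(3).toList)   // List(1, 2, 3)
      ctx.interrupted = true       // what a kill request would do
      // Any further hasNext call now throws, ending the task promptly.
    }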
Todos: