-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-4759] Fix driver hanging from coalescing partitions #3633
Conversation
This is causing the TaskSetManager to try to schedule certain tasks on the host "" (empty string). The intended semantics here, however, is that the partition does not have preferred location, and the TSM should schedule the corresponding task in accordance.
Test build #24219 has started for PR 3633 at commit
|
Test build #24219 has finished for PR 3633 at commit
|
Test FAILed. |
Test build #24231 has started for PR 3633 at commit
|
Test build #24231 has finished for PR 3633 at commit
|
Test PASSed. |
def size = arr.size | ||
} | ||
|
||
private object PartitionGroup { | ||
def apply(prefLoc: String): PartitionGroup = PartitionGroup(Some(prefLoc)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this for backwards-compatibility? Can you have a case class with multiple constructors? If so, it might be nice to just add this to the PartitionGroup
class as a secondary constructor.
Also, what do you think about adding a require(prefLoc != "")
to guard against code that uses the old empty string technique?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mainly because you can instantiate case classes without the new
keyword. In fact this is how we instantiate instances of this particular class in this file. Adding a new constructor means we need to use the new
keyword to instantiate it, and I believe many users of case classes don't actually do that. (also this is a private class so this argument probably doesn't even matter at all)
Yeah I'll add the guard against empty string.
Left a couple of small comments, but this looks good to me overall. +1 for refactoring "stringly-typed" fields into better types. |
Test build #24283 has started for PR 3633 at commit
|
Test build #24283 has finished for PR 3633 at commit
|
Test FAILed. |
Test build #24292 has started for PR 3633 at commit
|
Test build #24292 has finished for PR 3633 at commit
|
Test PASSed. |
Alright I'm merging this into master and branch-1.1 thanks for the comments. I'll back port this into branch-1.2 later. |
The driver hangs sometimes when we coalesce RDD partitions. See JIRA for more details and reproduction. This is because our use of empty string as default preferred location in `CoalescedRDDPartition` causes the `TaskSetManager` to schedule the corresponding task on host `""` (empty string). The intended semantics here, however, is that the partition does not have a preferred location, and the TSM should schedule the corresponding task accordingly. Author: Andrew Or <andrew@databricks.com> Closes #3633 from andrewor14/coalesce-preferred-loc and squashes the following commits: e520d6b [Andrew Or] Oops 3ebf8bd [Andrew Or] A few comments f370a4e [Andrew Or] Fix tests 2f7dfb6 [Andrew Or] Avoid using empty string as default preferred location (cherry picked from commit 4f93d0c) Signed-off-by: Andrew Or <andrew@databricks.com>
The driver hangs sometimes when we coalesce RDD partitions. See JIRA for more details and reproduction. This is because our use of empty string as default preferred location in `CoalescedRDDPartition` causes the `TaskSetManager` to schedule the corresponding task on host `""` (empty string). The intended semantics here, however, is that the partition does not have a preferred location, and the TSM should schedule the corresponding task accordingly. Author: Andrew Or <andrew@databricks.com> Closes #3633 from andrewor14/coalesce-preferred-loc and squashes the following commits: e520d6b [Andrew Or] Oops 3ebf8bd [Andrew Or] A few comments f370a4e [Andrew Or] Fix tests 2f7dfb6 [Andrew Or] Avoid using empty string as default preferred location (cherry picked from commit 4f93d0c) Signed-off-by: Andrew Or <andrew@databricks.com>
The driver hangs sometimes when we coalesce RDD partitions. See JIRA for more details and reproduction.
This is because our use of empty string as default preferred location in
CoalescedRDDPartition
causes theTaskSetManager
to schedule the corresponding task on host""
(empty string). The intended semantics here, however, is that the partition does not have a preferred location, and the TSM should schedule the corresponding task accordingly.