SPARK-2634: Change MapOutputTrackerWorker.mapStatuses to ConcurrentHashMap #1541

Closed
wants to merge 1 commit into from

Conversation

zsxwing
Member

@zsxwing zsxwing commented Jul 23, 2014

MapOutputTrackerWorker.mapStatuses is accessed concurrently, so it should be thread-safe. This bug has already been fixed in #1328. However, since #1328 won't be merged soon, I'm sending this trivial fix in the hope that the issue can be resolved quickly.
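
For context, the essence of the change is to back mapStatuses with a java.util.concurrent.ConcurrentHashMap instead of a plain mutable HashMap. A minimal sketch of the idea, with MapStatus stubbed out; the actual patch may expose the Java map to Scala code differently:

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.collection.JavaConverters._
import scala.collection.mutable

// Hypothetical stand-in for org.apache.spark.scheduler.MapStatus.
case class MapStatus(location: String, size: Long)

class MapOutputTrackerWorkerSketch {
  // Before (unsafe under concurrent get/put):
  //   protected val mapStatuses = new mutable.HashMap[Int, Array[MapStatus]]
  // After: a ConcurrentHashMap exposed through Scala's concurrent.Map wrapper,
  // so existing callers keep using the mutable.Map interface.
  protected val mapStatuses: mutable.Map[Int, Array[MapStatus]] =
    new ConcurrentHashMap[Int, Array[MapStatus]]().asScala
}
```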

@SparkQA

SparkQA commented Jul 23, 2014

QA tests have started for PR 1541. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17011/consoleFull

@SparkQA

SparkQA commented Jul 23, 2014

QA results for PR 1541:
- This patch FAILED unit tests.
- This patch merges cleanly.
- This patch adds no public classes.

For more information, see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17011/consoleFull

@SparkQA

SparkQA commented Jul 23, 2014

QA tests have started for PR 1541. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17012/consoleFull

@SparkQA

SparkQA commented Jul 23, 2014

QA results for PR 1541:
- This patch PASSES unit tests.
- This patch merges cleanly.
- This patch adds no public classes.

For more information, see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17012/consoleFull

@mridulm
Contributor

mridulm commented Jul 23, 2014

Instead of a ConcurrentHashMap, we should actually move this to a disk-backed map. Cleanup of this data structure is painful, and it can become extremely large, particularly for iterative algorithms.
Fortunately, in most cases we only need the last few entries, so the LRU scheme used by most disk-backed maps works beautifully.

We have been using MapDB for this in MapOutputTrackerWorker, and it has worked beautifully.
@rxin might be particularly interested since he is looking into reducing the memory footprint of Spark.
CC @mateiz: this is what I had mentioned earlier.
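
The LRU idea can be illustrated without relying on MapDB's specific API. A minimal in-memory sketch, assuming only the standard library; a disk-backed store such as MapDB would spill evicted entries to disk rather than drop them:

```scala
import java.util.{Collections, LinkedHashMap => JLinkedHashMap, Map => JMap}

object LruSketch {
  // Access-ordered LRU map: once maxEntries is exceeded, the least recently
  // used shuffle's statuses are evicted on the next insert.
  def lruCache[K, V](maxEntries: Int): JMap[K, V] =
    Collections.synchronizedMap(
      new JLinkedHashMap[K, V](16, 0.75f, /* accessOrder = */ true) {
        override def removeEldestEntry(eldest: JMap.Entry[K, V]): Boolean =
          size() > maxEntries
      })
}
```

Entries beyond the most recently used shuffles age out automatically, which matches the access pattern described above.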

@zsxwing
Member Author

zsxwing commented Jul 23, 2014

Instead of a ConcurrentHashMap, we should actually move it to a disk backed Map

Agree. Is there a PR ready? I think this is a critical bug and hope it can be fixed soon.

@kayousterhout
Contributor

When is this accessed concurrently? I looked quickly and can only find updates from the (single-threaded) DAGScheduler event loop. Is the issue that it can be read/written concurrently?

@zsxwing
Member Author

zsxwing commented Jul 25, 2014

When is this accessed concurrently?

For example: HashShuffleReader.read -> BlockStoreShuffleFetcher.fetch (a singleton object) -> MapOutputTracker.getServerStatuses. Different HashShuffleReader instances can be used by different tasks, and all TaskRunners share the same Executor env. Therefore, all tasks use the same MapOutputTracker instance from the SparkEnv.
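
To make the sharing concrete, here is a rough structural sketch; Env, Tracker, and fetch are hypothetical stand-ins for SparkEnv, MapOutputTracker, and the shuffle read path, not Spark's actual code:

```scala
import scala.collection.mutable

// One Env per executor process; every task thread reaches the same Tracker
// (and thus the same unsynchronized mapStatuses) through it.
class Tracker {
  val mapStatuses = new mutable.HashMap[Int, Array[Long]]
}

class Env(val tracker: Tracker)

object SharingSketch {
  val env = new Env(new Tracker)

  // Each shuffle-reader-like task ends up here on its own thread, so its
  // reads race with writes performed by other tasks' fetches.
  def fetch(shuffleId: Int): Option[Array[Long]] =
    env.tracker.mapStatuses.get(shuffleId)
}
```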

@zsxwing
Member Author

zsxwing commented Sep 1, 2014

ping @JoshRosen, could you help take a look at this one?

@JoshRosen
Contributor

Thanks for the reminder.

@kayousterhout I looked over @zsxwing's example and I agree that there's a thread-safety issue here. We can definitely have multiple concurrent block fetches that could race when accessing mapStatuses.

There's a lot of other state in MapOutputTracker that's guarded with synchronized, which implies that instances of MapOutputTracker will be accessed from multiple threads. In fact, there's even a statuses.synchronized at the end of getServerStatuses that's guarding a MapOutputTracker.convertMapStatuses call, but for some reason the other branch of the if guards it using fetchedStatuses.synchronized (which doesn't even make sense, since fetchedStatuses is a local variable defined inside of getServerStatuses).

Since the synchronization logic here seems kind of messy / confusing and mapStatuses is only accessed from MapOutputTracker, maybe it would be better to just add proper synchronization around reads/writes to mapStatuses rather than converting it to a ConcurrentHashMap.
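
A sketch of the alternative being suggested here, guarding every read and write with the tracker's own monitor instead of changing the map type (illustrative names, not the actual Spark code):

```scala
import scala.collection.mutable

class SynchronizedTrackerSketch {
  private val mapStatuses = new mutable.HashMap[Int, Array[Long]]

  // All access funnels through methods that lock the same monitor,
  // so a get can never observe the map mid-rehash.
  def get(shuffleId: Int): Option[Array[Long]] = this.synchronized {
    mapStatuses.get(shuffleId)
  }

  def put(shuffleId: Int, statuses: Array[Long]): Unit = this.synchronized {
    mapStatuses(shuffleId) = statuses
  }

  def remove(shuffleId: Int): Unit = this.synchronized {
    mapStatuses -= shuffleId
  }
}
```

The trade-off versus a ConcurrentHashMap is that this single lock serializes all readers, whereas a ConcurrentHashMap lets reads proceed concurrently.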

@JoshRosen
Contributor

Actually, it looks like the fetchedStatuses vs statuses synchronization is correct, since it's guarding against modification to that statuses array while reading it in convertMapStatuses. This needs a closer look, but I'm not sure whether we need this synchronization, since the output status for a particular map task should be immutable once written.

@zsxwing
Member Author

zsxwing commented Sep 2, 2014

the output status for a particular map task should be immutable once written.

But mapStatuses itself is mutable. If some thread is putting an item into mapStatuses and the map does not have enough space for the new item, the map will grow its table, rehash the items, and update its internal state (https://github.com/scala/scala/blob/2.11.x/src/library/scala/collection/mutable/HashTable.scala#L249). If another thread calls get at the same time, it may get a wrong value or crash the thread.
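
This failure mode is easy to provoke with a small stress test. In the sketch below (illustrative loop counts and value types), one thread keeps inserting, which forces rehashes, while another reads; with mutable.HashMap this can return wrong values, throw, or hang, whereas swapping in the commented-out ConcurrentHashMap variant is safe.

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.collection.JavaConverters._
import scala.collection.mutable

object RehashRaceSketch {
  def main(args: Array[String]): Unit = {
    val statuses = new mutable.HashMap[Int, Int]
    // val statuses = new ConcurrentHashMap[Int, Int]().asScala  // safe variant

    val writer = new Thread(new Runnable {
      def run(): Unit = (1 to 1000000).foreach(i => statuses(i) = i)  // triggers rehashes
    })
    val reader = new Thread(new Runnable {
      def run(): Unit = (1 to 1000000).foreach(i => statuses.get(i))  // races with the rehash
    })
    writer.start(); reader.start()
    writer.join(); reader.join()
  }
}
```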

@JoshRosen
Contributor

I agree that mapStatuses itself is mutable. I was just observing that the values stored in mapStatuses (the Array[MapStatus]es) aren't modified after they're stored in the HashMap.

@zsxwing
Member Author

zsxwing commented Sep 2, 2014

You remind me that even if the Array[MapStatus] won't be modified, according to the Java memory model, fetchedStatuses (https://github.com/zsxwing/spark/blob/SPARK-2634/core/src/main/scala/org/apache/spark/MapOutputTracker.scala#L168) and statuses (https://github.com/zsxwing/spark/blob/SPARK-2634/core/src/main/scala/org/apache/spark/MapOutputTracker.scala#L186) may still be read in an inconsistent state without proper protection.

@zsxwing
Member Author

zsxwing commented Sep 2, 2014

ConcurrentHashMap should fix all these issues I found.
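
One reason the ConcurrentHashMap route also covers the memory-model concern: java.util.concurrent collections guarantee that actions in a thread prior to placing an object into the map happen-before actions subsequent to retrieving it in another thread, so a reader that sees the Array[MapStatus] reference also sees its fully written contents. A small sketch with illustrative names and value type:

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.collection.JavaConverters._

object PublicationSketch {
  private val statuses = new ConcurrentHashMap[Int, Array[Long]]().asScala

  def publish(shuffleId: Int): Unit = {
    val arr = Array.fill(8)(42L)   // fully constructed before publication
    statuses.put(shuffleId, arr)   // put() is the publication point
  }

  def read(shuffleId: Int): Option[Array[Long]] =
    statuses.get(shuffleId)        // if defined, the array's contents are visible
}
```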

@zsxwing
Member Author

zsxwing commented Sep 18, 2014

@JoshRosen do you think it's OK?

@JoshRosen
Contributor

This seems fine to me, actually, so I'm going to re-run the tests then merge it. We might want to do some other cleanup in this file, but that can wait for a separate PR.

@SparkQA

SparkQA commented Sep 25, 2014

QA tests have started for PR 1541 at commit d450053.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 25, 2014

QA tests have finished for PR 1541 at commit d450053.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@asfgit asfgit closed this in 86bce76 Sep 26, 2014
@zsxwing zsxwing deleted the SPARK-2634 branch September 26, 2014 01:29
@zsxwing
Member Author

zsxwing commented Sep 26, 2014

Thank you @JoshRosen
