
[SPARK-4664][Core] Throw an exception when spark.akka.frameSize > 2047 #3527

Closed
wants to merge 1 commit into from

Conversation

zsxwing
Member

@zsxwing zsxwing commented Dec 1, 2014

If spark.akka.frameSize > 2047, the converted byte count overflows and becomes negative. We should have an assertion in maxFrameSizeBytes to warn people.
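For context, spark.akka.frameSize is given in MB and converted to a byte count held in a 32-bit Int, so any value above 2047 wraps past Int.MaxValue. A minimal Scala sketch of the overflow and the kind of guard being proposed (names and message are illustrative, not the exact patch):

```scala
// Illustrative sketch, not the exact patch: the frame size is in MB and the
// byte count is an Int, so 2048 MB wraps around to a negative value.
val frameSizeInMB = 2048
val overflowed: Int = frameSizeInMB * 1024 * 1024 // -2147483648

// A defensive conversion can fail fast instead of silently overflowing.
def maxFrameSizeBytes(frameSizeInMB: Int): Int = {
  val maxMB = 2047 // largest value whose byte count still fits in an Int
  require(frameSizeInMB <= maxMB,
    s"spark.akka.frameSize should not be greater than $maxMB MB")
  frameSizeInMB * 1024 * 1024
}
```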

@JoshRosen
Contributor

Nice catch. I don't think that it's very common to set spark.akka.frameSize these days, since 1.1's task broadcasting should have addressed the most common causes of messages that exceeded the frame size, but it certainly doesn't hurt to warn / guard against bad inputs.

@SparkQA

SparkQA commented Dec 1, 2014

Test build #23973 has started for PR 3527 at commit 0089c7a.

  • This patch merges cleanly.

@zsxwing
Member Author

zsxwing commented Dec 1, 2014

A case where users may still need to set spark.akka.frameSize is when the serialized MapStatuses exceed the frame size, e.g., with a large number of mappers and reducers.

It was asked on Stack Overflow: http://stackoverflow.com/questions/26904619/apache-spark-message-understanding

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23971/
Test FAILed.

@JoshRosen
Contributor

Jenkins, retest this please.

@SparkQA

SparkQA commented Dec 1, 2014

Test build #23974 has started for PR 3527 at commit 0089c7a.

  • This patch merges cleanly.

@aarondav
Contributor

aarondav commented Dec 1, 2014

@zsxwing Note that the case you mentioned should no longer cause this issue either, as we use an extra compressed data structure when dealing with very large numbers of map partitions.

@zsxwing
Member Author

zsxwing commented Dec 1, 2014

@zsxwing Note that the case you mentioned should no longer cause this issue either, as we use an extra compressed data structure when dealing with very large numbers of map partitions.

In an extreme case, it's still possible. For example, assume there are 10000 partitions on the map side. If the user does not set a new numPartition, there will be 10000 reducers. If none of these block sizes is 0, the MapStatuses will be huge: 10000 * 10000 * 1 byte ≈ 100MB. I'm not sure what the compression ratio of GZIPOutputStream will be, but the result may still exceed spark.akka.frameSize.

Admittedly, this might be a user mistake and the user should set a proper numPartition.
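Back-of-envelope arithmetic behind the numbers above, assuming roughly one byte of size information per (map, reduce) block before compression (a sketch, not actual Spark code):

```scala
// Rough estimate from the comment above: 10000 mappers x 10000 reducers,
// about 1 byte of size information per block before compression.
val mappers   = 10000L
val reducers  = 10000L
val bytes     = mappers * reducers * 1L     // 100,000,000 bytes
val megabytes = bytes / (1024.0 * 1024.0)   // ~95 MB, i.e. roughly 100MB
```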

@sryza
Contributor

sryza commented Dec 1, 2014

10000 partitions doesn't sound that extreme to me.

@SparkQA

SparkQA commented Dec 1, 2014

Test build #23973 has finished for PR 3527 at commit 0089c7a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23973/
Test PASSed.

@aarondav
Contributor

aarondav commented Dec 1, 2014

I believe it is only 1 bit, not byte, per block. Further, I would estimate compression on largely uniform data to be at least around 10x, so your example would ideally only use around 1.2MB.

Anyway, you can arbitrarily multiply the number of partitions to demonstrate the issue. 1mil by 1mil is still a tough cookie to crack, but we don't really want users to have to meddle with frame sizes.

Having this check is fine, of course, whether or not users should have to change it.
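The same estimate redone at 1 bit per block and roughly 10x compression, as described above (again just the arithmetic, not Spark code):

```scala
// Redoing the estimate at 1 bit per block: 10000 x 10000 blocks.
val blocks         = 10000L * 10000L
val uncompressedMB = blocks / 8.0 / (1024 * 1024) // ~11.9 MB at 1 bit per block
val compressedMB   = uncompressedMB / 10.0        // ~1.2 MB assuming ~10x compression
```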

@zsxwing
Member Author

zsxwing commented Dec 1, 2014

I believe it is only 1 bit, not byte, per block

Thank you for correcting me. I was not aware of HighlyCompressedMapStatus.

@SparkQA

SparkQA commented Dec 1, 2014

Test build #23974 has finished for PR 3527 at commit 0089c7a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23974/
Test PASSed.

@rxin
Contributor

rxin commented Dec 1, 2014

Yeah, as @aarondav pointed out, I don't think the Akka frame size is going to be a problem anymore in 1.2+, regardless of the number of partitions. Still, it's good to have this check to be defensive.

@rxin
Contributor

rxin commented Dec 1, 2014

Merging in master.

@asfgit asfgit closed this in 1d238f2 Dec 1, 2014
@zsxwing zsxwing deleted the SPARK-4664 branch December 1, 2014 08:38