[SPARK-4196][SPARK-4602][Streaming] Fix serialization issue in PairDStreamFunctions.saveAsNewAPIHadoopFiles #3457

Closed
tdas wants to merge 2 commits from tdas/savefiles-fix

tdas
Contributor

@tdas tdas commented Nov 25, 2014

Solves two JIRAs in one shot

  • Makes the ForeachDStream created by saveAsNewAPIHadoopFiles serializable for checkpoints
  • Makes saveAsNewAPIHadoopFiles use Spark's Hadoop configuration as its default configuration object
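
As a rough illustration of the first point, here is a minimal, dependency-free Scala sketch of why wrapping the configuration helps. `FakeConf`, `SerializableConf`, and `SaveFunc` are hypothetical stand-ins (for a Hadoop `Configuration`, for `SerializableWritable`, and for the save function held by the DStream); none of this is the actual Spark code.

```scala
import java.io._
import scala.util.Try

object CheckpointSketch {

  // Hypothetical stand-in for a Hadoop Configuration: it knows how to write
  // itself to a stream (Writable-style) but is NOT java.io.Serializable.
  class FakeConf(var entries: Map[String, String]) {
    def write(out: DataOutput): Unit = {
      out.writeInt(entries.size)
      entries.foreach { case (k, v) => out.writeUTF(k); out.writeUTF(v) }
    }
    def readFields(in: DataInput): Unit = {
      val n = in.readInt()
      entries = (0 until n).map { _ =>
        val k = in.readUTF()
        val v = in.readUTF()
        k -> v
      }.toMap
    }
  }

  // Hypothetical analogue of SerializableWritable: Java serialization is
  // delegated to the wrapped object's own write/readFields methods.
  class SerializableConf(@transient var value: FakeConf) extends Serializable {
    private def writeObject(out: ObjectOutputStream): Unit = {
      out.defaultWriteObject()
      value.write(out)
    }
    private def readObject(in: ObjectInputStream): Unit = {
      in.defaultReadObject()
      value = new FakeConf(Map.empty)
      value.readFields(in)
    }
  }

  // Stand-in for the save function held by the DStream. Checkpointing
  // serializes it together with everything it references.
  class SaveFunc(val conf: AnyRef) extends Serializable

  // Java-serialize an object and read it back.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val conf = new FakeConf(Map("fs.defaultFS" -> "file:///"))

    // Referencing the raw configuration fails with NotSerializableException.
    println(Try(roundTrip(new SaveFunc(conf))).isFailure)

    // Referencing the wrapper succeeds, and the settings survive the round trip.
    val restored = roundTrip(new SaveFunc(new SerializableConf(conf)))
    println(restored.conf.asInstanceOf[SerializableConf].value.entries)
  }
}
```

The wrapper marks the wrapped value `@transient` and writes it by hand, so Java serialization never tries to serialize the non-serializable object directly.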

@tdas tdas changed the title from "[SPARK-4196][SPARK-4602] Fix serialization issue in PairDStreamFunctions.saveAsNewAPIHadoopFiles" to "[SPARK-4196][SPARK-4602][Streaming] Fix serialization issue in PairDStreamFunctions.saveAsNewAPIHadoopFiles" on Nov 25, 2014
) {
// Wrap this in SerializableWritable so that ForeachDStream can be serialized for checkpoints
val serializableConf = new SerializableWritable(conf)
Member

Ah you're already on it, yep, that looks good.

@srowen
Member

srowen commented Nov 25, 2014

Just had a thought -- will saveAsHadoopFiles and JobConf need the same treatment?

@tdas
Contributor Author

tdas commented Nov 25, 2014

That's exactly what I am testing now. Adding a unit test at the very least.

@SparkQA

SparkQA commented Nov 25, 2014

Test build #23842 has started for PR 3457 at commit b382ea9.

  • This patch merges cleanly.

@tdas
Contributor Author

tdas commented Nov 25, 2014

@srowen Yep, saveAsHadoopFiles had the same issue. Fixed it. Thanks for catching this bug!

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23842/

@tdas
Contributor Author

tdas commented Nov 25, 2014

Jenkins, test this please.

@SparkQA

SparkQA commented Nov 25, 2014

Test build #23843 has started for PR 3457 at commit bb4729a.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Nov 25, 2014

Test build #23843 has finished for PR 3457 at commit bb4729a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23843/

mengxr pushed a commit to mengxr/spark that referenced this pull request Nov 26, 2014
…treamFunctions.saveAsNewAPIHadoopFiles

Solves two JIRAs in one shot
- Makes the ForeachDStream created by saveAsNewAPIHadoopFiles serializable for checkpoints
- Makes saveAsNewAPIHadoopFiles use Spark's Hadoop configuration as its default configuration object

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes apache#3457 from tdas/savefiles-fix and squashes the following commits:

bb4729a [Tathagata Das] Same treatment for saveAsHadoopFiles
b382ea9 [Tathagata Das] Fix serialization issue in PairDStreamFunctions.saveAsNewAPIHadoopFiles.

(cherry picked from commit 8838ad7)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
@asfgit asfgit closed this in 8838ad7 Nov 26, 2014
asfgit pushed a commit that referenced this pull request Nov 26, 2014