[SPARK-5441][pyspark] Make SerDeUtil PairRDD to Python conversions more robust #4236

ghost · 2015-01-28T00:19:41Z

SerDeUtil.pairRDDToPython and SerDeUtil.pythonToPairRDD now both support empty RDDs by checking the result of take(1) instead of calling first, which throws an exception when the RDD is empty.
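
A minimal sketch of the idea, written in Python rather than the Scala of the actual patch. `FakeRDD` and `probe_first` are hypothetical names used only for illustration; they are not part of Spark, and the real change lives in SerDeUtil. The sketch assumes only that take(1) returns an empty result for an empty RDD while first raises.

```python
class FakeRDD:
    """Hypothetical in-memory stand-in for an RDD (not a Spark class)."""

    def __init__(self, items):
        self._items = list(items)

    def take(self, n):
        # Like RDD.take: returns at most n elements, possibly none.
        return self._items[:n]

    def first(self):
        # Like RDD.first: fails when there is nothing to return.
        taken = self.take(1)
        if not taken:
            raise ValueError("RDD is empty")
        return taken[0]


def probe_first(rdd):
    """Return the first element to inspect, or None for an empty RDD."""
    taken = rdd.take(1)
    return taken[0] if taken else None


if __name__ == "__main__":
    print(probe_first(FakeRDD([("k", "v")])))
    print(probe_first(FakeRDD([])))
```

With this shape the caller can treat the empty case as "nothing to check" instead of handling an exception. One limitation of the sketch: it cannot distinguish an empty RDD from one whose first element is None.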

AmplabJenkins · 2015-01-28T00:22:11Z

Can one of the admins verify this patch?

pwendell · 2015-01-28T02:16:34Z

Hey thanks for this - mind adding a regression test that fails on the old code?

ghost · 2015-01-28T19:58:13Z

I've added two regression tests which I made sure failed beforehand and succeed now.

JoshRosen · 2015-01-28T20:17:18Z

Jenkins, this is ok to test.

SparkQA · 2015-01-28T20:22:44Z

Test build #26243 has started for PR 4236 at commit a531c0c.

This patch merges cleanly.

SparkQA · 2015-01-28T21:33:00Z

Test build #26243 has finished for PR 4236 at commit a531c0c.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2015-01-28T21:33:04Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26243/

JoshRosen · 2015-01-28T21:53:54Z

Thanks for adding tests. This looks good to me, so I'm going to merge it into master (1.3.0) and mark it for later backport into branch-1.2 (I'd commit it now, but we're in the middle of the 1.2.1 voting period right now, so we've placed a hold on merging in that branch until the vote passes).

JoshRosen · 2015-01-28T21:54:47Z

core/src/test/scala/org/apache/spark/api/python/SerDeUtilSuite.scala

+
+import java.io.{ByteArrayOutputStream, DataOutputStream}
+
+import org.apache.spark.{SharedSparkContext, SparkContext}


Minor nit: we usually place the Spark imports in their own section, separate from third-party library imports like Scalatest. I'll just fix this myself on merge, but I thought I'd mention it for future patches.

…re robust

SerDeUtil.pairRDDToPython and SerDeUtil.pythonToPairRDD now both support empty RDDs by checking the result of take(1) instead of calling first which throws an exception.

Author: Michael Nazario <mnazario@palantir.com>

Closes #4236 from mnazario/feature/empty-first and squashes the following commits:

a531c0c [Michael Nazario] Added regression tests for SPARK-5441
e3b2fb6 [Michael Nazario] Added acceptance of the empty case

JoshRosen · 2015-02-17T00:39:24Z

I've cherry-picked this to branch-1.2 (1.2.2).


e3b2fb6 Added acceptance of the empty case
a531c0c Added regression tests for SPARK-5441

JoshRosen reviewed Jan 28, 2015

asfgit closed this in e023112 Jan 28, 2015
