SPARK-4968: takeOrdered to skip reduce step in case mappers return no partitions #3830

Closed
saucam wants to merge 1 commit

saucam commented Dec 29, 2014

takeOrdered should skip the reduce step when the mapped RDD has no partitions. This prevents the following exception:

  1. Run the query:
    SELECT * FROM testTable WHERE market = 'market2' ORDER BY End_Time DESC LIMIT 100;
    Error trace:
    java.lang.UnsupportedOperationException: empty collection
    at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:863)
    at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:863)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.reduce(RDD.scala:863)
    at org.apache.spark.rdd.RDD.takeOrdered(RDD.scala:1136)
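
The failure mode and the guard described above can be modeled outside Spark. The following is a minimal plain-Python sketch, not the patch itself: `take_ordered` and its list-of-lists `partitions` argument are hypothetical stand-ins for the RDD method and its partitions. Python's `functools.reduce` raises `TypeError` on an empty sequence, which plays the role of the `UnsupportedOperationException: empty collection` in the trace.

```python
import heapq
from functools import reduce


def take_ordered(partitions, num):
    """Return the `num` smallest elements across `partitions`, ascending.

    `partitions` is a list of lists, a hypothetical stand-in for an RDD's
    partitions; this is an illustration, not Spark's implementation.
    """
    if num == 0:
        return []
    # Map step: each partition contributes its own `num` smallest elements.
    mapped = [heapq.nsmallest(num, part) for part in partitions]
    # Guard: with zero partitions there is nothing to reduce, and a reduce
    # without an initial value fails on an empty sequence.
    if not mapped:
        return []
    # Reduce step: merge partial results pairwise, keeping the best `num`.
    return reduce(lambda a, b: heapq.nsmallest(num, a + b), mapped)


if __name__ == "__main__":
    print(take_ordered([[5, 1], [4, 2, 3]], 3))
    print(take_ordered([], 3))
```

Without the `if not mapped` check, the second call would reach `reduce` with an empty list and raise, which mirrors a query whose filter leaves no partitions to scan.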

@AmplabJenkins

Can one of the admins verify this patch?

rxin (Contributor) commented Dec 29, 2014

Jenkins, test this please.

SparkQA commented Dec 29, 2014

Test build #24867 has started for PR 3830 at commit 5974d10.

  • This patch merges cleanly.

SparkQA commented Dec 29, 2014

Test build #24867 has finished for PR 3830 at commit 5974d10.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24867/

rxin (Contributor) commented Dec 29, 2014

Merging in master & branch-1.2. Thanks.

asfgit closed this in 9bc0df6 Dec 29, 2014
asfgit pushed a commit that referenced this pull request Dec 29, 2014
… partitions

Author: Yash Datta <Yash.Datta@guavus.com>

Closes #3830 from saucam/fix_takeorder and squashes the following commits:

5974d10 [Yash Datta] SPARK-4968: takeOrdered to skip reduce step in case mappers return no partitions

(cherry picked from commit 9bc0df6)
Signed-off-by: Reynold Xin <rxin@databricks.com>