[SPARK-8138] [SQL] Improves error message when conflicting partition columns are found #6610

Closed
wants to merge 4 commits into from

liancheng
Contributor

This PR improves the error message shown when conflicting partition column names are detected. Such conflicts can be particularly annoying and confusing when there are a large number of partitions and only a handful of them happen to contain unexpected temporary files. All suspicious directories are now listed, as below:

java.lang.AssertionError: assertion failed: Conflicting partition column names detected:

        Partition column name list #0: b, c, d
        Partition column name list #1: b, c
        Partition column name list #2: b

For partitioned table directories, data files should only live in leaf directories. Please check the following directories for unexpected files:

        file:/tmp/foo/b=0
        file:/tmp/foo/b=1
        file:/tmp/foo/b=1/c=1
        file:/tmp/foo/b=0/c=0
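
To illustrate why the directories above produce conflicting lists, here is a minimal sketch of how partition column names could be derived from directory paths. It is not the code in this PR: `PartitionColumnNames` and `columnNames` are hypothetical names, paths are plain strings rather than Hadoop `Path` objects, and the `d=0` directory is an assumed example of a proper leaf directory.

```scala
object PartitionColumnNames {
  // Hypothetical helper: collects the column name from every `name=value`
  // segment of a directory path, in the order the segments appear.
  def columnNames(path: String): Seq[String] =
    path.split("/").toSeq
      .filter(_.contains("="))
      .map(segment => segment.takeWhile(_ != '='))

  def main(args: Array[String]): Unit = {
    // Directories that contain data files. Only the last one is a leaf
    // directory; the first two are assumed to hold stray files.
    val dirsWithFiles = Seq(
      "file:/tmp/foo/b=0",
      "file:/tmp/foo/b=0/c=0",
      "file:/tmp/foo/b=0/c=0/d=0")

    val distinctLists = dirsWithFiles.map(columnNames).distinct
    distinctLists.foreach(names => println(names.mkString(", ")))

    // More than one distinct list means the layout is inconsistent.
    println(s"conflict: ${distinctLists.size > 1}")
  }
}
```

When files live only in leaf directories, every directory yields the same column name list, so a single distinct list is the healthy case.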

@liancheng
Contributor Author

cc @rxin

@SparkQA

SparkQA commented Jun 3, 2015

Test build #34081 has finished for PR 6610 at commit c77fa4e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • trait TypeCheckResult
    • case class TypeCheckFailure(message: String) extends TypeCheckResult
    • abstract class UnaryArithmetic extends UnaryExpression
    • case class UnaryMinus(child: Expression) extends UnaryArithmetic
    • case class Sqrt(child: Expression) extends UnaryArithmetic
    • case class Abs(child: Expression) extends UnaryArithmetic
    • case class BitwiseNot(child: Expression) extends UnaryArithmetic
    • case class MaxOf(left: Expression, right: Expression) extends BinaryArithmetic
    • case class MinOf(left: Expression, right: Expression) extends BinaryArithmetic
    • case class Atan2(left: Expression, right: Expression)
    • case class Hypot(left: Expression, right: Expression)
    • case class EqualTo(left: Expression, right: Expression) extends BinaryComparison

@yhuai
Contributor

yhuai commented Jun 5, 2015

Let's create a jira for this.

@liancheng liancheng changed the title [SQL] [Minor] Improves error message when conflicting partition columns are found [SPARK-8138] [SQL] Improves error message when conflicting partition columns are found Jun 6, 2015
@liancheng
Contributor Author

Filed SPARK-8138 for this and updated PR title.

@liancheng
Contributor Author

@yhuai As discussed offline, we now give a more descriptive and helpful message, with a list of all suspicious non-leaf partition directories.

@SparkQA

SparkQA commented Jun 7, 2015

Test build #34389 has finished for PR 6610 at commit afbdb14.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

})
val distinctPartColNames = pathsWithPartitionValues.map(_._2.columnNames).distinct

def listConflictingPartitionColumns: String = {
Contributor

move this out, make it accept arguments in the form of collections, and write unit test for this function.

this function is way too complicated to not have unit tests.
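
One possible shape for the requested refactoring, as a rough sketch only: a standalone function that takes plain collections and can be checked without touching a file system. The object name `ConflictMessage`, the `(path, column names)` pair representation, and the rule that directories with a shorter-than-longest column list are the suspicious ones are all assumptions made for illustration, not the code that was merged.

```scala
object ConflictMessage {
  // Hypothetical standalone version: accepts (directory, partition column names)
  // pairs as plain collections so it can be unit tested in isolation.
  def listConflictingPartitionColumns(
      pathsWithColumnNames: Seq[(String, Seq[String])]): String = {
    val distinctNameLists = pathsWithColumnNames.map(_._2).distinct

    val nameListLines = distinctNameLists.zipWithIndex.map { case (names, index) =>
      s"\tPartition column name list #$index: ${names.mkString(", ")}"
    }

    // Assumption: a directory whose column list is shorter than the longest
    // one is a non-leaf directory that unexpectedly contains files.
    val maxDepth = distinctNameLists.map(_.length).foldLeft(0)((a, b) => math.max(a, b))
    val suspiciousPaths = pathsWithColumnNames.collect {
      case (path, names) if names.length < maxDepth => s"\t$path"
    }

    val header = Seq("Conflicting partition column names detected:", "")
    val advice = Seq(
      "",
      "For partitioned table directories, data files should only live in leaf " +
        "directories. Please check the following directories for unexpected files:",
      "")
    (header ++ nameListLines ++ advice ++ suspiciousPaths).mkString("\n")
  }

  def main(args: Array[String]): Unit = {
    // A small assertion-based check standing in for a real test suite.
    val message = listConflictingPartitionColumns(Seq(
      "file:/tmp/foo/b=0" -> Seq("b"),
      "file:/tmp/foo/b=0/c=0" -> Seq("b", "c"),
      "file:/tmp/foo/b=0/c=0/d=0" -> Seq("b", "c", "d")))
    assert(message.contains("Partition column name list #2: b, c, d"))
    assert(message.contains("\tfile:/tmp/foo/b=0/c=0"))
    println(message)
  }
}
```

Because the inputs are ordinary collections, the same checks could be moved into whatever test framework the project uses.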

@liancheng
Contributor Author

retest this please

@SparkQA

SparkQA commented Jun 24, 2015

Test build #35651 has finished for PR 6610 at commit a149250.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 24, 2015

Test build #35656 has finished for PR 6610 at commit 7d05f2c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng
Copy link
Contributor Author

Merging to master.

@asfgit asfgit closed this in cc465fd Jun 24, 2015
@liancheng liancheng deleted the part-errmsg branch June 24, 2015 09:38
4 participants