Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-25406][SQL] For ParquetSchemaPruningSuite.scala, move calls to withSQLConf inside calls to test #22394

Conversation

mallman
Copy link
Contributor

@mallman mallman commented Sep 11, 2018

(Link to Jira: https://issues.apache.org/jira/browse/SPARK-25406)

What changes were proposed in this pull request?

The current use of withSQLConf in ParquetSchemaPruningSuite.scala is incorrect. The desired configuration settings are not being set when running the test cases.

This PR fixes that defective usage and addresses the test failures that were previously masked by that defect.

How was this patch tested?

I added code to relevant test cases to print the expected SQL configuration settings and found that the settings were not being set as expected. When I changed the order of calls to test and withSQLConf I found that the configuration settings were being set as expected.

@mallman
Copy link
Contributor Author

mallman commented Sep 11, 2018

I'm working on fixing these test failures now. Hopefully I'll have something pushed soon.

succeed when using a case-sensitive query parser
@mallman
Copy link
Contributor Author

mallman commented Sep 11, 2018

FYI @viirya @dbtsai @gatorsmile @HyukjinKwon

Can I get someone's review of this PR please? The unmasked failures appear to be false positives, so no changes to the tested code are required—just changes to the tests themselves.

Thanks.

@SparkQA
Copy link

SparkQA commented Sep 11, 2018

Test build #95951 has finished for PR 22394 at commit 8cca76b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 11, 2018

Test build #95952 has finished for PR 22394 at commit c759aea.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member

viirya commented Sep 12, 2018

retest this please.


private def testMixedCasePruning(testName: String)(testThunk: => Unit) {
test(s"Parquet-mr reader - case-insensitive parser - mixed-case schema - $testName") {
withSQLConf(SQLConf.PARQUET_VECTORIZED_READER_ENABLED.key -> "false",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The vectorized reader cases in both testSchemaPruning and testExactCasePruning are put ahead of Parquet-mr reader cases. Shall we follow it too in testMixedCasePruning?

SQLConf.CASE_SENSITIVE.key -> "true") {
test(s"Spark vectorized reader - case-sensitive parser - mixed-case schema - $testName") {
withMixedCaseData(testThunk)
private def testExactCasePruning(testName: String)(testThunk: => Unit) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testCaseSensitivePruning?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks testing case insensitivity too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method names are meant to clarify the kind of queries being tested, not the setting of SQLConf.CASE_SENSITIVE.key. In this case, testExactCasePruning is supposed to mean that we're passing in a test in which the column names in the query are exactly the same as the column names in the relation.

It's not a very good name in that sense. I'll try to make it clearer and add a code comment to clarify.

testMixedCasePruning(testName)(testThunk)
}

private def testMixedCasePruning(testName: String)(testThunk: => Unit) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testCaseInSensitivePruning?

@SparkQA
Copy link

SparkQA commented Sep 12, 2018

Test build #95970 has finished for PR 22394 at commit c759aea.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

testMixedCasePruning(testName)(testThunk)
}

private def testMixedCasePruning(testName: String)(testThunk: => Unit) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testMixedCasePruning looks previously testing case sensitive case too. Now, it looks not. Would you mind if I ask the reason why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, this method ran testThunk with SQLConf.CASE_SENSITIVE.key set to true and false. That was a mistake and incorrect. For example, the query

select col1, col2.b from mixedcase

will fail if SQLConf.CASE_SENSITIVE.key is set to true. That mistake was causing 6 test cases to fail. Therefore, I moved the code that tests with a case-sensitive parser out of testMixedCasePruning into testExactCasePruning and included a call to testMixedCasePruning in testExactCasePruning.

I'll push a commit that refactors the method names and add code comments that will make this clearer.

@HyukjinKwon
Copy link
Member

Hey @mallman, let's just target to fix the problem in the JIRA without other refactorings.

@mallman
Copy link
Contributor Author

mallman commented Sep 12, 2018

Hey @mallman, let's just target to fix the problem in the JIRA without other refactorings.

The refactorings I've made address the problem directly. Hopefully that will be clearer with my most recent commit.

@SparkQA
Copy link

SparkQA commented Sep 12, 2018

Test build #95992 has finished for PR 22394 at commit 1c89637.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mallman
Copy link
Contributor Author

mallman commented Sep 12, 2018

Retest this please.

@SparkQA
Copy link

SparkQA commented Sep 12, 2018

Test build #95997 has finished for PR 22394 at commit 1c89637.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Sep 13, 2018

Test build #96018 has finished for PR 22394 at commit 1c89637.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dilipbiswal
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Sep 13, 2018

Test build #96029 has finished for PR 22394 at commit 1c89637.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dbtsai
Copy link
Member

dbtsai commented Sep 13, 2018

LGTM. Merged into master and branch 2.4. Thanks.

asfgit pushed a commit that referenced this pull request Sep 13, 2018
… `withSQLConf` inside calls to `test`

(Link to Jira: https://issues.apache.org/jira/browse/SPARK-25406)

## What changes were proposed in this pull request?

The current use of `withSQLConf` in `ParquetSchemaPruningSuite.scala` is incorrect. The desired configuration settings are not being set when running the test cases.

This PR fixes that defective usage and addresses the test failures that were previously masked by that defect.

## How was this patch tested?

I added code to relevant test cases to print the expected SQL configuration settings and found that the settings were not being set as expected. When I changed the order of calls to `test` and `withSQLConf` I found that the configuration settings were being set as expected.

Closes #22394 from mallman/spark-25406-fix_broken_schema_pruning_tests.

Authored-by: Michael Allman <msa@allman.ms>
Signed-off-by: DB Tsai <d_tsai@apple.com>
(cherry picked from commit a7e5aa6)
Signed-off-by: DB Tsai <d_tsai@apple.com>
@asfgit asfgit closed this in a7e5aa6 Sep 13, 2018
otterc pushed a commit to linkedin/spark that referenced this pull request Mar 22, 2023
… `withSQLConf` inside calls to `test`

(Link to Jira: https://issues.apache.org/jira/browse/SPARK-25406)

The current use of `withSQLConf` in `ParquetSchemaPruningSuite.scala` is incorrect. The desired configuration settings are not being set when running the test cases.

This PR fixes that defective usage and addresses the test failures that were previously masked by that defect.

I added code to relevant test cases to print the expected SQL configuration settings and found that the settings were not being set as expected. When I changed the order of calls to `test` and `withSQLConf` I found that the configuration settings were being set as expected.

Closes apache#22394 from mallman/spark-25406-fix_broken_schema_pruning_tests.

Authored-by: Michael Allman <msa@allman.ms>
Signed-off-by: DB Tsai <d_tsai@apple.com>
(cherry picked from commit a7e5aa6)
Signed-off-by: DB Tsai <d_tsai@apple.com>

Ref: LIHADOOP-48531

RB=1857521
A=
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants