-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-25406][SQL] For ParquetSchemaPruningSuite.scala, move calls to withSQLConf
inside calls to test
#22394
[SPARK-25406][SQL] For ParquetSchemaPruningSuite.scala, move calls to withSQLConf
inside calls to test
#22394
Conversation
I'm working on fixing these test failures now. Hopefully I'll have something pushed soon. |
succeed when using a case-sensitive query parser
FYI @viirya @dbtsai @gatorsmile @HyukjinKwon Can I get someone's review of this PR please? The unmasked failures appear to be false positives, so no changes to the tested code are required—just changes to the tests themselves. Thanks. |
Test build #95951 has finished for PR 22394 at commit
|
Test build #95952 has finished for PR 22394 at commit
|
retest this please. |
...est/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala
Show resolved
Hide resolved
|
||
private def testMixedCasePruning(testName: String)(testThunk: => Unit) { | ||
test(s"Parquet-mr reader - case-insensitive parser - mixed-case schema - $testName") { | ||
withSQLConf(SQLConf.PARQUET_VECTORIZED_READER_ENABLED.key -> "false", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The vectorized reader cases in both testSchemaPruning
and testExactCasePruning
are put ahead of Parquet-mr reader cases. Shall we follow it too in testMixedCasePruning
?
SQLConf.CASE_SENSITIVE.key -> "true") { | ||
test(s"Spark vectorized reader - case-sensitive parser - mixed-case schema - $testName") { | ||
withMixedCaseData(testThunk) | ||
private def testExactCasePruning(testName: String)(testThunk: => Unit) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
testCaseSensitivePruning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks testing case insensitivity too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method names are meant to clarify the kind of queries being tested, not the setting of SQLConf.CASE_SENSITIVE.key
. In this case, testExactCasePruning
is supposed to mean that we're passing in a test in which the column names in the query are exactly the same as the column names in the relation.
It's not a very good name in that sense. I'll try to make it clearer and add a code comment to clarify.
testMixedCasePruning(testName)(testThunk) | ||
} | ||
|
||
private def testMixedCasePruning(testName: String)(testThunk: => Unit) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
testCaseInSensitivePruning?
Test build #95970 has finished for PR 22394 at commit
|
testMixedCasePruning(testName)(testThunk) | ||
} | ||
|
||
private def testMixedCasePruning(testName: String)(testThunk: => Unit) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
testMixedCasePruning
looks previously testing case sensitive case too. Now, it looks not. Would you mind if I ask the reason why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, this method ran testThunk
with SQLConf.CASE_SENSITIVE.key
set to true and false. That was a mistake and incorrect. For example, the query
select col1, col2.b from mixedcase
will fail if SQLConf.CASE_SENSITIVE.key
is set to true. That mistake was causing 6 test cases to fail. Therefore, I moved the code that tests with a case-sensitive parser out of testMixedCasePruning
into testExactCasePruning
and included a call to testMixedCasePruning
in testExactCasePruning
.
I'll push a commit that refactors the method names and add code comments that will make this clearer.
Hey @mallman, let's just target to fix the problem in the JIRA without other refactorings. |
The refactorings I've made address the problem directly. Hopefully that will be clearer with my most recent commit. |
Test build #95992 has finished for PR 22394 at commit
|
Retest this please. |
Test build #95997 has finished for PR 22394 at commit
|
retest this please |
Test build #96018 has finished for PR 22394 at commit
|
retest this please |
Test build #96029 has finished for PR 22394 at commit
|
LGTM. Merged into master and branch 2.4. Thanks. |
… `withSQLConf` inside calls to `test` (Link to Jira: https://issues.apache.org/jira/browse/SPARK-25406) ## What changes were proposed in this pull request? The current use of `withSQLConf` in `ParquetSchemaPruningSuite.scala` is incorrect. The desired configuration settings are not being set when running the test cases. This PR fixes that defective usage and addresses the test failures that were previously masked by that defect. ## How was this patch tested? I added code to relevant test cases to print the expected SQL configuration settings and found that the settings were not being set as expected. When I changed the order of calls to `test` and `withSQLConf` I found that the configuration settings were being set as expected. Closes #22394 from mallman/spark-25406-fix_broken_schema_pruning_tests. Authored-by: Michael Allman <msa@allman.ms> Signed-off-by: DB Tsai <d_tsai@apple.com> (cherry picked from commit a7e5aa6) Signed-off-by: DB Tsai <d_tsai@apple.com>
… `withSQLConf` inside calls to `test` (Link to Jira: https://issues.apache.org/jira/browse/SPARK-25406) The current use of `withSQLConf` in `ParquetSchemaPruningSuite.scala` is incorrect. The desired configuration settings are not being set when running the test cases. This PR fixes that defective usage and addresses the test failures that were previously masked by that defect. I added code to relevant test cases to print the expected SQL configuration settings and found that the settings were not being set as expected. When I changed the order of calls to `test` and `withSQLConf` I found that the configuration settings were being set as expected. Closes apache#22394 from mallman/spark-25406-fix_broken_schema_pruning_tests. Authored-by: Michael Allman <msa@allman.ms> Signed-off-by: DB Tsai <d_tsai@apple.com> (cherry picked from commit a7e5aa6) Signed-off-by: DB Tsai <d_tsai@apple.com> Ref: LIHADOOP-48531 RB=1857521 A=
(Link to Jira: https://issues.apache.org/jira/browse/SPARK-25406)
What changes were proposed in this pull request?
The current use of
withSQLConf
inParquetSchemaPruningSuite.scala
is incorrect. The desired configuration settings are not being set when running the test cases.This PR fixes that defective usage and addresses the test failures that were previously masked by that defect.
How was this patch tested?
I added code to relevant test cases to print the expected SQL configuration settings and found that the settings were not being set as expected. When I changed the order of calls to
test
andwithSQLConf
I found that the configuration settings were being set as expected.