[SPARK-25313][SQL][FOLLOW-UP] Fix InsertIntoHiveDirCommand output schema in Parquet issue #22359
@@ -803,6 +803,23 @@ class HiveDDLSuite
     }
   }

   test("Insert overwrite directory should output correct schema") {
     withSQLConf(CONVERT_METASTORE_PARQUET.key -> "false") {
Add `withTable("tbl") {` here.
LGTM. Thanks for the fix!
Test build #95788 has finished for PR 22359 at commit
Test build #95791 has finished for PR 22359 at commit
Hi, @wangyum.
@@ -803,6 +803,25 @@ class HiveDDLSuite
     }
   }

   test("Insert overwrite directory should output correct schema") {
Since this is a bug fix, can we have a `SPARK-25313` prefix in the test name?
Should the prefix also be added to these tests?
spark/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala
Line 758 in 8e60b98
test("Insert overwrite Hive table should output correct schema") {
spark/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala
Line 782 in 8e60b98
test("Create Hive table as select should output correct schema") {
In this PR, let's handle this test case only.
I removed my previous comment. This seems to have been the Parquet behavior since this command was introduced in 2.3.0. I was confused because it differs from ORC.
Test build #95818 has finished for PR 22359 at commit
Since this is related to Parquet behavior only, can we have
cc @cloud-fan
thanks, merging to master/2.4!
…ema in Parquet issue

## What changes were proposed in this pull request?

How to reproduce:

```scala
spark.sql("CREATE TABLE tbl(id long)")
spark.sql("INSERT OVERWRITE TABLE tbl VALUES 4")
spark.sql("CREATE VIEW view1 AS SELECT id FROM tbl")
spark.sql(s"INSERT OVERWRITE LOCAL DIRECTORY '/tmp/spark/parquet' " +
  "STORED AS PARQUET SELECT ID FROM view1")
spark.read.parquet("/tmp/spark/parquet").schema
scala> spark.read.parquet("/tmp/spark/parquet").schema
res10: org.apache.spark.sql.types.StructType = StructType(StructField(id,LongType,true))
```

The schema should be `StructType(StructField(ID,LongType,true))`, as we `SELECT ID FROM view1`.

This PR fixes this issue.

## How was this patch tested?

unit tests

Closes #22359 from wangyum/SPARK-25313-FOLLOW-UP.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit f8b4d5a)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…and output schema in Parquet issue

## What changes were proposed in this pull request?

Backport #22359 to branch-2.3.

## How was this patch tested?

unit tests

Closes #22387 from wangyum/SPARK-25313-FOLLOW-UP-branch-2.3.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
How to reproduce:
The schema should be `StructType(StructField(ID,LongType,true))`, as we `SELECT ID FROM view1`.
This PR fixes this issue.
How was this patch tested?
unit tests
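For readers who want to check the behavior outside the test suite, the reproduction steps from the description could be wrapped in a small standalone program along these lines. This is only a sketch, not the unit test added in this PR: the object name `InsertOverwriteDirSchemaCheck`, the app name, and the temp-directory prefix are hypothetical, and it assumes Spark with Hive support is on the classpath. It sets `spark.sql.hive.convertMetastoreParquet` to `false`, which is the key behind the `CONVERT_METASTORE_PARQUET` setting the test applies through `withSQLConf`.

```scala
import java.nio.file.Files

import org.apache.spark.sql.SparkSession

// Hypothetical standalone check; names here are illustrative, not from the PR.
object InsertOverwriteDirSchemaCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("insert-overwrite-dir-schema-check")
      // Same setting the PR's test applies via withSQLConf.
      .config("spark.sql.hive.convertMetastoreParquet", "false")
      .enableHiveSupport()
      .getOrCreate()

    // Write into a fresh temp location instead of a fixed /tmp path.
    val outDir = Files.createTempDirectory("spark-25313-").resolve("parquet").toString

    try {
      spark.sql("CREATE TABLE tbl(id long)")
      spark.sql("INSERT OVERWRITE TABLE tbl VALUES 4")
      spark.sql("CREATE VIEW view1 AS SELECT id FROM tbl")
      spark.sql(
        s"INSERT OVERWRITE LOCAL DIRECTORY '$outDir' " +
          "STORED AS PARQUET SELECT ID FROM view1")

      // The written column name should follow the query (ID), not the view (id).
      val fieldNames = spark.read.parquet(outDir).schema.fieldNames.toSeq
      assert(fieldNames == Seq("ID"), s"unexpected column names: $fieldNames")
    } finally {
      // Clean up so the check can be re-run against the same metastore.
      spark.sql("DROP VIEW IF EXISTS view1")
      spark.sql("DROP TABLE IF EXISTS tbl")
      spark.stop()
    }
  }
}
```

On a build that includes this fix the assertion is expected to pass; on affected builds it would presumably fail with the lowercase `id` shown in the reproduction output.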