-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix dereference operations for union type in Hive Connector #15278
Conversation
2d8966e
to
13aaab0
Compare
testing/trino-product-tests/src/main/java/io/trino/tests/product/hive/TestReadUniontype.java
Show resolved
Hide resolved
testing/trino-product-tests/src/main/java/io/trino/tests/product/hive/TestReadUniontype.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveType.java
Outdated
Show resolved
Hide resolved
testing/trino-product-tests/src/main/java/io/trino/tests/product/hive/TestReadUniontype.java
Outdated
Show resolved
Hide resolved
private void testAvroUnionTypeDereference(String tableName) | ||
{ | ||
/* | ||
* On hive 1.2, when create AVRO table with nested format, such as: uniontype<struct<unionlevel1:uniontype<string,int>>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand this part ... Can you please elaborate what's the issue here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nested, nested column I created: union<struct>> get interpreted as union<> for Avro type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this issue is unrelated to the fix implementation. Due to this I can't test nested nested case with Avro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. since union types typically have multiple fields, curious if we have something like union<int, <struct<int, int>>
, what is it read as?
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveType.java
Outdated
Show resolved
Hide resolved
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: yluan.
|
7c8f6a5
to
12b6481
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some comments, but overall looks good.
please look at build failure and also git author details - currently it's showing cla not signed.
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveType.java
Outdated
Show resolved
Hide resolved
private void testAvroUnionTypeDereference(String tableName) | ||
{ | ||
/* | ||
* On hive 1.2, when create AVRO table with nested format, such as: uniontype<struct<unionlevel1:uniontype<string,int>>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. since union types typically have multiple fields, curious if we have something like union<int, <struct<int, int>>
, what is it read as?
testing/trino-product-tests/src/main/java/io/trino/tests/product/hive/TestReadUniontype.java
Outdated
Show resolved
Hide resolved
testing/trino-product-tests/src/main/java/io/trino/tests/product/hive/TestReadUniontype.java
Outdated
Show resolved
Hide resolved
9eb679c
to
2f5d20a
Compare
I tried this out for avro. looks like when we've just one type within uniontype, hive tries to be smart about it in terms of table schema by removing the union but then fails insertion. e.g. this fails
But if we actually have a union of more than one type - then it succeeds.
However, hive fails to read back results after this. But IMO Trino should still be able to read back. Could you please modify the test so that we've coverage for both ? |
I would keep the original test as well, since there is no constraint specifying union type must have more than one type. |
f89579c
to
e144e74
Compare
testing/trino-product-tests/src/main/java/io/trino/tests/product/hive/TestReadUniontype.java
Show resolved
Hide resolved
e144e74
to
d97ad7d
Compare
pushed, please take a further look! |
Summary of bugs we found using Hive inserting union columns for AVRO format table:
|
Description
Support dereferencing using field names such as "field0", "field1", "tag" for Hive's Union Type.
Additional context and related issues
#15017,
#3483,
e071da4
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: