-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle missing StorageDescriptor in Hive Glue tables #17655
Conversation
@ChunxuTang could you take a look and review this? |
1 similar comment
@ChunxuTang could you take a look and review this? |
@prithvip Sure! Glad to help. Will take a look today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @v-jizhang, thanks a lot for your nice work!
I just left some minor suggestions. Feel free to have a look.
Additionally, as the PR was sent around two weeks ago, when you push new changes, could you also merge the current presto master branch to avoid potential merge conflicts? Thanks!
public static boolean isDeltaLakeTable(Map<String, String> tableParameters) | ||
{ | ||
return tableParameters.containsKey(SPARK_TABLE_PROVIDER_KEY) | ||
&& tableParameters.get(SPARK_TABLE_PROVIDER_KEY).toLowerCase(ENGLISH).equals(DELTA_LAKE_PROVIDER); | ||
} | ||
|
||
public static boolean isIcebergTable(Table table) | ||
{ | ||
return isIcebergTable(table.getParameters()); | ||
} | ||
|
||
public static boolean isIcebergTable(Map<String, String> tableParameters) | ||
{ | ||
return ICEBERG_TABLE_TYPE_VALUE.equalsIgnoreCase(tableParameters.get(ICEBERG_TABLE_TYPE_NAME)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any there any specific reasons that you want to keep these functions in the FileHiveMetastore
class? Looks like they are some helper functions. In that case, we can put them into the HiveUtil
class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HiveUtil is in presto-hive module and if put them there, to avoid circular dependency, they cannot be referenced in presto-hive-metastore.
@@ -106,6 +106,7 @@ | |||
import static com.google.common.base.MoreObjects.toStringHelper; | |||
import static com.google.common.base.Preconditions.checkArgument; | |||
import static com.google.common.collect.ImmutableList.toImmutableList; | |||
import static java.util.Locale.ENGLISH; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think most changes in this file could be moved to HiveUtil
, as the changes are mostly related to some helper functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HiveUtil is in presto-hive and because presto-hive depends on presto-hive-metastore, presto-hive-metastore can't use anything in HiveUtil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. Thanks for your clarification!
In that case, maybe we could move the utility functions to MetastoreUtil
? Most added functions here look to be utility functions that are independent of FileHiveMetastore
. What do you think?
.withDatabaseName(table.getSchemaName()) | ||
.withName(table.getTableName()); | ||
AWSGlueAsync glueClient = AWSGlueAsyncClientBuilder.defaultClient(); | ||
try (Transaction transaction = newTransaction()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this Transaction transaction = newTransaction()
necessary? Looks like it's not used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. It's not used.
@ChunxuTang @rohanpednekar: There is fix trinodb/trino#12130. I'd like to port it in this PR also. Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @v-jizhang, thanks a lot for your work!
Sorry for my late reply. Let some minor suggestions~
BTW, I noticed that trinodb/trino#12130 also contains two new functions testIcebergTableNonNullStorageDescriptor
and testDeltaTableNonNullStorageDescriptor
in the TestGlueToTrinoConverter
. Would you mind also cherry-picking them here?
@@ -106,6 +106,7 @@ | |||
import static com.google.common.base.MoreObjects.toStringHelper; | |||
import static com.google.common.base.Preconditions.checkArgument; | |||
import static com.google.common.collect.ImmutableList.toImmutableList; | |||
import static java.util.Locale.ENGLISH; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. Thanks for your clarification!
In that case, maybe we could move the utility functions to MetastoreUtil
? Most added functions here look to be utility functions that are independent of FileHiveMetastore
. What do you think?
478e040
to
f7832f8
Compare
@ChunxuTang Refactored utility functions and added tests. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@v-jizhang LGTM! Thanks for your work! BTW, I noticed that there's a testing failure. Could you re-run the tests to pass all of them?
Hi @beinan, would you like to also take a quick look at this PR?
Cherry-pick of trinodb/trino#11092 Fixes prestodb#17643 Co-authored-by: Alex <jo.alex2144@gmail.com>
Cherry-pick of trinodb/trino#12130 Some table loaders (e.g. Iceberg FlinkSink) will update/add Glue storage descriptors even while writing iceberg and delta format tables. Although these storage descriptors are not technically needed by delta/iceberg, it is not incompatible. Instead of assuming that a Glue storage descriptor excludes delta/iceberg usage, we should instead check directly for the table_type metadata flag to determine whether or not a table is delta/iceberg. Co-authored-by: Eric Hwang <eric.hwang@starburstdata.com>
Cherry-pick of trinodb/trino#12164 Delta Lake connector expects that the metastore storage descriptor be fully translated if it exists. It only needs to return a dummy storage value if no storage descriptor has been configured. Co-authored-by: Eric Hwang <eric.hwang@starburstdata.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, thx!
Cherry-pick of trinodb/trino#11092
Fixes #17643
Co-authored-by: Alex jo.alex2144@gmail.com
Test plan - Test added