-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Accelerate system.metadata.table_comments with Iceberg Glue catalog #18517
Conversation
Apply whatever IntelliJ wants to have applied so that class reformat doesn't change the code.
Left over from unoptimized implementation of `testInformationSchemaColumns`.
core/trino-main/src/main/java/io/trino/connector/system/TableCommentSystemTable.java
Outdated
Show resolved
Hide resolved
8a3ed54
to
cca95db
Compare
cca95db
to
ce1caf9
Compare
core/trino-spi/src/main/java/io/trino/spi/connector/ConnectorMetadata.java
Outdated
Show resolved
Hide resolved
core/trino-spi/src/main/java/io/trino/spi/connector/ConnectorMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/glue/TrinoGlueCatalog.java
Outdated
Show resolved
Hide resolved
{ | ||
if (prefix.getTable().isPresent()) { | ||
// For single name referenced the default implementation is good enough | ||
return ConnectorMetadata.super.streamRelationComments(session, prefix); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd personally prefer to continue passing the prefix's table down to the TrinoGlueCatalog implementation and have it handle this case. It's the same number of API calls as going back to the super's implementation, but only because of caching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, do we have a test that exercises the path where both schema and table are provided in the prefix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, do we have a test that exercises the path where both schema and table are provided in the prefix?
adding test for this case, search for // Pointed lookup
.
but only because of caching.
i didn't want to optimize this path as i haven't seen this in the wild.
I couldn't get rid of reliance on caching because of how GlueCatalog and GlueTableOperations interact, so even the optimized path had this unfortunate dependency on the cache.
i don't want to pass prefix's down to the TrinoGlueCatalog implementation, to save complexity at least here. the biggest downside of this approach is complexity. some of the complexity seems unavoidable...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i decided to go into opposite direction. see adjust iceberg to Fixup simplify streamRelationComments: take Optional<String> schemaName instead of SchemaTablePrefix
commit. I don't think tablePrefix
abstraction buys us anything, and it just pushes more complexity into connectors.
core/trino-spi/src/main/java/io/trino/spi/connector/ConnectorMetadata.java
Show resolved
Hide resolved
core/trino-spi/src/main/java/io/trino/spi/connector/ConnectorMetadata.java
Outdated
Show resolved
Hide resolved
@@ -545,8 +545,7 @@ public void testSystemMetadataTableComments() | |||
session, | |||
"SELECT * FROM system.metadata.table_comments WHERE schema_name = CURRENT_SCHEMA AND table_name LIKE 'test_select_s_m_t_comments%'", | |||
ImmutableMultiset.<GlueMetastoreMethod>builder() | |||
.addCopies(GET_TABLES, 3) | |||
.addCopies(GET_TABLE, tables * 2) | |||
.addCopies(GET_TABLES, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- .addCopies(GET_TABLES, 3)
- .addCopies(GET_TABLE, tables * 2)
+ .addCopies(GET_TABLES, 1)
looks like a nice change. thank you @ebyhr for your hint #18315 (review)
cc @kokosing @atanasenko @alexjo2144
By general rule, the engine should not call plugins asking for information that it can derive on its own. Here, engine should not need to call the plugin to filter empty list of entities.
ce1caf9
to
0153dfb
Compare
|
503b8ff
to
b633c4e
Compare
} | ||
} | ||
else { | ||
List<RelationCommentMetadata> relationComments = metadata.listRelationComments( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably run access control checks again here in case the connector does not behave well in filtering out tables that should not be accessible.
I also saw there's an access control filterColumns
method, do we need to be applying that here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also saw there's an access control filterColumns method, do we need to be applying that here?
i don't think it's applicable.
Should probably run access control checks again here in case the connector does not behave well in filtering out tables that should not be accessible.
if we want that, we would need to let the connector declare whether AC was consulted or not. otherwise we would be calling AC twice, which is redundant, and considerable cost (in case of many relations).
i had this in this PR, see before Fixup simplify streamRelationComments: require relationFilter to be applied
commit. i didn't like the additional complexity though.
Note also that this is a matter of defining contract. We do not expect connectors to ignore schemaName
parameter (tablePrefix
in other methods). So we can choose not to expect connectors to ignore relationFilters
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's looking pretty close
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/glue/TrinoGlueCatalog.java
Outdated
Show resolved
Hide resolved
return Optional.of(Stream.concat( | ||
unfilteredResultList.stream() | ||
.filter(commentMetadata -> availableNames.contains(commentMetadata.name())), | ||
filteredResult.build().stream()) | ||
.iterator()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're doing a lot of eager work here. It doesn't matter yet because all the streams/iterators are collected to Lists farther up but if we switch the engine to do more of that lazily we'll want to come back to this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed. however, i didn't see a clean way to to this somewhat non-trivial logic, where we want to filter tables after capturing comments for cases where we have comment information at hand, and we want to filter tables before going to file system. Doing it more lazy would mean we consult access control more times, which is going to be more expensive.
(previous table_comments implementation would gather even more state in-memory)
The result order differs before/after this change. I'm not sure whether the order was ensured, but it returned in alphabetical order as far as I confirmed. Can we keep the original order? Order in
|
table_comments is a relation and relations do not define ordering over tuples they contain. In mathematical terms, an SQL relation is a multiset of tuples. The previously existing deterministic ordering was a side effect of |
73edb2e
to
67ed8fd
Compare
(squashed and renamed (per #18517 (comment))) |
/** | ||
* Gets comments for all relations (tables, views, materialized views) that match the specified prefix | ||
* (e.g. for all relations that would be returned by {@link #listTables(ConnectorSession, Optional)} when | ||
* {@code prefix} does not represent relation name). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method doesn't take "prefix" as the method argument. Should we reword the sentence?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, that's a left over. good catch!
Refactor implementation of the `system.metadata.table_comments` table to avoid`ConnectorMetadata.getTableMetadata` on per-table basis, which adds up and becomes expensive. This also allows avoiding separate listings for views and materialized views, and retrieval of unnecessary information (e.g. involving view text translation for Hive views).
With Glue, table listing already pulls a lot information about tables. For Trino-managed Iceberg tables this is sufficient information to answer `system.metadata.table_comments` queries, without having to fetch Glue tables again, one-by-one. Trino-manged Iceberg tables keep table comment up to date in Glue (along with additional information sufficient to verify it's indeed up to date). The approach can be generalized to Iceberg's own `GlueCatalog` later if the community is interested.
67ed8fd
to
afe1bd5
Compare
pushed javadoc fix (for #18517 (comment)) |
With Glue, table listing already pulls a lot information about tables.
For Trino-managed Iceberg tables this is sufficient information to
answer
system.metadata.table_comments
queries, without having to fetchGlue tables again, one-by-one. Trino-manged Iceberg tables keep table
comment up to date in Glue (along with additional information sufficient
to verify it's indeed up to date). The approach can be generalized to
Iceberg's own
GlueCatalog
later if the community is interested.Builds on #18315
release notes