-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support creating tables with column comment in Delta Lake #12455
Conversation
Please split this into two PRs. |
12c206c
to
495369a
Compare
CI hit #12471 |
495369a
to
7847baa
Compare
Verified Spark compatibility locally.
|
...elta-lake/src/main/java/io/trino/plugin/deltalake/transactionlog/DeltaLakeSchemaSupport.java
Outdated
Show resolved
Hide resolved
private final Optional<String> comment; | ||
|
||
@Deprecated | ||
public DeltaLakeColumnHandle( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please remove the usage of this constructor from the current codebase of trino-delta-lake
module? I still found 2
usages of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, there're a lot of usages more than 2. That is why I marked @Deprecated
instead of migrating them. I can migrate all if it's not burden for reviewers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DLCH is used as key in maps, comment is part of its equality.
We unfortunately need to update all usages appropriately, otherwise we cannot reason about the correctness of the codebase.
I would very much prefer not adding comment here, to the table handle.
It bite us a few times in Iceberg, so i would prefer to see if we can transport the comment to ColumnMetadata somehow else.
plugin/trino-accumulo/src/test/java/io/trino/plugin/accumulo/TestAccumuloConnectorTest.java
Outdated
Show resolved
Hide resolved
7847baa
to
18df43c
Compare
...roduct-tests/src/main/java/io/trino/tests/product/deltalake/TestDeltaLakeOssCreateTable.java
Outdated
Show resolved
Hide resolved
18df43c
to
0072698
Compare
@ebyhr please squash and ping me for a review. |
ee31f4e
to
417d9cd
Compare
@findepi Squashed commits. |
private final Optional<String> comment; | ||
|
||
@Deprecated | ||
public DeltaLakeColumnHandle( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DLCH is used as key in maps, comment is part of its equality.
We unfortunately need to update all usages appropriately, otherwise we cannot reason about the correctness of the codebase.
I would very much prefer not adding comment here, to the table handle.
It bite us a few times in Iceberg, so i would prefer to see if we can transport the comment to ColumnMetadata somehow else.
|
||
private static Optional<String> getComment(JsonNode node) | ||
{ | ||
return Optional.ofNullable(node.get("metadata")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why node.get("metadata")
is nullable?
@@ -1910,6 +1911,24 @@ public void testCreateTableWithTableComment() | |||
assertUpdate("DROP TABLE " + tableName); | |||
} | |||
|
|||
@Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Deny creating tables with column comment if unsupported" looks solid & only slightly related to the Delta.
Let's separate PR for this.
7d63473
to
183d16e
Compare
@findepi Updated not to touch |
@@ -385,7 +386,7 @@ public ConnectorTableMetadata getTableMetadata(ConnectorSession session, Connect | |||
DeltaLakeTableHandle tableHandle = (DeltaLakeTableHandle) table; | |||
String location = metastore.getTableLocation(tableHandle.getSchemaTableName(), session); | |||
List<ColumnMetadata> columns = getColumns(tableHandle.getMetadataEntry()).stream() | |||
.map(DeltaLakeMetadata::getColumnMetadata) | |||
.map(column -> getColumnMetadata(column, getColumnComments(tableHandle.getMetadataEntry()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getColumnComments(tableHandle.getMetadataEntry())
is performed once for every column, should be only once
@@ -494,7 +496,7 @@ public Stream<TableColumnsMetadata> streamTableColumns(ConnectorSession session, | |||
// intentionally skip case when table snapshot is present but it lacks metadata portion | |||
return metastore.getMetadata(metastore.getSnapshot(table, session), session).stream().map(metadata -> { | |||
List<ColumnMetadata> columnMetadata = getColumns(metadata).stream() | |||
.map(DeltaLakeMetadata::getColumnMetadata) | |||
.map(column -> getColumnMetadata(column, getColumnComments(metadata))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getColumnComments(tableHandle.getMetadataEntry())
is performed once for every column, should be only once
.build(); | ||
} | ||
|
||
public static Map<String, Optional<String>> getColumnComments(MetadataEntry metadataEntry) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a difference between missing entry and entry being Optional.empty()
?
maybe we just have a map of non-null comments, passed around as Map<String,String>
?
we would lose ability to validate map contains entries for the columns, but you don't do this anyway (by means of columnComments.getOrDefault...
)
.orElseThrow(() -> new IllegalStateException("Serialized schema not found in transaction log for " + metadataEntry.getName())); | ||
} | ||
|
||
private static Map<String, Optional<String>> getColumnComment(String json) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getColumnComment -> getColumnComments (plural)
} | ||
} | ||
|
||
private static Map.Entry<String, Optional<String>> columnComment(JsonNode node) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inline
a11f6d9
to
dcf4f0d
Compare
@findepi Addressed comments. |
@@ -204,4 +204,32 @@ private static String getTableCommentOnDelta(String schemaName, String tableName | |||
.map(row -> row.get(1)) | |||
.findFirst().orElseThrow(); | |||
} | |||
|
|||
@Test(groups = {DELTA_LAKE_DATABRICKS, PROFILE_SPECIFIC_TESTS}) | |||
public void testCreateTableWithColumnComment() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a test that goes the other way, creating the comment in Delta and reading from Trino?
@@ -963,6 +969,7 @@ public void addColumn(ConnectorSession session, ConnectorTableHandle tableHandle | |||
handle.getMetadataEntry().getId(), | |||
columnsBuilder.build(), | |||
partitionColumns, | |||
ImmutableMap.of(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test that creates a table with comments and then adds a column? Make sure this doesn't wipe the existing comments
@@ -887,6 +892,7 @@ public Optional<ConnectorOutputMetadata> finishCreateTable( | |||
randomUUID().toString(), | |||
handle.getInputColumns(), | |||
handle.getPartitionedBy(), | |||
ImmutableMap.of(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this match what we do in createTable
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's not needed as column comment is unsupported in CTAS
syntax level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, you're totally right. Thanks
dcf4f0d
to
04d0d3f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to support alter table add column
with a comment here, or create a separate PR?
I want to separate PR for ADD COLUMN with a comment. |
04d0d3f
to
5442cfa
Compare
Rebased on upstream to resolve conflicts. |
Description
Support creating tables with column comment in Delta Lake
Documentation
( ) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
( ) No release notes entries required.
(x) Release notes entries required with the following suggested text: