-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add extra_properties to Delta Lake table properties #17595
Conversation
1b60439
to
2033a31
Compare
2033a31
to
a47e9a8
Compare
...rino-delta-lake/src/test/java/io/trino/plugin/deltalake/BaseDeltaLakeConnectorSmokeTest.java
Show resolved
Hide resolved
a47e9a8
to
29f371c
Compare
...n/trino-delta-lake/src/test/java/io/trino/plugin/deltalake/TestDeltaLakeTableProperties.java
Outdated
Show resolved
Hide resolved
.stream() | ||
.map(property -> property.getName()) | ||
.collect(Collectors.toSet()), | ||
DeltaLakeTableProperties.PROPERTY_NAMES); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we'll add/remove eventually a new table property, this test will still be 🟢
What do you intend to test here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we add a table property, but forget to update the DeltaLakeTableProperties.PROPERTY_NAMES
set, then this test will fail.
It's because the list of table properties are created when DeltaLakeTableProperties
is constructed, while validateExtraProperties
only uses DeltaLakeTableProperties.PROPERTY_NAMES
and can be static. We could inject DeltaLakeTableProperties
to DeltaLakeMetadata
instead and get rid of the PROPERTY_NAMES
?
throw new TrinoException(INVALID_TABLE_PROPERTY, format("Extra table property key cannot be null '%s'", extraProperties)); | ||
} | ||
Set<String> illegalExtraProperties = Sets.intersection( | ||
DeltaLakeTableProperties.PROPERTY_NAMES, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite sure whether extra_properties
is a Trino reserved property name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the Hive connector we disallow using the key extra_properties
in the extra_properties
.addAll(baseProperties.keySet()) |
Disallowing setting table properties through extra_properties
and through ordinary properties reduces some possible user confusion, but also limits the use of extra_properties
. For Hive this could make sense, as Trino table property names might overlap with Hive table property names with special meanings. For Delta it might not matter, as none of our current table properties have a special meaning as a Delta table property (special Delta table properties all seem to be prefixed with delta.
. Perhaps we should get rid of this validation of illegal extra_property
keys, and only check for the special delta.
keys?
throw new TrinoException(INVALID_TABLE_PROPERTY, format("Extra table property key cannot be null '%s'", extraProperties)); | ||
} | ||
Set<String> illegalExtraProperties = Sets.intersection( | ||
DeltaLakeTableProperties.PROPERTY_NAMES, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe inline PROPERTY_NAMES
map declaration (e.g. : reservedTrinoPropertyNames
) here in case you won't use it anywhere else.
...rino-delta-lake/src/test/java/io/trino/plugin/deltalake/BaseDeltaLakeConnectorSmokeTest.java
Show resolved
Hide resolved
@@ -645,6 +645,10 @@ The following table properties are available for use: | |||
* ``NONE`` | |||
|
|||
Defaults to ``NONE``. | |||
* - ``extra_properties`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the description of the ticket:
It's a bit unclear how we should handle table properties which can be set by other features, and through extra_properties.
To avoid such situations we should disallow table property names handled by Trino.
See
Lines 39 to 43 in df16c38
public static final String DELTA_CHECKPOINT_WRITE_STATS_AS_JSON_PROPERTY = "delta.checkpoint.writeStatsAsJson"; | |
public static final String DELTA_CHECKPOINT_WRITE_STATS_AS_STRUCT_PROPERTY = "delta.checkpoint.writeStatsAsStruct"; | |
public static final String DELTA_CHANGE_DATA_FEED_ENABLED_PROPERTY = "delta.enableChangeDataFeed"; | |
private static final String DELTA_CHECKPOINT_INTERVAL_PROPERTY = "delta.checkpointInterval"; |
DeltaLakeTableProperties
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Collected them in MetadataEntry.RESERVED_TRINO_PROPERTY_NAMES
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are also a few keys at
Lines 86 to 89 in 7b2df4a
public static final String APPEND_ONLY_CONFIGURATION_KEY = "delta.appendOnly"; | |
public static final String COLUMN_MAPPING_MODE_CONFIGURATION_KEY = "delta.columnMapping.mode"; | |
public static final String COLUMN_MAPPING_PHYSICAL_NAME_CONFIGURATION_KEY = "delta.columnMapping.physicalName"; | |
public static final String MAX_COLUMN_ID_CONFIGURATION_KEY = "delta.columnMapping.maxColumnId"; |
but it does not look like we support modifying them through Trino table properties.
It seems a bit difficult to enforce by other means than convention that reserved table properties are to be defined in MetadataEntry and added to RESERVED_TRINO_PROPERTY_NAMES
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to decide how to handle:
- Trino table properties of Delta tables
trino/plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeTableProperties.java
Lines 48 to 53 in 7b2df4a
public static final String LOCATION_PROPERTY = "location"; public static final String PARTITIONED_BY_PROPERTY = "partitioned_by"; public static final String CHECKPOINT_INTERVAL_PROPERTY = "checkpoint_interval"; public static final String CHANGE_DATA_FEED_ENABLED_PROPERTY = "change_data_feed_enabled"; public static final String COLUMN_MAPPING_MODE_PROPERTY = "column_mapping_mode"; public static final String EXTRA_PROPERTIES = "extra_properties"; - Special Delta table properties which are modified when setting Trino table properties.
- Special Delta table properties which influences the way Trino reads or writes a table, but cannot be modified through Trino (e.g.,
delta.appendOnly
).
29f371c
to
7b2df4a
Compare
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua |
@findinpath and @kasiafi could you work with @jkylling on next steps here .. assuming this is still in progress and a desired improvement. |
@jkylling was saying in a Slack discussion that he will address the PR comments as soon as he finds free time. |
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua |
Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time. |
I am reopening this since it is still in progress. |
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua |
Here is another one @jkylling that might have gone under your radar. Can you work with @findinpath to get this updated and ready. |
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua |
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua |
Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time. |
Description
Similar to #17172 this adds support for the
extra_properties
table property to Delta tables, such that additional table properties can be set on Delta tables.The additional properties will not be exposed in SHOW CREATE.
It's a bit unclear how we should handle table properties which can be set by other features, and through extra_properties. Currently we take the simplest approach and allow setting properties both through Trino specific table properties, and through
extra_properties
. For instance, it's possible to setdelta.enableChangeDataFeed
both implicitly by settingchange_data_feed_enabled
, and explicitly by settingextra_properties = MAP(ARRAY['delta.enableChangeDataFeed'], ARRAY['true'])
.Additional context and related issues
Fixes #17428
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: