Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix table props for Iceberg with orc_bloom_filter #20433

Merged
merged 1 commit into from
Feb 29, 2024

Conversation

oneonestar
Copy link
Member

Description

This PR fix table properties for Iceberg table with orc_bloom_filter:

Fix #20432

Additional context and related issues

The correct table properties should be:
orc.bloom.filter.columns -> write.orc.bloom.filter.columns
orc.bloom.filter.fpp -> write.orc.bloom.filter.fpp.

Spec: https://iceberg.apache.org/docs/latest/configuration/#write-properties

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
(x) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Jan 20, 2024
@github-actions github-actions bot added the iceberg Iceberg connector label Jan 20, 2024
@oneonestar oneonestar added the bug Something isn't working label Jan 20, 2024
@oneonestar oneonestar force-pushed the fix.orc.bloom.filter branch from 387bc46 to 24bb55b Compare January 21, 2024 15:01
@oneonestar oneonestar marked this pull request as ready for review January 21, 2024 15:03
@@ -662,8 +687,8 @@ public static Map<String, String> createTableProperties(ConnectorTableMetadata t
if (!columns.isEmpty()) {
checkFormatForProperty(fileFormat.toIceberg(), FileFormat.ORC, ORC_BLOOM_FILTER_COLUMNS);
validateOrcBloomFilterColumns(tableMetadata, columns);
propertiesBuilder.put(ORC_BLOOM_FILTER_COLUMNS_KEY, Joiner.on(",").join(columns));
propertiesBuilder.put(ORC_BLOOM_FILTER_FPP_KEY, String.valueOf(getOrcBloomFilterFpp(tableMetadata.getProperties())));
propertiesBuilder.put(TableProperties.ORC_BLOOM_FILTER_COLUMNS, Joiner.on(",").join(columns));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static import

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

io.trino.plugin.iceberg.IcebergTableProperties.ORC_BLOOM_FILTER_COLUMNS
org.apache.iceberg.TableProperties.ORC_BLOOM_FILTER_COLUMNS

We have a name conflict here so we can't static import both.

I thought about refactor the mapping logic between Trino session properties and Iceberg table properties.
The current mapping logic is a bit messy and across multiple classes.
However, the changes will be big so it's better to do it in another PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oneonestar What do you think about suffixing the ORC Trino session properties with _PROPERTY (as is already done with the other properties) and importing ORC_BLOOM_FILTER_COLUMNS and ORC_BLOOM_FILTER_FPP from org.apache.iceberg.TableProperties into IcebergTableProperties, and then importing them here?

This approach would maintain the pattern that already exists, and it would make things easier for me in the #20393 PR to add the constant that holds the iceberg table properties.

Subsequently, we could revisit how we deal with both kinds of properties.

What are your thoughts on this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ndrluis Thanks for the suggestion.
Rename Trino session properties with _PROPERTY make a lot of sense.

importing ORC_BLOOM_FILTER_COLUMNS and ORC_BLOOM_FILTER_FPP from org.apache.iceberg.TableProperties into IcebergTableProperties, and then importing them here?

I don't understand why these properties are needed in IcebergTableProperties.

I did the change in eddb7e8 . Please take a look.

@findinpath
Copy link
Contributor

LGTM % comments

@oneonestar oneonestar requested a review from findinpath January 29, 2024 02:17
@oneonestar oneonestar force-pushed the fix.orc.bloom.filter branch 2 times, most recently from 459934b to eddb7e8 Compare January 29, 2024 14:17
@oneonestar oneonestar requested a review from ndrluis January 29, 2024 15:06
@findinpath findinpath requested review from findepi and ebyhr January 30, 2024 11:45
@oneonestar oneonestar force-pushed the fix.orc.bloom.filter branch 3 times, most recently from 42ffcff to 2e67cda Compare February 2, 2024 10:18
@oneonestar oneonestar force-pushed the fix.orc.bloom.filter branch from 2e67cda to 56e5374 Compare February 2, 2024 12:51
Copy link

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

@github-actions github-actions bot added the stale label Feb 28, 2024
@wendigo wendigo merged commit 76fe740 into trinodb:master Feb 29, 2024
43 checks passed
@github-actions github-actions bot added this to the 440 milestone Feb 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cla-signed iceberg Iceberg connector stale
Development

Successfully merging this pull request may close these issues.

Incorrect table properties set for Iceberg with orc_bloom_filter
5 participants