Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use unique table locations in Iceberg by default #12941

Conversation

alexjo2144
Copy link
Member

@alexjo2144 alexjo2144 commented Jun 22, 2022

Description

Enable the unique table location feature by default in the Iceberg connector

Is this change a fix, improvement, new feature, refactoring, or other?

Enabling an existing feature by default

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

Iceberg connector

How would you describe this change to a non-technical end user or system administrator?

Related issues, pull requests, and links

Fixes: #5632

Documentation

( ) No documentation is needed.
(x) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
(x) Release notes entries required with the following suggested text:

# Iceberg
* Use unique table locations in Iceberg by default

@@ -46,7 +46,7 @@
private HiveCompressionCodec compressionCodec = ZSTD;
private boolean useFileSizeFromMetadata = true;
private int maxPartitionsPerWriter = 100;
private boolean uniqueTableLocation;
private boolean uniqueTableLocation = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test for the default value should be also adjusted.

Copy link
Member

@losipiuk losipiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % failing test

@alexjo2144 alexjo2144 force-pushed the iceberg/default-use-unique-table-locations branch from f86b557 to 9d0eafb Compare June 23, 2022 20:15
@findepi
Copy link
Member

findepi commented Jun 23, 2022

@alexjo2144 you need to rebase

@alexjo2144 alexjo2144 force-pushed the iceberg/default-use-unique-table-locations branch from 9d0eafb to d107f5e Compare June 23, 2022 21:47
@alexjo2144
Copy link
Member Author

All set, thank you all for the reviews

assertThat(onTrino().executeQuery("SHOW CREATE TABLE test_dropped_partition_field"))
.containsOnly(
row("CREATE TABLE iceberg.default.test_dropped_partition_field (\n" +
Assertions.assertThat((String) onTrino().executeQuery("SHOW CREATE TABLE test_dropped_partition_field").row(0).get(0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getOnlyElement

row("CREATE TABLE iceberg.default.test_dropped_partition_field (\n" +
Assertions.assertThat((String) onTrino().executeQuery("SHOW CREATE TABLE test_dropped_partition_field").row(0).get(0))
.matches(
"CREATE TABLE iceberg.default.test_dropped_partition_field \\(\n" +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use \\Q here too

Pattern locationPattern = Pattern.compile(".*location = 'hdfs://hadoop-master:9000(.*?)'.*", Pattern.DOTALL);
Matcher m = locationPattern.matcher((String) onTrino().executeQuery("SHOW CREATE TABLE " + tableName).row(0).get(0));
if (m.find()) {
verify(m.groupCount() == 1, "Location regex only expected one group but found " + m.groupCount());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String location = m.group(1);
verify(!m.find(), "Unexpected second match");
return location;

@alexjo2144 alexjo2144 force-pushed the iceberg/default-use-unique-table-locations branch from d107f5e to b566c43 Compare June 24, 2022 14:01
@alexjo2144
Copy link
Member Author

Updated. @findepi might be worth running with secrets for Glue?

@findepi
Copy link
Member

findepi commented Jun 24, 2022

Test PR with secrets: #12981
(@nineinchnick's #12817 will make these PRs easier)

@alexjo2144
Copy link
Member Author

CI: #12950

@alexjo2144 alexjo2144 force-pushed the iceberg/default-use-unique-table-locations branch from b566c43 to 0ddd006 Compare June 28, 2022 07:54
@alexjo2144
Copy link
Member Author

Rebased for conflicts

@alexjo2144
Copy link
Member Author

@losipiuk CI hit #10631

@alexjo2144
Copy link
Member Author

@losipiuk @findepi I think this is good to go whenever you get a chance

@alexjo2144 alexjo2144 force-pushed the iceberg/default-use-unique-table-locations branch 2 times, most recently from 7179028 to 83992ad Compare July 29, 2022 13:58
@alexjo2144 alexjo2144 force-pushed the iceberg/default-use-unique-table-locations branch from 83992ad to 7b8067f Compare July 29, 2022 16:24
@alexjo2144 alexjo2144 force-pushed the iceberg/default-use-unique-table-locations branch from 7b8067f to 866ddb0 Compare July 29, 2022 20:08
@alexjo2144
Copy link
Member Author

Checks are passing ✔️

@findepi findepi merged commit fbdbefc into trinodb:master Aug 1, 2022
@findepi findepi mentioned this pull request Aug 1, 2022
@github-actions github-actions bot added this to the 392 milestone Aug 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Iceberg table name on s3
5 participants