-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added ability to have unique table location for each iceberg table #6063
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments, otherwise looks good
presto-iceberg/src/main/java/io/prestosql/plugin/iceberg/IcebergConfig.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/main/java/io/prestosql/plugin/iceberg/IcebergConfig.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/main/java/io/prestosql/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/main/java/io/prestosql/plugin/iceberg/IcebergMetadataFactory.java
Outdated
Show resolved
Hide resolved
@electrum Thanks for the comments, I have performed code fixes for all of them |
@electrum is it possible to have this changes in the next release? |
@electrum could you please approve this PR in case of all looks good? |
@electrum I would be really appreciate for your review of this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add test to AbstractTestIcebergSmoke to verify that some UUID as added to table location.
Does iceberg.unique-table-location
have any purpose beyond making it possible for Trino to delete data on drop of table ?
If not, it would be easier to have just one flag iceberg.delete-files-on-table-drop
and implement both unique table location and data removal behind that flag rather than requiring users to set 2 flags.
Should we also be providing this as a table property ? From user point of view it's easier to create a "managed" table using table property rather than changing config to set a catalog property.
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergConfig.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sshkvar thanks for working on this.
Can you please rebase?
We should also have a test for this that would cover
CREATE + DROP
CREATE + RENAME + DROP
showing that table data gets correctly dropped.
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergConfig.java
Outdated
Show resolved
Hide resolved
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the test.
Overall LGTM. Some editorial comments in the test.
.../trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergTableWithCustomLocation.java
Outdated
Show resolved
Hide resolved
.../trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergTableWithCustomLocation.java
Outdated
Show resolved
Hide resolved
.../trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergTableWithCustomLocation.java
Outdated
Show resolved
Hide resolved
.../trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergTableWithCustomLocation.java
Outdated
Show resolved
Hide resolved
.../trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergTableWithCustomLocation.java
Outdated
Show resolved
Hide resolved
cc @rdblue |
public static DistributedQueryRunner createIcebergQueryRunner(Map<String, String> extraProperties, | ||
FileFormat format, | ||
List<TpchTable<?>> tables, | ||
Optional<File> metastoreDirectory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update formatting
- all parameters on one line OR each parameter on its own line (including the first one)
- for indentation please refer to https://github.com/airlift/codestyle/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the link to code style.
Formatting updated
public static DistributedQueryRunner createIcebergQueryRunner( | ||
Map<String, String> extraProperties, | ||
FileFormat format, | ||
List<TpchTable<?>> tables, | ||
Optional<File> metastoreDirectory) | ||
Optional<File> metastoreDirectory, | ||
Optional<Map<String, String>> icebergPropertiesOverride) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We call it connectorProperties
in other XxxQueryRunner
classes.
also, no need for optional, since empty map has the same meaning
Map<String, String> connectorProperties)
also, please add it right after extraProperties
(also seems usual)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
File tempDir = Files.createTempDirectory("test_iceberg_v2").toFile(); | ||
metastoreDir = new File(tempDir, "iceberg_data"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why two levels of nesting?
Would it be fine to have just
metastoreDir = Files.createTempDirectory("....");
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
however, it seems the test doesn't need to know the metastoreDir
, so instead of setting it up (and cleaning it later), it should be sufficient to pass Optional.empty()
for the metastoreDir
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need metastoreDir
because it is used in createTestingFileHiveMetastore(metastoreDir)
,
but I removed two levels nesting
|
||
return createIcebergQueryRunner( | ||
ImmutableMap.of(), | ||
ORC, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use default
new IcebergConfig().getFileFormat()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as a follow up we should remove format
parameter and use the new connectorProperties
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replaced with new IcebergConfig().getFileFormat()
assertTrue(table.isPresent(), "Table should exists"); | ||
String location = table.get().getStorage().getLocation(); | ||
assertTrue(location.matches(format(".*%s-[0-9a-f]{32}", TABLE)), "Table location should have UUID suffix"); | ||
assertTrue(location.matches(format(".*%s-[0-9a-f]{32}", "table_with_uuid")), "Table location should have UUID suffix"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not use assertTrue
with complex expressions. In case of failure, it produces unhelpful message.
Use
[org.assertj.core.api.Assertions.] assertThat(location).matches(....);
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@sshkvar could you please remove the merge commits from the git history? |
done |
@sshkvar thanks! |
@findepi I combined changes to single commit |
Merged, thanks! |
Thanks you for the review! |
#5632
Added new iceberg configuration property
iceberg.unique-table-location
By default this
property = false
, so table directory will have the same name as table.In case
iceberg.unique-table-location = true
unique UUID will be added to the table directory name, so each table will have unique location