Iceberg table name on s3 #5632
Comments
As a simple solution, we can generate a UUID and append it to the table name when building the default table location in IcebergMetadata. Something like this:
if (targetPath == null) {
    // No explicit location was given, so derive a unique one from the table name
    String uniqueTableName = tableName + "_" + UUID.randomUUID();
    targetPath = getTableDefaultLocation(database, hdfsContext, hdfsEnvironment, schemaName, uniqueTableName).toString();
}
|
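To try the idea above outside of the connector code, here is a minimal, self-contained sketch. The class name `UniqueTableLocation`, the helper `uniqueTableLocation`, and the example bucket path are hypothetical. This is not the connector's `getTableDefaultLocation` implementation, only an illustration of appending a random UUID to the table directory name.

```java
import java.util.UUID;

class UniqueTableLocation {
    // Hypothetical helper: returns "<schemaLocation>/<tableName>_<random UUID>".
    static String uniqueTableLocation(String schemaLocation, String tableName) {
        // Drop a trailing slash so the result never contains "//" after the schema path
        String base = schemaLocation.endsWith("/")
                ? schemaLocation.substring(0, schemaLocation.length() - 1)
                : schemaLocation;
        return base + "/" + tableName + "_" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        // Two tables created with the same name get different directories
        System.out.println(uniqueTableLocation("s3://example-bucket/warehouse/db", "test_table"));
        System.out.println(uniqueTableLocation("s3://example-bucket/warehouse/db", "test_table"));
    }
}
```

Because the suffix is random, the printed paths differ on every run, so a table created after a rename would not reuse the directory of the earlier table.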
@rdblue what is the expected behavior in this scenario? |
This is the expected behavior. There are a few choices that cause this, but I think that they are all reasonable:
|
Hi @rdblue, thanks for the clarification. What do you think about my proposed solution? We could add a configuration property, and if it is set to true, Iceberg would append a UUID to the table location. As a result, on S3 for example, each table would get its own unique path. |
The drop table behavior for the Presto Iceberg connector does need to be revisited. We call the metastore For Iceberg, I believe we should set |
I personally like the approach of using a unique directory for Iceberg locations. Having that as an option seems reasonable. But since that's not the "standard" behavior of Iceberg, and because that's not a guarantee in the Iceberg ecosystem, I don't think we should make it the default behavior, nor can we rely on it for correctness when dropping a table. |
@electrum I fully agree with you that the unique name feature shouldn't be the default behavior for the Iceberg connector. That's why I added a new configuration option (in IcebergConfig) that enables this functionality if the user needs it; by default it is false. Also, regarding dropping the table, I have run into another issue: currently the Iceberg connector does not delete data from S3 at all. Please take a look at #5616. |
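The opt-in behavior described here could look roughly like the sketch below. The thread does not give the actual property or method names, so `LocationConfig`, `setUniqueTableLocation`, and `defaultTableLocation` are hypothetical stand-ins, not the real IcebergConfig API. The only point illustrated is that the flag is off by default and changes how the default directory name is derived.

```java
import java.util.UUID;

// Hypothetical stand-in for a connector config class; not the real IcebergConfig.
class LocationConfig {
    // Opt-in flag: false by default, so existing behavior is unchanged
    private boolean uniqueTableLocation = false;

    boolean isUniqueTableLocation() {
        return uniqueTableLocation;
    }

    LocationConfig setUniqueTableLocation(boolean value) {
        this.uniqueTableLocation = value;
        return this;
    }

    // Derives the default directory for a new table under the schema location
    String defaultTableLocation(String schemaLocation, String tableName) {
        String directory = uniqueTableLocation
                ? tableName + "_" + UUID.randomUUID()
                : tableName;
        return schemaLocation + "/" + directory;
    }

    public static void main(String[] args) {
        LocationConfig defaults = new LocationConfig();
        LocationConfig unique = new LocationConfig().setUniqueTableLocation(true);
        System.out.println(defaults.defaultTableLocation("s3://example-bucket/db", "test_table"));
        System.out.println(unique.defaultTableLocation("s3://example-bucket/db", "test_table"));
    }
}
```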
I would be fine with adding a unique UUID to table locations. We do this for our tables in our custom catalog. It would just be a matter of updating how a default table location is generated in the catalog. Feel free to open a PR for Iceberg. |
Relates to #11400 (for Delta). @electrum @alexjo2144 @losipiuk @findinpath @phd3 I think we should enable unique locations by default before closing this issue. |
@findepi, this is typically up to the catalog to decide, but I think it's reasonable for Trino to add a UUID to table locations. |
It seems to me like Trino users most of the time would want unique table locations. I can get behind enabling it by default. |
Hi,
I am using Presto (version 343) with the Iceberg connector and found strange behaviour with table rename.
1. I created a table named test_table. The table was created successfully, and its metadata and Parquet files were written to S3.
2. I renamed the table to test_table_renamed.
3. I created a new table with the same name as in step 1, test_table. The table was created successfully, but its metadata and Parquet files were written to the same directory as the table from step 1.
As a result, there are two different tables in the Hive Metastore, but their metadata and Parquet files are placed in the same directory on S3:
test_table
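The collision in the steps above can be reproduced with a toy model. The sketch assumes, as the report implies, that the default location is derived only from the schema location and the table name, and that a rename updates the catalog entry without moving files. `RenameCollisionDemo` and its methods are hypothetical and do not use any Presto or Hive Metastore API.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a metastore: maps table name -> storage location.
class RenameCollisionDemo {
    private final Map<String, String> locations = new HashMap<>();
    private final String schemaLocation;

    RenameCollisionDemo(String schemaLocation) {
        this.schemaLocation = schemaLocation;
    }

    // Default location depends only on the table name (assumption from the report)
    String createTable(String name) {
        if (locations.containsKey(name)) {
            throw new IllegalStateException("Table already exists: " + name);
        }
        String location = schemaLocation + "/" + name;
        locations.put(name, location);
        return location;
    }

    // Rename changes the catalog key only; the files stay where they are
    void renameTable(String from, String to) {
        if (!locations.containsKey(from)) {
            throw new IllegalStateException("No such table: " + from);
        }
        if (locations.containsKey(to)) {
            throw new IllegalStateException("Table already exists: " + to);
        }
        locations.put(to, locations.remove(from));
    }

    String locationOf(String name) {
        return locations.get(name);
    }

    public static void main(String[] args) {
        RenameCollisionDemo metastore = new RenameCollisionDemo("s3://example-bucket/db");
        metastore.createTable("test_table");                       // step 1
        metastore.renameTable("test_table", "test_table_renamed"); // step 2
        metastore.createTable("test_table");                       // step 3
        System.out.println(metastore.locationOf("test_table_renamed"));
        System.out.println(metastore.locationOf("test_table"));
    }
}
```

In this model both tables end up pointing at `s3://example-bucket/db/test_table`, which matches the behaviour described in the report.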