Iceberg table name on s3 #5632

sshkvar · 2020-10-21T11:22:32Z

Hi,
I am using Presto (version 343) with the Iceberg connector and found strange behaviour with table rename.

First, I created a table named test_table:

create table if not exists wiceberg.examples.test_table (
    c1 integer,
    c2 date,
    c3 double)
WITH (
    format = 'PARQUET');
    
INSERT into wiceberg.examples.test_table values (1, date '2020-10-16',3.3);

The table was created successfully, and its metadata and Parquet files were written to S3.

Then I renamed the table to test_table_renamed:

alter table wiceberg.examples.test_table rename to wiceberg.examples.test_table_renamed;

After that, I created a new table with the same name as in step 1, test_table:

create table if not exists wiceberg.examples.test_table (
    c1 integer,
    c2 date,
    c3 double)
WITH (
    format = 'PARQUET');
    
INSERT into wiceberg.examples.test_table values (1, date '2020-10-16',3.3);

The table was created successfully, but its metadata and Parquet files were written to the same directory as the table from step 1.
As a result, we have 2 different tables in the Hive Metastore, but their metadata and Parquet files are placed in the same S3 directory, test_table.


sshkvar · 2020-10-21T13:32:08Z

As a simple solution, we can generate a UUID and append it to the table name when building the table's default location in IcebergMetadata.

Something like this:

        if (targetPath == null) {
            String uniqueTableName = tableName + "_" + UUID.randomUUID();
            targetPath = getTableDefaultLocation(database, hdfsContext, hdfsEnvironment, schemaName, uniqueTableName).toString();
        }
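
For illustration only, the fragment above can be turned into a self-contained sketch. The `defaultLocation` helper and the `<warehouse>/<schema>.db/<table>` layout are assumptions standing in for the connector's real `getTableDefaultLocation`, which also takes metastore and HDFS context that is not modeled here.

```java
import java.util.UUID;

final class UniqueLocationSketch {
    // Hypothetical stand-in for getTableDefaultLocation: assumes a
    // "<warehouse>/<schema>.db/<table>" directory layout.
    static String defaultLocation(String warehouse, String schemaName, String tableName) {
        return warehouse + "/" + schemaName + ".db/" + tableName;
    }

    // Appends a random UUID to the table name before building the location,
    // so two tables created under the same name get different directories.
    static String uniqueLocation(String warehouse, String schemaName, String tableName) {
        String uniqueTableName = tableName + "_" + UUID.randomUUID();
        return defaultLocation(warehouse, schemaName, uniqueTableName);
    }

    public static void main(String[] args) {
        // Bucket and schema names are made up for the example.
        String first = uniqueLocation("s3://bucket/warehouse", "examples", "test_table");
        String second = uniqueLocation("s3://bucket/warehouse", "examples", "test_table");
        System.out.println(first);
        System.out.println(second);
        System.out.println("distinct: " + !first.equals(second));
    }
}
```

With this approach, creating test_table a second time would write to a different directory than the renamed table, because the suffix is generated per create call.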

electrum · 2020-10-21T21:18:58Z

@rdblue what is the expected behavior in this scenario?

rdblue · 2020-10-21T21:32:48Z

This is the expected behavior.

There are a few choices that cause this, but I think that they are all reasonable:

Rename should not change the location of a table, only the name. This makes sense because location may be controlled independently through a LOCATION clause in the create statement.
Iceberg allows you to create a table even if there are existing files in a directory. Iceberg maintains a tree of file references and there isn't a requirement that two trees don't share the same location prefix. It would be an artificial restriction to require an empty prefix, and some failure cases would prevent you from creating a table after recovering. For example, if you DROP and CREATE a table in a workflow, but an S3 outage prevents you from removing all the files on drop, then the next CREATE will be stuck.
Table location is set by default using the same convention that Hive uses.
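
To make the interaction of these three choices concrete, here is a toy in-memory model of a catalog. It is only a sketch under stated assumptions: `RenameLocationSketch` and its methods are hypothetical and do not correspond to any Presto, Hive, or Iceberg API. The location is derived from the table name once, at create time, and a rename moves only the name.

```java
import java.util.HashMap;
import java.util.Map;

final class RenameLocationSketch {
    // Maps table name -> storage location. Hypothetical model, not a real catalog.
    private final Map<String, String> locations = new HashMap<>();
    private final String schemaLocation;

    RenameLocationSketch(String schemaLocation) {
        this.schemaLocation = schemaLocation;
    }

    // The default location follows the "<schema location>/<table name>" convention.
    void createTable(String name) {
        if (locations.containsKey(name)) {
            throw new IllegalStateException("Table already exists: " + name);
        }
        locations.put(name, schemaLocation + "/" + name);
    }

    // Rename changes the name only; the stored location is carried over untouched.
    void renameTable(String from, String to) {
        if (!locations.containsKey(from)) {
            throw new IllegalArgumentException("No such table: " + from);
        }
        if (locations.containsKey(to)) {
            throw new IllegalStateException("Table already exists: " + to);
        }
        locations.put(to, locations.remove(from));
    }

    String locationOf(String name) {
        return locations.get(name);
    }

    public static void main(String[] args) {
        RenameLocationSketch catalog = new RenameLocationSketch("s3://bucket/examples");
        catalog.createTable("test_table");
        catalog.renameTable("test_table", "test_table_renamed");
        catalog.createTable("test_table");
        System.out.println(catalog.locationOf("test_table_renamed"));
        System.out.println(catalog.locationOf("test_table"));
    }
}
```

In this model both lookups return `s3://bucket/examples/test_table`, which mirrors the shared directory described in the report.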

sshkvar · 2020-10-22T06:55:37Z

Hi @rdblue, thanks for the clarification.
But in this case, if we have hive.metastore.thrift.delete-files-on-drop=true, could dropping one table potentially delete the files of all tables placed in the same directory?

What do you think about my proposed solution? We could add a configuration property, and if it is set to true, Iceberg would append a unique UUID to the table location. As a result, each table would have its own unique path, on S3 for example.
https://github.com/sshkvar/presto/pull/2/files

electrum · 2020-10-22T17:25:43Z

The drop table behavior for the Presto Iceberg connector does need to be revisited. We call the metastore drop_table() with deleteData=true, since that's the right behavior for Hive. hive.metastore.thrift.delete-files-on-drop=true is a hack that was put in place since there were some weird cases where the metastore wouldn't clean up the files when Presto called drop_table().

For Iceberg, I believe we should set deleteData=false and do the cleanup on the Presto side.
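
As a rough illustration of why the cleanup strategy matters when two tables share a location, the toy sketch below compares two approaches. It rests on an assumption not stated in the thread: that engine-side cleanup would delete only the files the dropped table references. The class, method names, and file paths are all hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

final class DropCleanupSketch {
    // Directory-based cleanup: removes every object under the table location,
    // regardless of which table wrote it.
    static Set<String> dropByDirectory(Set<String> storage, String tableLocation) {
        Set<String> remaining = new TreeSet<>();
        for (String path : storage) {
            if (!path.startsWith(tableLocation + "/")) {
                remaining.add(path);
            }
        }
        return remaining;
    }

    // Reference-based cleanup (assumed strategy): removes only the files that
    // the dropped table's metadata points to.
    static Set<String> dropByReferences(Set<String> storage, Set<String> referencedFiles) {
        Set<String> remaining = new TreeSet<>(storage);
        remaining.removeAll(referencedFiles);
        return remaining;
    }

    public static void main(String[] args) {
        // Two tables whose files live under the same prefix; paths are made up.
        String location = "s3://bucket/examples/test_table";
        Set<String> renamedTableFiles = Set.of(
                location + "/metadata/renamed.metadata.json",
                location + "/data/renamed.parquet");
        Set<String> newTableFiles = Set.of(
                location + "/metadata/new.metadata.json",
                location + "/data/new.parquet");
        Set<String> storage = new TreeSet<>(renamedTableFiles);
        storage.addAll(newTableFiles);

        // Dropping the new table by directory also removes the renamed table's files.
        System.out.println(dropByDirectory(storage, location));
        // Dropping it by references leaves the renamed table's files in place.
        System.out.println(dropByReferences(storage, newTableFiles));
    }
}
```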

electrum · 2020-10-22T17:28:14Z

I personally like the approach of using a unique directory for Iceberg locations. Having that as an option seems reasonable. But since that's not the "standard" behavior of Iceberg, and because that's not a guarantee in the Iceberg ecosystem, I don't think we should make it the default behavior, nor can we rely on it for correctness when dropping a table.

sshkvar · 2020-10-23T05:25:10Z

@electrum I fully agree with you that the unique name feature shouldn't be the default behavior for the Iceberg connector. That's why I added a new configuration option (in IcebergConfig) that enables this functionality in case the user needs it; by default it is false: private boolean uniqueTableLocation = false;

Also, regarding dropping the table, I have faced another issue. Currently the Iceberg connector does not delete data from S3 at all; please take a look at #5616.

rdblue · 2020-10-23T16:39:40Z

I would be fine with adding a unique UUID to table locations. We do this for our tables in our custom catalog. It would just be a matter of updating how a default table location is generated in the catalog. Feel free to open a PR for Iceberg.

sshkvar · 2020-11-23T18:31:07Z

@rdblue @electrum Based on our discussion, I have created a pull request for it.
I would really appreciate your review.

Sorry for the long pause, I needed to discuss the CLA with my management.

findepi · 2022-06-10T09:04:24Z

Relates to #11400 (for Delta)

@electrum @alexjo2144 @losipiuk @findinpath @phd3 I think we should enable unique locations by default before closing this issue.
WDYT?

rdblue · 2022-06-10T15:18:02Z

@findepi, this is typically up to the catalog to decide, but I think it's reasonable for Trino to add a UUID to table locations.

alexjo2144 · 2022-06-21T19:28:41Z

It seems to me like Trino users most of the time would want unique table locations. I can get behind enabling it by default.

alexjo2144 · 2022-06-22T19:56:56Z

#12941

sshkvar added a commit to sshkvar/presto that referenced this issue Oct 22, 2020

Added ability to generate unique paths for tables trinodb#5632

5adaebf

sshkvar mentioned this issue Oct 22, 2020

Added ability to generate unique paths for tables … sshkvar/presto#1

Closed

sshkvar mentioned this issue Oct 22, 2020

Added ability to generate unique paths for tables sshkvar/presto#2

Open

sshkvar mentioned this issue Nov 23, 2020

Added ability to have unique table location for each iceberg table #6063

Merged

findepi added the bug and correctness labels Jul 23, 2021

findepi mentioned this issue Jul 27, 2021

Spark: Added ability to add uuid suffix to the table location in Hive catalog apache/iceberg#2850

Closed

findepi mentioned this issue Jun 10, 2022

Implement rename table support for Delta connector #11400

Closed

findepi mentioned this issue Jun 10, 2022

Iceberg Connector #1324

Closed

93 tasks

alexjo2144 mentioned this issue Jun 22, 2022

Use unique table locations in Iceberg by default #12941

Merged

findepi closed this as completed in #12941 Aug 1, 2022
