Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iceberg DROP table not removing data from s3 #5616

Closed
sshkvar opened this issue Oct 20, 2020 · 12 comments · Fixed by #11062
Closed

Iceberg DROP table not removing data from s3 #5616

sshkvar opened this issue Oct 20, 2020 · 12 comments · Fixed by #11062
Assignees

Comments

@sshkvar
Copy link
Contributor

sshkvar commented Oct 20, 2020

Hi,
I am using presto (version 343) with Iceberg connector and found issue:

Tables which was created as CREATE TABLE ... or CREATE TABLE AS SELECT... then dropped - it will remove data only from Hive Metastore but not from s3.
Then if we try to create same table again - we will have table already exists exception

Example:

CREATE TABLE iceberg.test_table (id varchar(25));
INSERT INTO iceberg.test_table values ('1');
DROP TABLE iceberg.test_table

-- Table removed from Hive metastore, but files still exists in s3

Note:
In connector config we have hive.metastore.thrift.delete-files-on-drop=true

Issue investigation results

  1. CREATE TABLE - when we create table (see HiveTableOperations), we have hardcoded TableType=EXTERNAL_TABLE and parameter (EXTERNAL = TRUE)
             Table.Builder builder = Table.builder()
                        ...
                        .setTableType(TableType.EXTERNAL_TABLE.name())
                        ...
                        .setParameter("EXTERNAL", "TRUE")
                        ...
  1. DROP TABLE - we have 2 places which responsible for data deletion from s3
    2.1 First one is Hive Metastore - it will delete files from EXTERNAL table only if we passed external.table.purge=TRUE parameter to it
    2.2. Second place is ThriftHiveMetastore, but as we have hardcoded TableType=EXTERNAL_TABLE - this code will be skipped
  if (deleteFilesOnDrop && deleteData && isManagedTable(table) && !isNullOrEmpty(tableLocation)) {
                                deleteDirRecursive(hdfsContext, hdfsEnvironment, new Path(tableLocation));
                            }
#isManagedTable(table) -> false, because on table create we set tableType = EXTERNAL_TABLE

As a result data from s3 not deleted

@sshkvar
Copy link
Contributor Author

sshkvar commented Oct 22, 2020

Possible solution would be adding new IcebergTableProperty (type) which will be configurable (EXTERNAL_TABLE by default) but user will be able to change it to one of MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW, MATERIALIZED_VIEW on table creation step

something like it https://github.com/prestosql/presto/pull/5656/files

@rdblue
Copy link
Contributor

rdblue commented Nov 24, 2020

@electrum, why would the Hive table type matter for whether the data for an Iceberg table gets deleted? Iceberg provides utilities for cleaning up the data if locations are mixed, and if the table owns a location then Presto should be able to simply delete recursively.

I can understand wanting to control this behavior. We would turn it off, for example, because we have a service that cleans up old data. But by default I would expect drop table to remove the data.

Another solution is to stop checking whether the path exists. Iceberg tables won't conflict with other uses of a prefix and can clean up the data referenced by the table, so other data can exist in the location without correctness issues for Iceberg. I would probably relax that constraint.

@sshkvar
Copy link
Contributor Author

sshkvar commented Nov 24, 2020

Metastore will delete files from s3 but only for tables with type MANAGED_TABLE.
I have created PR which contains new configuration property for Iceberg tables.
By default it is set to EXTERNAL_TABLE, but we can set it to MANAGED_TABLE and it this case metastore will delete files on table drop

@rdblue
Copy link
Contributor

rdblue commented Nov 24, 2020

@sshkvar, that is the correct behavior for Hive tables, but not for Iceberg tables. I think Iceberg tables should continue to use external and the Iceberg connector should clean up the files.

@sshkvar
Copy link
Contributor Author

sshkvar commented Nov 26, 2020

@rdblue @electrum based on our discussion I have created additional PR #6108.
This PR adding ability to recursively delete table data on table drop. This functionality configurable and disabled by default (current behavior)
So Iceberg connector is responsible for deleting table data, @rdblue @electrum what do you think about this changes, is it make sense?

@sshkvar
Copy link
Contributor Author

sshkvar commented Dec 3, 2020

@electrum @rdblue what do you think about PR #6108?

@rdblue
Copy link
Contributor

rdblue commented Dec 7, 2020

I'm going to put an item on the agenda for the next Iceberg community sync to talk about prefix ownership. There are some good questions here:

  1. Should we assume that a table owns a prefix and can delete it when it is removed?
  2. If yes, where should that happen? If not, should we have a config property to turn it on?

@sshkvar
Copy link
Contributor Author

sshkvar commented Dec 8, 2020

@rdblue I have created another merge request which adds ability to have unique table location for each table #6063, so based on it each table will have unique location and can delete it when it is removed. Also this PR #6108 adds ability to recursively delete table data on drop (disabled by default, but can be enabled in configuration).
So if both configuration will be set to true (iceberg.unique-table-location=true and iceberg.delete-files-on-table-drop=true) we will be able to safe delete table data

@SinghAsDev
Copy link

Curious, what was decided on this? Users from hive and spark-sql world are pretty used to the notion of managed table drops would drop the data, and external tables won't.

@rdblue
Copy link
Contributor

rdblue commented Jan 26, 2021

We decided there are valid use cases where the table owns its location and where it doesn't own its location. So the consensus was to create either a table flag or a catalog flag to control it. I think we should add a flag to the Iceberg catalog to determine whether it will drop the data location recursively.

@sshkvar
Copy link
Contributor Author

sshkvar commented Jan 26, 2021

@rdblue please take a look on this pull request #6108

I this pull request I added new configuration property to Iceberg catalog iceberg.delete-files-on-table-drop
This property is disabled by default. If iceberg.delete-files-on-table-drop=true it will drop table data location recursively.

@findepi
Copy link
Member

findepi commented Jul 16, 2021

So the consensus was to create either a table flag or a catalog flag to control it. I think we should add a flag to the Iceberg catalog to determine whether it will drop the data location recursively.

@rdblue was it decided where the flag is stored and what its name?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

5 participants