Implement rename table support for Delta connector #11400
There are two kinds of Delta Lake tables: ones with an explicit, user-provided location and ones with an implicit location. See trino/plugin/trino-delta-lake/src/main/java/io/trino/plugin/deltalake/DeltaLakeMetadata.java, lines 591 to 599 at commit 1464cb8.
The behavior of DROP TABLE already differs between them (data is removed only for tables with implicit locations). For "managed" tables, ones with implicit locations, we have a few options.

Option AS_IS: we leave the table location as-is when renaming the table.
Then table rename is O(1) (as it should be). There is a caveat that the location no longer matches the table name:

```sql
CREATE TABLE tmp;
ALTER TABLE tmp RENAME TO desired_name;
CREATE TABLE tmp; -- fails because <storage>/tmp location already exists
```

Also, even without the user re-using table names like above:

```sql
CREATE TABLE tmp_123;
ALTER TABLE tmp_123 RENAME TO desired_name;
```

the files of the desired_name table end up "hidden" under the unrelated tmp_123 location.

Option RENAME_LOCATION: rename the table location in the same manner as the table name.
This will be O(1) on some storages and O(data size) on e.g. S3. There is a caveat that the feature is probably unusable on S3, while S3 is probably the most popular storage these days.

Option EXTONLY: limit RENAME TO to "non-managed" tables, i.e. ones with an explicit location.
This avoids the problem. If the user provided the table location explicitly, we shouldn't change it. There is a caveat that this requires users to think in terms of paths, so a SQL-centric person cannot just do

```sql
CREATE TABLE tmp_123;
ALTER TABLE tmp_123 RENAME TO desired_name;
```

they need to add some explicit `WITH (location = ...)` to the CREATE TABLE.

Option RANDOMIZE: randomize table locations and keep them as-is when renaming.
When creating an implicit table location for a new table, append a randomized suffix to the table name. This way, the location is either randomized, or provided explicitly by the user. There is a caveat that RENAME TO's behavior depends on the configuration at the time of CREATE TABLE. (See the sketch after these options.)

Option SHRUG: delegate the problem to the metastore and 🤷♂️.
HMS will do renames on HDFS, and I-don't-currently-know what it will do with paths on S3. Glue will not do renames. There is a caveat that the behavior will be metastore-dependent, and may further depend on metastore configuration. This will be confusing and frustrating.
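To make the RANDOMIZE option concrete, here is a minimal sketch of how implicit locations could behave under it; the suffix scheme and table names are hypothetical:

```sql
CREATE TABLE orders (id bigint);         -- implicit location gets a random suffix, e.g. <storage>/orders-a1b2c3
ALTER TABLE orders RENAME TO orders_v2;  -- O(1): the location stays <storage>/orders-a1b2c3
CREATE TABLE orders (id bigint);         -- no collision: a fresh location such as <storage>/orders-d4e5f6
```

Because the physical location never has to follow the table name, the AS_IS collision shown above cannot occur.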
@findepi: I agree that not moving the data is the best. Note that in the current code, tables with an implicit location are created as `MANAGED_TABLE`. We could set the table type to `EXTERNAL_TABLE` instead.

Also, changing the table type to `EXTERNAL_TABLE` means the metastore will also NOT drop the data on DROP TABLE.
@mdesmet the DROP behavior is exactly why we differentiate tables with explicit and implicit locations.
I know that, but I'm really talking about the Hive Metastore's table type. That can be either `MANAGED_TABLE` or `EXTERNAL_TABLE`. In Delta Lake we currently create tables without an explicit location as `MANAGED_TABLE`.
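For context, a minimal sketch of the two kinds of tables in Trino SQL; the catalog, schema, and storage paths are hypothetical, while `location` is the Delta connector's table property for an explicit location:

```sql
-- Implicit location: the connector derives the path, and DROP TABLE removes the data
CREATE TABLE delta.sales.orders_managed (id bigint);

-- Explicit location: the user pins the path, and DROP TABLE leaves the data in place
CREATE TABLE delta.sales.orders_external (id bigint)
WITH (location = 's3://my-bucket/warehouse/orders_external');
```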
Relates to #5632
@findepi thanks for drafting the alternatives. I like the randomize idea too. Where would it not be sane?
Currently the Delta connector does not support table renames.

Supporting table renames would allow us to use the Delta connector in a standard dbt setup. That would be very beneficial, as Delta is currently the only table format supporting row-level DELETE and UPDATE that works with the standalone Hive metastore. Hive transactional tables only work with specific Hive versions (see this issue), and Iceberg doesn't support row-level DELETE and UPDATE either. Enabling table renames in Delta is consequently a low-hanging fruit.

Delta supports both external and managed tables. Table renames should ideally support both cases and work as they do in, for example, Spark with the Hive metastore. Managed tables are more suitable for a dbt setup because DROP TABLE removes the data on the underlying storage, which makes rerunning the dbt project idempotent and frictionless: a rerun won't require manually removing data, whereas otherwise table creation would fail because data already exists in the relevant directories.
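As a sketch of that idempotency, assuming a hypothetical working table name:

```sql
-- Dropping a managed table removes its data too, so recreating the table
-- cannot fail with data already existing in the implicit location
DROP TABLE IF EXISTS orders_tmp;
CREATE TABLE orders_tmp (id bigint, amount double);
```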
dbt works as follows:

Initial load
1. Create a working table orders_tmp and do its work there.
2. When done, rename the production table orders out of the way, then rename orders_tmp to the production table name (in Snowflake this is done atomically using CREATE OR REPLACE TABLE); see the sketch after these steps.
3. Drop the old production table.
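A minimal sketch of that swap in Trino SQL, with hypothetical table names (dbt's generated statements may differ):

```sql
CREATE TABLE orders_tmp AS SELECT * FROM staging.orders;  -- build the new data
ALTER TABLE orders RENAME TO orders_backup;               -- move production aside
ALTER TABLE orders_tmp RENAME TO orders;                  -- promote the new table
DROP TABLE orders_backup;                                 -- clean up
```

Note that this swap is not atomic in Trino: between the two renames there is a moment when no orders table exists.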
Incremental loads
Out of the box, dbt has some support for MERGE statements; however, MERGE is not yet supported in Trino. We could instead apply INSERT, UPDATE, and DELETE statements, knowing that this wouldn't offer the same atomicity as a MERGE statement (see the sketch below).
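For illustration, a non-atomic emulation of MERGE using DELETE plus INSERT, with hypothetical table and column names:

```sql
-- Not atomic: a concurrent reader can observe the state between the two statements
DELETE FROM orders WHERE id IN (SELECT id FROM orders_updates);
INSERT INTO orders SELECT * FROM orders_updates;
```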