Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support ALTER TABLE RENAME TO in Delta #11401

Merged
merged 2 commits into from
Jun 25, 2022

Conversation

mdesmet
Copy link
Contributor

@mdesmet mdesmet commented Mar 9, 2022

Description

Is this change a fix, improvement, new feature, refactoring, or other?

New feature

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

Change of the Delta connector and potentially all connectors based on the Hive connector

How would you describe this change to a non-technical end user or system administrator?

Implement support for table renames in the Delta connector

Related issues, pull requests, and links

Documentation

( ) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Mar 9, 2022
Copy link
Member

@electrum electrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing the storage location for the table seems wrong, as any existing data for the table would be lost. Try changing the test to insert into the table before renaming, then verify the data can be read after the rename.

I think the only thing needed to implement rename is to call metastore.renameTable().

@mdesmet
Copy link
Contributor Author

mdesmet commented Mar 9, 2022

That's what I have tried first. What I saw is that "location" in the table properties still refers to the old location after running metastore.renameTable() while the data has been moved to a new location. This seems to work different from Iceberg, where data stays in the original location.

Error reading tail from s3://test-delta-lake-connector-test-1w16l3nx49/test/test_specific_delete_1m2jwgl7y1/c7910ba8-c927-4d7c-9cad-d92644c345a1

java.lang.AssertionError: Execution of 'actual' query failed: SELECT * FROM test.test_specific_delete_xwrdi15au6

In my test I'm already using a CTAS, so data will be inserted before the rename and is sucessfully retrieved after the rename.

@mdesmet mdesmet requested a review from alexjo2144 March 9, 2022 22:38
@mdesmet
Copy link
Contributor Author

mdesmet commented Mar 10, 2022

In table creation, if location is set, the table is considered as external (unmanaged table), if not the location is set based on the schema location if avaiable and the table becomes a managed table. Trino throws an exception if the schema location is not set.

This PR will also throw an exception if the schema location is not set, however there may be an issue if the schema contains both managed and unmanaged tables, the code would convert a before unmanaged table to a managed table by simply renaming it.

@findepi
Copy link
Member

findepi commented Mar 10, 2022

Changing the storage location for the table seems wrong, as any existing data for the table would be lost. Try changing the test to insert into the table before renaming, then verify the data can be read after the rename.

@electrum depends whether Delta's transaction log contains relative or absolute paths.
We write relative paths and we support relative path only (today). For relative, directory rename is safe (although expensive on S3)
But the spec allows (i think) both relative and absolute, and rename would be breaking.

For this reason i think we shouldn't rename the location.

  • no rename is cheaper, especially on S3
  • we won't have problems with absolute paths in transaction log

@mdesmet
Copy link
Contributor Author

mdesmet commented Mar 11, 2022

In spark it works as follows:

If managed -> update to the new storage location and send the table to hms, hms will do the location rename operation and rename the table
if unmanaged -> just do the rename in the catalog and keep the existing storage location, hms wil just do the table rename.

See the relevant spark code

I remember that's why in iceberg the table is set to unmanaged and the storage location is not updated when renaming the table. Maybe different from Iceberg is that Delta has both managed and unmanaged tables.

Why not support both cases, similar to Spark and add a session property as @alexjo2144 mentioned on Slack just as in Databricks that needs to be enabled to rename managed tables? Simply keeping the existing location would require us to convert a managed into a unmanaged table.

But the spec allows (i think) both relative and absolute, and rename would be breaking.

Indeed the log entries could be made with both relative path or absolute path.
See the Delta spec

However on first sight this not handled in Spark/Delta as well. Maybe Databricks does something additional there? this would only be a real issue when it's done for managed tables.

@findepi
Copy link
Member

findepi commented Mar 11, 2022

If managed -> update to the new storage location and send the table to hms, hms will do the location rename operation and rename the table
if unmanaged -> just do the rename in the catalog and keep the existing storage location, hms wil just do the table rename.

What about Glue?

@mdesmet
Copy link
Contributor Author

mdesmet commented Mar 14, 2022

What about Glue?

Glue at this moment doesn't support table renames. There is only an UpdateTable API but that doesn't perimit renaming the table. See AWS glue docs

It's also blocked within Trino in the GlueHiveMetastore class

@Override
public void renameTable(String databaseName, String tableName, String newDatabaseName, String newTableName)
{
    throw new TrinoException(NOT_SUPPORTED, "Table rename is not yet supported by Glue service");
}

@mdesmet
Copy link
Contributor Author

mdesmet commented Mar 15, 2022

I also looked in the Hive Metastore code if we are able to do a rename without changing the location for a managed table.

The data is moved when

// ii) Rename needs change the data location and move the data to the new location corresponding
// to the new name if:
// 1) the table is not a virtual view, and
// 2) the table is not an external table, and
// 3) the user didn't change the default location (or new location is empty), and
// 4) the table was not initially created with a specified location

relevant code

This means that we can't stop HMS to change the location unless we (temporarely?) convert it to a external table by setting the EXTERNAL parameter to "TRUE". So it would be up to 2 metadata operations, The first operation is to change the external property to TRUE for managed tables and rename the table, the second operation to revert the external property for managed tables.

Would that be an acceptable behaviour? Or if not, are there any other suggestions on what would be an acceptable behaviour?

@mdesmet mdesmet force-pushed the feature/delta_table_rename branch 2 times, most recently from 414881d to ec37cc8 Compare March 20, 2022 21:04
QueryAssertions.assertUpdate(queryRunner, queryRunner.getDefaultSession(), "CREATE TABLE " + tableName + " AS SELECT 123 x", OptionalLong.of(1), Optional.empty());

String renamedTable = "test_rename_new_" + randomTableSuffix();
QueryAssertions.assertQueryFails(queryRunner, getSession(), "ALTER TABLE " + tableName + " RENAME TO " + renamedTable, "Renaming managed tables will move the data on the underlying storage system. Set 'delta.rename-managed-tables.enabled' to enable this feature.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use here assertThatThrownBy instead of changing the visibility of the method QueryAssertions.assertQueryFails

Copy link
Contributor Author

@mdesmet mdesmet Mar 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would work, but isn't changing the visibility the better option here? Why is public static void assertUpdate(QueryRunner queryRunner, Session session, @Language("SQL") String sql, OptionalLong count, Optional<Consumer<Plan>> planAssertion) public? Changing the visibility would allow tests to be colocated to test features that require multiple queryrunners and have a similar recognizable API. Other alternative would be to move it to a separate test class.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertThatThrownBy(() -> query) or leverage inherited AbstractTestQueryFramework#assertQueryFails.

(the fact that you couldn't make a static import could ring a bell:)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why I'm doing this is because I create another QueryRunner in this specific test without the setting delta-rename-managed-tables.enabled=true. That's why i can't simply call the standard methods.

I will move this test to a separate class.

}

@Config("delta.rename-managed-tables.enabled")
public DeltaLakeConfig setRenameManagedTablesEnabled(boolean renameManagedTablesEnabled)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an @ConfigDescription for the setting for explaining the reasoning for this setting.

@mdesmet
Copy link
Contributor Author

mdesmet commented Mar 22, 2022

I have changed the implementation to work as follow:

  • External tables -> do the standard HMS rename (under the hood also a alter table statement in HMS)
  • Managed tables -> As this will move the data it requires an opt-in delta.rename-managed-tables.enabled, We update the the location property in the parameters map and the SERDE parameter in the storage location in the Hive table and execute an alter table statement on HMS, HMS will then move the data and put the new location in the storage descriptor. This is also how it works in Spark.

To make it work as in Iceberg (not moving the data) we should make every table external, but then data won't be removed on DROP (which is different from how it's done in Iceberg). I think Delta in Trino should mimic how it works within the Spark/Databricks ecosystem, hence I consider supporting both managed tables and external tables the better option

Test cases have been added for both external and managed tables.

@mdesmet mdesmet force-pushed the feature/delta_table_rename branch from ec37cc8 to 01c656f Compare March 23, 2022 11:03
@@ -1504,6 +1507,43 @@ public void dropTable(ConnectorSession session, ConnectorTableHandle tableHandle
metastore.dropTable(session, handle.getSchemaName(), handle.getTableName());
}

@Override
public void renameTable(ConnectorSession session, ConnectorTableHandle tableHandle, SchemaTableName newTable)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep the names in line with what's in ConnectorMetadata

Suggested change
public void renameTable(ConnectorSession session, ConnectorTableHandle tableHandle, SchemaTableName newTable)
public void renameTable(ConnectorSession session, ConnectorTableHandle tableHandle, SchemaTableName newTableName)

Database schema = metastore.getDatabase(newTable.getSchemaName()).orElseThrow(() -> new SchemaNotFoundException(newTable.getSchemaName()));
Optional<String> schemaLocation = getSchemaLocation(schema);
if (schemaLocation.isEmpty()) {
throw new TrinoException(NOT_SUPPORTED, "The 'location' property must be specified either for the table or the schema");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "Target schema must have a location when moving managed tables"

Comment on lines 1526 to 1527
String location = new Path(schemaLocation.get(), newTable.getTableName()).toString();
Path targetPath = new Path(location);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd flip these to avoid the double Path construction, alternatively inline the toString calls

Suggested change
String location = new Path(schemaLocation.get(), newTable.getTableName()).toString();
Path targetPath = new Path(location);
Path targetPath = new Path(schemaLocation.get(), newTable.getTableName());
String location = targetPath.toString();

}

@Test
public void testRenameManagedTableIsNotSupportedWhenRenameManagedTablesNotEnabled()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a test that shows the underlying files are actually moved when renaming managed tables?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default public void testRenameTable() from BaseConnectorSmokeTest tests renaming of a managed table. Now we don't actually verify if the files are moved, do you think that's useful? In the end you'll be testing what hive metastore is doing. If hive metastore starts doing something different, the read query would fail. We depend on the SERDE location in the storage descriptor for the reads.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I forgot about the base class tests. That's fine with me

@alexjo2144
Copy link
Member

Overall looks really good, thanks for working on this!

@mdesmet
Copy link
Contributor Author

mdesmet commented Mar 23, 2022

@findepi : on renaming Glue tables, actually I think it should be doable similar to how materialized views are replaced with 2 metadata operations (create and drop of the storage table).

image

If you think that's interesting we can create another issue to tackle that. There is definitely user demand as in dbt-trino they just introduced a PR for glue tables where they work around the rename swap of the production table with the new production table by dropping the production table and then running the production table CTAS (effectively having no access to the production table during the rebuild process).

@mdesmet mdesmet force-pushed the feature/delta_table_rename branch 3 times, most recently from 3f34c3b to c1bc7b6 Compare March 25, 2022 06:20
@alexjo2144
Copy link
Member

Can you add something to the Delta documentation page, including why managed table renames require an opt-in?

@mdesmet mdesmet force-pushed the feature/delta_table_rename branch from c1bc7b6 to 2877fe9 Compare March 30, 2022 13:47
@github-actions github-actions bot added the docs label Mar 30, 2022
@mdesmet mdesmet requested a review from alexjo2144 April 1, 2022 14:20
@mdesmet mdesmet force-pushed the feature/delta_table_rename branch from 2877fe9 to 6bb959d Compare April 1, 2022 17:40
@alexjo2144 alexjo2144 requested a review from findepi April 14, 2022 19:17
Renaming tables
^^^^^^^^^^^^^^^

When using Hive metastore, you can use the connector to :doc:`rename
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can rename Delta Lakes tables with ALTER TABLE.


* Tables created without explicit location

Renaming will move the data on the underlying storage system. As this
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renaming moves

Renaming will move the data on the underlying storage system. As this
can be an expensive operation, an explicit opt-in is required. Set
``delta.rename-managed-tables.enabled`` to ``true`` in your catalog
properties and restart your Trino server to activate this behavior.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an explicit opt-in is required with setting delta.rename-managed-tables.enabled to true.

Drop the rest


* Tables created with explicit location

Renaming is only a metadata operation and requires no additional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

operation, and requires

Renaming is only a metadata operation and requires no additional
configuration.

Rename operations on Delta Lake tables backed by Glue metastore are not supported.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove .. we do no document what we do not support.

@mdesmet mdesmet force-pushed the feature/delta_table_rename branch from 3c9f1b9 to 67f138a Compare May 8, 2022 19:48
@mdesmet mdesmet requested a review from findinpath May 9, 2022 06:35
@mdesmet mdesmet requested review from mosabua and findepi May 17, 2022 21:26
Copy link
Member

@mosabua mosabua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs look good.. no input from my side about the rest..

Copy link
Member

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay.
I am still unsure what the rename semantics for tables with implicit locations should be.
there is a bunch of options.
The RENAME of tables with explicit location is a clearly defined problem (location mustn't change).

@mdesmet I don't want to block this awesome enhancements for "external" tables. This is a very needed feature. I also don't want to jump the gun on the handing of "managed" tables (with implicit locations). Will add some considerations in a bit.
Can you please split this PR into two PRs? Here, keep support for renaming "external" tables only (with explicit location) and move the "managed" to a new one?

cc @alexjo2144 @hashhar @claudiusli @findinpath @losipiuk

Comment on lines 459 to 461
Renaming moves the data on the underlying storage system. As this
can be an expensive operation, an explicit opt-in is required with
setting ``delta.rename-managed-tables.enabled`` to ``true``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this true for all metastores, or only for HMS?

Comment on lines 119 to 122
* - ``delta.rename-managed-tables.enabled``
- Enable :ref:`renaming tables <renaming-tables>` created without
explicit location. A rename moves the data on the underlying storage
system. As moving data on the underlying storage system is a potentially
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apologies, but i am still unsure what the rename semantics for tables with implicit locations should be.
there is a bunch of options, and sending rename to metastore is one of them, but i am not yet convinced it's the best one.

@findepi
Copy link
Member

findepi commented May 24, 2022

I am still unsure what the rename semantics for tables with implicit locations should be.
there is a bunch of options.

Added Options as a comment here: #11400 (comment)

@mdesmet mdesmet force-pushed the feature/delta_table_rename branch 3 times, most recently from 6b121b2 to 57eaaaa Compare June 19, 2022 19:42
@mdesmet mdesmet requested a review from findepi June 19, 2022 22:08
@findepi findepi force-pushed the feature/delta_table_rename branch from 57eaaaa to e3919f1 Compare June 24, 2022 15:03
@findepi findepi force-pushed the feature/delta_table_rename branch 2 times, most recently from 03d7f57 to 562979c Compare June 24, 2022 15:42
@findepi findepi force-pushed the feature/delta_table_rename branch from 562979c to e83abac Compare June 24, 2022 15:53
@findepi
Copy link
Member

findepi commented Jun 25, 2022

CI #12950

@findepi findepi merged commit 0ee0930 into trinodb:master Jun 25, 2022
@github-actions github-actions bot added this to the 388 milestone Jun 25, 2022
@findepi findepi mentioned this pull request Jun 25, 2022
@colebow
Copy link
Member

colebow commented Jun 27, 2022

Did the docs changes get deleted from this PR / moved into a different one? We'll still need to document this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Implement rename table support for Delta connector
7 participants