Support snapshot queries on Iceberg system tables - custom IcebergSystemTableHandle extends SystemTableHandle #13854

findinpath
Contributor

Description

Add the ability to query Iceberg system tables with
time travel queries, either by snapshot or by timestamp.

Previously, versioned queries on system tables were
supported by encoding the snapshot id in the
table name:

SELECT * FROM table1$files@1242141

That syntax is now deprecated and
replaced by a richer syntax:

SELECT * FROM table1$files FOR VERSION AS OF 1242141
SELECT * FROM table1$files FOR TIMESTAMP AS OF TIMESTAMP '2022-01-01'
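
As a rough illustration of how the deprecated form maps onto the new one, the sketch below rewrites a legacy `table$type@snapshotId` reference into the `FOR VERSION AS OF` form. The class `LegacySnapshotSyntax` and its method are hypothetical and are not part of Trino or of this PR. The regular expression is also an assumption about what a legacy reference looks like, based only on the example above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Trino: rewrites the deprecated
// "table$type@snapshotId" reference into the FOR VERSION AS OF form.
final class LegacySnapshotSyntax
{
    // Assumed shape: table name, '$', system table type, '@', numeric snapshot id.
    private static final Pattern LEGACY =
            Pattern.compile("(?<table>[^$@]+)\\$(?<type>[a-z_]+)@(?<snapshot>\\d+)");

    private LegacySnapshotSyntax() {}

    static String toVersionedReference(String tableReference)
    {
        Matcher matcher = LEGACY.matcher(tableReference);
        if (!matcher.matches()) {
            // Not in the legacy form: return the reference unchanged.
            return tableReference;
        }
        return matcher.group("table") + "$" + matcher.group("type")
                + " FOR VERSION AS OF " + matcher.group("snapshot");
    }
}
```

For example, `LegacySnapshotSyntax.toVersionedReference("table1$files@1242141")` yields `table1$files FOR VERSION AS OF 1242141`, which matches the first query above.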

SystemTableHandle has been turned into an interface
and moved to trino-spi so that the Iceberg connector
can provide a custom implementation of it that
supports time travel on the metadata
tables.
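
The idea of a connector-specific handle can be sketched roughly as follows. This is only an illustration of the pattern, not the code in this PR: `SystemTableHandleSketch` and `IcebergSystemTableHandleSketch` are made-up names, and the methods shown are assumptions rather than the actual `SystemTableHandle` API in trino-spi.

```java
import java.util.Objects;
import java.util.OptionalLong;

// Illustrative stand-in for the SPI-level interface; the real
// SystemTableHandle in trino-spi may declare different methods.
interface SystemTableHandleSketch
{
    String getSchemaName();

    String getTableName();
}

// Hypothetical connector-specific handle: in addition to the table
// identity it carries the snapshot that the metadata table is read at.
final class IcebergSystemTableHandleSketch
        implements SystemTableHandleSketch
{
    private final String schemaName;
    private final String tableName;
    private final OptionalLong snapshotId;

    IcebergSystemTableHandleSketch(String schemaName, String tableName, OptionalLong snapshotId)
    {
        this.schemaName = Objects.requireNonNull(schemaName, "schemaName is null");
        this.tableName = Objects.requireNonNull(tableName, "tableName is null");
        this.snapshotId = Objects.requireNonNull(snapshotId, "snapshotId is null");
    }

    @Override
    public String getSchemaName()
    {
        return schemaName;
    }

    @Override
    public String getTableName()
    {
        return tableName;
    }

    // Empty means "read the current snapshot".
    OptionalLong getSnapshotId()
    {
        return snapshotId;
    }

    // Returns a copy of this handle pinned to the given snapshot.
    IcebergSystemTableHandleSketch withSnapshotId(long snapshotId)
    {
        return new IcebergSystemTableHandleSketch(schemaName, tableName, OptionalLong.of(snapshotId));
    }
}
```

Keeping the snapshot on the handle, rather than encoding it in the table name, is what would let a `FOR VERSION AS OF` clause be resolved once and then passed along with the handle.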

Is this change a fix, improvement, new feature, refactoring, or other?

Bugfix

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

Iceberg connector

How would you describe this change to a non-technical end user or system administrator?

The metadata tables that the Iceberg connector exposes for its tables should support snapshot/versioned queries in the same way as regular Iceberg tables.

SELECT * FROM table1$files FOR VERSION AS OF 1242141
SELECT * FROM table1$files FOR TIMESTAMP AS OF TIMESTAMP '2022-01-01'

Related issues, pull requests, and links

Fixes #12736

This PR takes a different approach from #13497: it extends SystemTableHandle to cover the needs of Iceberg time travel queries.

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
(x) Release notes entries required with the following suggested text:

# Iceberg
* Support snapshot queries on Iceberg system tables

@cla-bot cla-bot bot added the cla-signed label Aug 25, 2022
@findinpath changed the title from "Support snapshot queries on Iceberg system tables" to "Support snapshot queries on Iceberg system tables - custom IcebergSystemTableHandle extends SystemTableHandle" on Aug 25, 2022
@findinpath force-pushed the iceberg-metadata-tables-versioned-queries branch from f5780f3 to 7d40fd2 on August 26, 2022 04:49
Development

Successfully merging this pull request may close these issues.

Support time travel on Iceberg system tables