Document Iceberg time travel & snapshot queries #10515
LGTM, % nits.
IIRC @electrum had an opinion on whether to document snapshot querying at all, but I can't find the relevant PR/issue now.
The snapshot queries can be also used with special tables::
nit: can we avoid double newlines for uniformity in this file?
If we have plans to use the new
The functionality is already provided in Trino. I currently see no harm in documenting it.
I agree with @hashhar. We should implement the new
The existing Iceberg syntax is problematic since the number can mean either a specific snapshot ID or an "as of" timestamp. Because snapshot IDs are random and overlap with the timestamp space, a query relying on it as a snapshot may return incorrect results if the timestamp value happens to be a snapshot ID. It should be easy to implement the
Sounds good, seems like this is close: #10258. @findinpath we'd need docs after the above gets merged, so we can update this one afterwards or just create a new one.
This looks good to me. Found my way over to some of these Iceberg documentation PRs based on questions asked in the Iceberg slack workspace. 👋
Rolling back to a previous snapshot
-----------------------------------
Snapshots
---------

Iceberg supports a "snapshot" model of data, where table snapshots are
identified by an snapshot IDs.
Nit: "an snapshot" -> "a snapshot".
This might be a good place to move the introduction of the hidden metadata table "$snapshots" as well.
Though given you're not touching that paragraph, it's totally up to you 🙂
Modified to "snapshots are identified by a snapshot ID".
My bad. I missed the comment from @phd3 above. I found my way here as there was a lot of discussion in the Iceberg slack workspace on Trino docs recently. But agreed with the comments above about potential collisions with snapshot ID and timestamps.
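For readers following the collision concern raised above, here is a minimal Python sketch of how one number can be read two ways. It is not Trino or Iceberg code: the `interpretations` helper, the sample snapshot IDs, the use of epoch milliseconds, and the "latest snapshot committed at or before the timestamp" rule are all assumptions made for illustration.

```python
# Hypothetical snapshot list: snapshot ID -> commit time (assumed epoch millis).
# The first ID is random but happens to fall inside the timestamp range.
SNAPSHOTS = {
    1650000000000: 1640000000000,
    7000000000000000001: 1645000000000,
}


def interpretations(value, snapshots):
    """Return every way `value` could be resolved to a snapshot."""
    readings = []
    # Reading 1: the number is an exact snapshot ID.
    if value in snapshots:
        readings.append(("snapshot_id", value))
    # Reading 2: the number is an "as of" timestamp; pick the latest
    # snapshot committed at or before it (assumed resolution rule).
    eligible = [(ts, sid) for sid, ts in snapshots.items() if ts <= value]
    if eligible:
        readings.append(("as_of_timestamp", max(eligible)[1]))
    return readings


if __name__ == "__main__":
    # The same number resolves to two different snapshots.
    for kind, snapshot_id in interpretations(1650000000000, SNAPSHOTS):
        print(kind, snapshot_id)
```

In this sketch the value 1650000000000 matches one snapshot when read as an ID and a different snapshot when read as a timestamp, which is the kind of silent mismatch the comments above are worried about.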
Description
Document Iceberg time travel & snapshot queries
Related issues, pull requests, and links
Fixes #12745
There are 2 ongoing PRs both trying to offer (with different approaches) the ability to do time travel on metadata tables:
instanceof handling in IcebergMetadata #13497
If any of the above-mentioned PRs land, we'll need to update the documentation.
Documentation
( ) No documentation is needed.
(x) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.