[Feature Request] Row Commit Versions #1715

Open
1 of 3 tasks
tomvanbussel opened this issue Apr 24, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@tomvanbussel
Collaborator

Feature request

Overview

The Delta specification was recently extended to include Row IDs, which can be used to uniquely identify a row across multiple versions of a table. We now propose to extend this with a Row Commit Version, which can be used together with the Row ID to uniquely identify a version of a row. The Row Commit Version stores the last commit version in which the row was either inserted or updated (but not copied to a different file).
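
To make the semantics concrete, here is a toy in-memory model of the rule described above. It is only a sketch: `ToyTable`, `Row`, and every method name are hypothetical and are not part of Delta Lake or its specification. It assumes that each operation is one commit and that a compaction copies rows to a new file without changing them.

```python
from dataclasses import dataclass


@dataclass
class Row:
    row_id: int              # stable across table versions
    row_commit_version: int  # last commit that inserted or updated the row
    value: str


class ToyTable:
    """Hypothetical in-memory stand-in for a table; not Delta Lake code."""

    def __init__(self):
        self.version = -1      # no commits yet
        self.next_row_id = 0
        self.files = {}        # file name -> list of Row

    def _commit(self):
        self.version += 1
        return self.version

    def insert(self, file_name, values):
        # New rows get a fresh Row ID and the inserting commit's version.
        version = self._commit()
        rows = []
        for value in values:
            rows.append(Row(self.next_row_id, version, value))
            self.next_row_id += 1
        self.files[file_name] = rows

    def update(self, row_id, new_value):
        # An update keeps the Row ID but bumps the Row Commit Version.
        version = self._commit()
        for rows in self.files.values():
            for row in rows:
                if row.row_id == row_id:
                    row.value = new_value
                    row.row_commit_version = version

    def compact(self, new_file_name):
        # Copying rows to a different file is a commit, but it changes
        # neither the Row ID nor the Row Commit Version.
        self._commit()
        merged = [row for rows in self.files.values() for row in rows]
        self.files = {new_file_name: merged}

    def rows(self):
        return {row.row_id: row for rows in self.files.values() for row in rows}


table = ToyTable()
table.insert("file-a", ["x", "y"])  # commit 0
table.update(1, "y2")               # commit 1
table.compact("file-b")             # commit 2, rows are only copied
```

After these three commits the table is at version 2, row 0 still carries commit version 0, and row 1 carries commit version 1, so the pair (Row ID, Row Commit Version) distinguishes `y` from `y2`.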

Motivation

Row Commit Versions can be used together with Row IDs to maintain derived tables.
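
One possible reading of this, sketched under assumptions that the issue itself does not spell out: a derived table keyed by Row ID remembers the last source version it was synced to, and on refresh recomputes only the rows whose Row Commit Version is newer. The function name `refresh_derived`, the dict layout, and the upper-casing transformation are all made up for illustration; the design doc may describe a different mechanism.

```python
def refresh_derived(source, derived, last_synced_version):
    """Incrementally refresh a derived table (hypothetical sketch).

    source:  dict of row_id -> (row_commit_version, value)
    derived: dict of row_id -> derived value, as of last_synced_version
    """
    # Rows whose Row ID no longer exists in the source were deleted.
    for row_id in list(derived):
        if row_id not in source:
            del derived[row_id]

    # Only rows inserted or updated after the last sync need recomputing.
    # Rows that were merely copied to another file keep their old commit
    # version and are skipped.
    for row_id, (commit_version, value) in source.items():
        if commit_version > last_synced_version:
            derived[row_id] = value.upper()  # placeholder transformation
    return derived


source = {0: (0, "x"), 1: (3, "y2"), 3: (4, "z")}
derived = {0: "X", 1: "Y", 2: "W"}  # synced at source version 2
refreshed = refresh_derived(source, derived, last_synced_version=2)
```

Here row 0 is left alone, row 1 is recomputed, row 2 is dropped, and row 3 is added.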

Further details

See the design doc for further details.

Willingness to contribute

The Delta Lake Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature?

  • Yes. I can contribute this feature independently.
  • Yes. I would be willing to contribute this feature with guidance from the Delta Lake community.
  • No. I cannot contribute this feature at this time.
tomvanbussel added the enhancement label Apr 24, 2023
@felipepessoto
Contributor

Could you provide more information about the derived tables? I didn't get how it would work.

And would it require materialized Row IDs?

allisonport-db pushed a commit that referenced this issue May 24, 2023
This PR adds the protocol specification changes for the Row Commit Versions that are proposed in #1715.

In particular it makes the following changes:

- Renames the rowIds feature to rowTracking.
- Renames the delta.enableRowIds property to delta.enableRowTracking.
- Renames and moves the preservedRowIds flag in rowIdHighWaterMark to delta.rowTracking.preserved in the tags of commitInfo.
- Refactors the specification of Row IDs.
- Adds the specification for Row Commit Versions.

Closes #1747

GitOrigin-RevId: ac774c4b92c53643d9f4f5b174270a94ab71e1e1