Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Iceberg materialized view refresh atomic #14145

Merged
merged 1 commit into from
Sep 16, 2022

Conversation

alexjo2144
Copy link
Member

Description

Include old file deletion and new file appends in the same transaction.

Non-technical explanation

Avoids a short time period where the MV is in an invalid state.

Release notes

( ) This is not user-visible and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Iceberg
* Improve the commit protocol used when refreshing a materialized view.

@cla-bot cla-bot bot added the cla-signed label Sep 15, 2022
IcebergWritableTableHandle table = (IcebergWritableTableHandle) insertHandle;

Table icebergTable = transaction.table();
// delete before insert .. simulating overwrite
transaction.newDelete()
.deleteFromRowFilter(Expressions.alwaysTrue())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@raunaqmorarka I wasn't totally sure about this alwaysTrue part. Do we always rewrite the entire content of a MV?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think incremental refreshes are supported only if io.trino.spi.connector.ConnectorMetadata#delegateMaterializedViewRefreshToConnector returns true (it returns false for Iceberg), and connector knows how to refresh incrementally.

BTW @alexjo2144 will this unregister all old data files, or will it scan any data file contents?

Also, does this preserve table version history correctly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW @alexjo2144 will this unregister all old data files, or will it scan any data file contents?

This will unregister the data files from the manifest and they will not be scanned. It is the same code as is run by executeDelete just attached to the transaction instead of being run separately.

Also, does this preserve table version history correctly?

If we need history to be retained we should upgrade to 0.14.1 first, there's a bug fix in that we'd need. However, I was assuming history for an MV is not useful

Copy link
Member

@findepi findepi Sep 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need history to be retained we should upgrade to 0.14.1 first, there's a bug fix in that we'd need

yeah, just realized it's coming now: #14074 (comment)

However, I was assuming history for an MV is not useful

perhaps; i am fine merging as is, without 0.14.1

@findepi
Copy link
Member

findepi commented Sep 15, 2022

@alexjo2144 mind CI

Include old file deletion and new file appends in the same transaction.
@findepi
Copy link
Member

findepi commented Sep 16, 2022

( just squashed )

@findepi findepi merged commit 1dc391a into trinodb:master Sep 16, 2022
@findepi findepi mentioned this pull request Sep 16, 2022
@github-actions github-actions bot added this to the 397 milestone Sep 16, 2022
@alexjo2144 alexjo2144 deleted the iceberg/mv-refresh branch September 19, 2022 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

2 participants