-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement incremental MV refresh for Iceberg connector #20813
Comments
Some of this might already be in place .. at least I remember that we had some of this but it was isolated to Iceberg tables only. It is still reflected in the doc which I am currently updating in #20665 with @martint and @findepi Also @anjalinorwood might know more. |
Yeah.. maybe you are right and the mention in the docs is just about detecting outdated data vs incrementally refreshing. The PR changes that whole section a bit since that is related to the grace period. Check out the paragraph "Updating the data in the materialized view..." in https://trino.io/docs/current/connector/iceberg.html#materialized-views and let me know if that makes sense. |
Yes, looking at the docs and the source code, they both suggest that there is no incremental refresh currently. |
Forgive me if i close this issue as a duplicate of a pre-existing #18673 |
No worries @findepi, makes sense, I missed the earlier issue. |
In Trino Iceberg, currently we do a full refresh of MVs in all cases, meaning we drop all the existing data from the MV storage table and then insert all the data again (essentially a DROP + CTAS operation combined).
We could try to be smarter during refresh. While incrementally refreshing a MV can be tricky for certain operation types such as join, aggregations or window functions, there are some easy wins such as single-table, predicate-only queries. More complicated cases can be tackled in subsequent implementation phases.
For the first phase, I am preparing a patch to make use of the
IncrementalAppendScan
class from iceberg to only process the delta changes during refresh, for the single-table, predicate-only scenario.The text was updated successfully, but these errors were encountered: