Reading as of Snapshot ID fails on Metadata Tables after Iceberg Table Schema Update #6978
Comments
Interesting, this sounds like a serious bug. But I tried to reproduce this on latest master, and the following test passed for me (added to TestMetadataTables):

Not sure what I'm missing here.
@szehon-ho This issue only happens if the schema of the table is updated (not the metadata table, but its corresponding Iceberg table). (SimpleExtraColumnRecord is a SimpleRecord with just one extra string column.)
This test fails on this call:
Adding these conditions to these two locations in SparkTable.java fixes the issue:
Got it, I missed that part; I need some time to look at it. Feel free to open a PR and add the test case repro.
Thank you @szehon-ho! Happy to open separate PRs for Spark 3.2 and Spark 3.1 (not sure whether putting them all in the same PR or in separate ones is better for cherry-picking into a release).
I had a look at this and I think I understand what's going on here. The error makes sense to me, since the schema of the files metadata table didn't evolve along with the base table's schema.
Hi @nastra, really appreciate you taking the time to look at this PR. I was a bit confused at first, and it took some time to identify the issue and relate it to schema evolution, so here's an attempt at clearing up any confusion one might have when looking at this issue.

Firstly, my understanding is that the concept of snapshot_id is consistent between the actual Iceberg table and the corresponding state of its metadata tables at a certain point in time. Time travel on metadata tables based on the snapshot_id is an advertised feature of Iceberg in its docs. Your observation that the schema of the filesTable didn't evolve, and that that's the cause of the issue, is absolutely correct. But I want to draw a distinction between the schema_id of the actual Iceberg table, that of the metadata table, and their consistent snapshot_ids. Here's a table describing the timeline of events in my test:
The schema ID of the actual table does not change until there is schema evolution. This is consistent with @szehon-ho's observation that there is no issue running time travel queries on the metadata table when there is no schema evolution, even when you add an extra row and increment the snapshot ID. However, when the schema evolves in the actual table, because #1508 makes a strong assumption that we only look at the schema ID of the actual table, we use that schema ID to read the files metadata table, instead of using the schema ID of the files table at that snapshot. Interestingly, we are still able to query the files table for the first two snapshots in this series of events (6234 and 9023) even after the table has evolved, because the schema IDs within those snapshots are consistent across the actual table and its metadata tables.
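The failure mode described above can be sketched with a small, self-contained model (plain Java, not Iceberg code). Snapshot IDs 6234 and 9023 come from the discussion; snapshot 1111, the schema IDs, and the column names are hypothetical placeholders:

```java
import java.util.List;
import java.util.Map;

public class MetadataTimeTravelBug {
    public static void main(String[] args) {
        // Base table: snapshot id -> schema id that was current when the snapshot was written.
        // Snapshots 6234 and 9023 predate the schema evolution; 1111 follows it.
        Map<Long, Integer> snapshotToSchemaId = Map.of(6234L, 0, 9023L, 0, 1111L, 1);

        // The files metadata table knows only its own fixed schema, registered at id 0.
        Map<Integer, List<String>> filesTableSchemas =
            Map.of(0, List.of("content", "file_path", "record_count"));

        // Buggy time travel: resolve the BASE table's snapshot schema id, then
        // look it up among the METADATA table's schemas.
        for (long snapshotId : new long[] {6234L, 9023L, 1111L}) {
            Integer schemaId = snapshotToSchemaId.get(snapshotId);
            List<String> schema = filesTableSchemas.get(schemaId);
            // Pre-evolution snapshots resolve schema id 0, which happens to exist
            // in the files table's map; the post-evolution snapshot resolves
            // schema id 1, which does not exist there, so the read fails.
            System.out.println(snapshotId + " schemaId=" + schemaId
                + (schema == null ? " -> FAILS (no such schema in files table)" : " -> ok"));
        }
    }
}
```

This also illustrates why the first two snapshots keep working after evolution: their schema IDs coincide with an ID the metadata table happens to have.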
I have a couple of thoughts on what the "correct" fix would be for this issue. As @szehon-ho mentioned on PR #6980, it would absolutely be nice to support looking up metadata tables using their corresponding snapshot's schema as well.

For additional context, the ability to read data using the snapshot's schema was introduced in the 0.13.0 release, which allowed users to view the actual table using the schema it had at that point in time; this wasn't supported for actual tables or metadata tables before. Also, a metadata table's schema can only change if a user operates on a specific Iceberg table across multiple versions of the Iceberg jar. That can happen, but only with a clear intention to upgrade the infrastructure stack in a data pipeline, rather than a running pipeline passively invoking schema evolution on the table.

Since this is a feature regression, I think it would be very important to put in a fix that at least makes the metadata table readable upon schema evolution of the actual table, for example by checking whether the table is a BaseMetadataTable before making the snapshotSchema call.
Yeah, I agree with @syun64: time travel for metadata tables is quite useful. For example, today we can query the files table as of some timestamp and it will return the files at that timestamp; the same goes for the other metadata tables. The problem comes from #1508, which for all time travel attempts to apply the schema at that time. While it's a good idea, it's just not supported yet for metadata tables, as they don't have a concept of historical schemas, though I feel they could.
Apache Iceberg version
1.1.0 (latest release)
Query engine
Spark
Please describe the bug 🐞
Time travel / reading as of a certain snapshot ID fails on metadata tables if there was ever a schema evolution in the Iceberg table. This seems like it could be an unwanted side effect of this PR, which allows us to use the snapshot schema when reading a snapshot: #3722
Since schema evolution is not supported on metadata tables, we could patch this bug with a condition that checks whether the Iceberg table is an instance of BaseMetadataTable before making the snapshotSchema call.
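A self-contained sketch of that guard (plain Java, not the actual Iceberg code; the real change would live in SparkTable.java, and the interfaces below are hypothetical stand-ins for Iceberg's Table and BaseMetadataTable):

```java
import java.util.List;
import java.util.Map;

public class SnapshotSchemaGuard {
    interface Table { List<String> schema(); }
    // Marker interface standing in for Iceberg's BaseMetadataTable.
    interface BaseMetadataTable extends Table {}

    // Guarded lookup: only regular tables resolve a historical snapshot schema;
    // metadata tables always use their single, current schema.
    static List<String> schemaFor(Table table,
                                  Map<Long, List<String>> historicalSchemas,
                                  long snapshotId) {
        if (table instanceof BaseMetadataTable) {
            return table.schema();
        }
        return historicalSchemas.get(snapshotId);
    }

    public static void main(String[] args) {
        List<String> filesSchema = List.of("content", "file_path", "record_count");
        BaseMetadataTable filesTable = () -> filesSchema;
        // Even for an old snapshot id with no historical schema recorded, the
        // metadata table's own schema is used, so the read no longer fails
        // after the base table's schema evolves.
        System.out.println(schemaFor(filesTable, Map.of(), 6234L));
    }
}
```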
Example query:

```java
spark.read.format("iceberg").option("snapshot-id", 10963874102873L).load("db.table.files")
```
Example Error after Schema evolution: