After removing a partition field from an Iceberg table (using Iceberg/Spark's table evolution API), Trino is no longer able to read the table. Minimal example below:
SparkSQL to initialize the table:
-- mycatalog is configured to point to a Hive Metastore Service, using a common name across Spark & Trino
spark-sql> CREATE SCHEMA mycatalog.partition_evolution;
spark-sql> CREATE TABLE mycatalog.partition_evolution.example (category STRING, n INT) USING ICEBERG PARTITIONED BY (category);
spark-sql> SELECT * FROM mycatalog.partition_evolution.example;
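The report notes that the Trino query works as expected at this point, and that adding a new partition field continues to work across both engines. A sketch of those steps (assuming a Trino catalog registered under the same `mycatalog` name; the `trino>` prompt lines are illustrative):

```sql
-- Trino reads the freshly created table fine
trino> SELECT * FROM mycatalog.partition_evolution.example;

-- Evolving the partition spec by *adding* a field works in both engines
spark-sql> ALTER TABLE mycatalog.partition_evolution.example ADD PARTITION FIELD n;
spark-sql> SELECT * FROM mycatalog.partition_evolution.example;
trino> SELECT * FROM mycatalog.partition_evolution.example;
```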
However, removing a partition field (either one) causes the query to fail for Trino:
spark-sql> ALTER TABLE mycatalog.partition_evolution.example DROP PARTITION FIELD n;
-- spark still reads this fine
spark-sql> SELECT * FROM mycatalog.partition_evolution.example;
From what I can tell, this is because when Iceberg drops a partition field it replaces the field with a VoidTransform (the justification for this is provided here). When Trino attempts to determine the correct transform in this switch statement, there is no case matching "void".
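For illustration, after the `DROP PARTITION FIELD` the dropped field remains in the table metadata's partition spec with a `"void"` transform rather than being removed outright. A rough sketch of what the spec JSON looks like, based on the Iceberg table format (spec and field IDs here are hypothetical):

```json
{
  "spec-id": 1,
  "fields": [
    {"name": "category", "transform": "identity", "source-id": 1, "field-id": 1000},
    {"name": "n",        "transform": "void",     "source-id": 2, "field-id": 1001}
  ]
}
```

It is this `"void"` transform string that Trino's switch statement does not recognize.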
To summarize: before the partition field is dropped, the Trino query works as expected, and adding a new partition field continues to work across both engines. After the drop, however, the table becomes unreadable within Trino, with seemingly no way to recover it to a readable state.