Core: Fix drop partition field and schema field error #11387
Closed
Fixes #11314, #10234, and #10487.
In the previous code, every spec reads the latest schema. Once a field is deleted from the schema, a historical spec that references it can no longer find that field in the current schema, and an error occurs.
If a user accidentally triggers this error, the entire table becomes unreadable and there is no effective way to roll back.
This patch persists the schema ID for each spec, so a spec can look up its corresponding schema by ID and always find its source fields. When the schema is updated, the current spec applies the latest schema, while the schemas of the other specs remain unchanged.
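To illustrate the idea, here is a minimal self-contained sketch. It does not use Iceberg's classes; `SpecBindingSketch`, `Spec`, `SCHEMAS`, and `resolveSourceColumn` are hypothetical names invented for this example. It models a historical spec whose source column was later dropped: resolving against the schema ID stored with the spec succeeds, while resolving against the latest schema fails.

```java
import java.util.HashMap;
import java.util.Map;

public class SpecBindingSketch {
    /** Hypothetical stand-in for a partition spec that remembers the schema it was created against. */
    public static final class Spec {
        public final int specId;
        public final int schemaId;
        public final int sourceFieldId;

        public Spec(int specId, int schemaId, int sourceFieldId) {
            this.specId = specId;
            this.schemaId = schemaId;
            this.sourceFieldId = sourceFieldId;
        }
    }

    public static final int LATEST_SCHEMA_ID = 1;

    // Toy table metadata (an assumption for this sketch): schemaId -> (fieldId -> column name).
    static final Map<Integer, Map<Integer, String>> SCHEMAS = new HashMap<>();

    static {
        Map<Integer, String> original = new HashMap<>();
        original.put(1, "id");
        original.put(2, "ts");
        SCHEMAS.put(0, original);

        Map<Integer, String> latest = new HashMap<>();
        latest.put(1, "id"); // column "ts" (field id 2) has been dropped
        SCHEMAS.put(LATEST_SCHEMA_ID, latest);
    }

    /** Resolves the spec's source column in the given schema, or throws if the field is missing. */
    public static String resolveSourceColumn(Spec spec, int schemaId) {
        Map<Integer, String> schema = SCHEMAS.get(schemaId);
        String column = schema == null ? null : schema.get(spec.sourceFieldId);
        if (column == null) {
            throw new IllegalStateException(
                "Cannot find source column for field id " + spec.sourceFieldId
                    + " in schema " + schemaId);
        }
        return column;
    }

    public static void main(String[] args) {
        Spec historical = new Spec(0, 0, 2); // partitioned on "ts", created against schema 0

        // Bound to its own schema id, the historical spec still resolves.
        System.out.println(resolveSourceColumn(historical, historical.schemaId));

        // Reading the latest schema instead reproduces the failure described above.
        try {
            resolveSourceColumn(historical, LATEST_SCHEMA_ID);
        } catch (IllegalStateException e) {
            System.out.println("latest schema: " + e.getMessage());
        }
    }
}
```

The design choice shown here is only that the lookup key travels with the spec; how the patch actually stores and serializes the schema ID is not covered by this sketch.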
However, when a partition field is deleted in a V1 table, it is not removed from the spec. Instead, its transform is replaced with void. The latest spec is therefore still incompatible with a latest schema that has dropped the corresponding column.
So I prefer to forbid V1 tables from dropping columns that have been used as partition fields. For this purpose, I updated the PartitionSpec#checkCompatibility method so that it also checks the compatibility of fields with a void transform.
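The intent of that check can be sketched as follows. This is not the real `PartitionSpec#checkCompatibility` code; `VoidTransformCheckSketch`, `PartitionField`, `v1SpecAfterDrop`, `incompatibleFields`, and the `checkVoid` flag are hypothetical names used only for illustration. With `checkVoid` set to false, a void-transform field whose source column is gone is ignored; with it set to true, the field is reported, which is what would let a V1 column drop be rejected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class VoidTransformCheckSketch {
    /** Hypothetical, simplified partition field: name, source column id, transform name. */
    public static final class PartitionField {
        public final String name;
        public final int sourceFieldId;
        public final String transform;

        public PartitionField(String name, int sourceFieldId, String transform) {
            this.name = name;
            this.sourceFieldId = sourceFieldId;
            this.transform = transform;
        }
    }

    /** A toy V1 spec after "ts_day" was removed: the field stays, with a void transform. */
    public static List<PartitionField> v1SpecAfterDrop() {
        return Arrays.asList(
            new PartitionField("id_bucket", 1, "bucket[16]"),
            new PartitionField("ts_day", 2, "void"));
    }

    /**
     * Returns the names of partition fields whose source column id is not in the schema.
     * When checkVoid is false, fields with a void transform are skipped.
     */
    public static List<String> incompatibleFields(
            List<PartitionField> spec, boolean checkVoid, int... schemaFieldIds) {
        List<String> missing = new ArrayList<>();
        for (PartitionField field : spec) {
            if (!checkVoid && "void".equals(field.transform)) {
                continue; // lenient mode: void fields are never checked
            }
            boolean found = false;
            for (int id : schemaFieldIds) {
                if (id == field.sourceFieldId) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                missing.add(field.name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // The schema keeps only field id 1; the column with id 2 has been dropped.
        System.out.println(incompatibleFields(v1SpecAfterDrop(), false, 1));
        System.out.println(incompatibleFields(v1SpecAfterDrop(), true, 1));
    }
}
```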
For V2 tables, it is safe to delete the corresponding column after deleting the partition field.