Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disallow dropping partition column in Iceberg table explicitly #21294

Merged

Conversation

hantangwangd
Copy link
Member

@hantangwangd hantangwangd commented Nov 1, 2023

Description

When we try to drop a partition column in Iceberg table, we should firstly remove it from the partition spec, then drop the column in table schema. But currently Iceberg have some problems in this scenario, see issue apache/iceberg#4563. This PR explicitly disallow drop partition column until Iceberg fixed it.

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== NO RELEASE NOTE ==

@hantangwangd hantangwangd requested a review from a team as a code owner November 1, 2023 16:04
@tdcmeehan tdcmeehan self-assigned this Nov 1, 2023
@tdcmeehan
Copy link
Contributor

Isn't it true that this issue only affects prior (historical) specs, and not the latest? So for example, it doesn't affect brand new tables, and it doesn't affect newly added columns?

@hantangwangd
Copy link
Member Author

hantangwangd commented Nov 2, 2023

Isn't it true that this issue only affects prior (historical) specs, and not the latest? So for example, it doesn't affect brand new tables, and it doesn't affect newly added columns?

That's right, because dropping columns used in current partition spec would explicitly fail in Iceberg implementation.

So when we try to drop a partition column, we must firstly drop it from current partition spec. That means, the column exists in a historical spec now. Then we will meet various problems subsequently after dropping it from schema. In my experiment, simultaneously drop the column from partition spec and schema in a single transaction would meet problems in the issue too.

Furthermore, this PR can also prevent Iceberg table falling into this situation in case that, for example, if we support dropping column from partition spec only.

Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. LGTM!

Copy link
Contributor

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

.flatMap(partitionSpec -> partitionSpec.fields().stream())
.anyMatch(field -> field.sourceId() == handle.getId());
if (shouldNotDropPartitionColumn) {
throw new PrestoException(NOT_SUPPORTED, "This connector does not support drop partition columns");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Personally, I would make the error message more specific

Suggested change
throw new PrestoException(NOT_SUPPORTED, "This connector does not support drop partition columns");
throw new PrestoException(NOT_SUPPORTED, "This connector does not support dropping columns which exist in any of the table's partition specs");

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

@hantangwangd hantangwangd force-pushed the iceberg_partition_evolution branch from 69b37e3 to 035f136 Compare November 3, 2023 23:45
@hantangwangd hantangwangd merged commit d1183bc into prestodb:master Nov 4, 2023
51 checks passed
@hantangwangd hantangwangd deleted the iceberg_partition_evolution branch November 4, 2023 00:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants