[HUDI-3135] Make delete partitions lazy to be executed by the cleaner #4489

Merged
merged 5 commits into from
Mar 31, 2022

Conversation

XuQianJin-Stars
Contributor

@XuQianJin-Stars XuQianJin-Stars commented Jan 1, 2022

What is the purpose of the pull request

Currently, delete partitions ensures that all file groups are deleted, but the partition itself is not deleted. As a result, get all partitions may still return the deleted partitions, although no data is served because all of their file groups are deleted. This patch fixes that by letting the cleaner delete a partition once all file groups pertaining to it are deleted.

Brief change log

  • Fixed the CleanPlanActionExecutor to return meta info about the list of partitions to be deleted. If a partition has no valid file groups, the clean planner includes it in the partitions to be deleted.
  • Fixed the HoodieCleanPlan avro schema to include the list of partitions to be deleted.
  • Fixed the CleanActionExecutor to delete partitions, if any, as per the clean plan.
  • Added the same info to HoodieCleanMetadata.
  • When applying clean metadata, the metadata table checks for partitions to be deleted and updates the "all_partitions" record for the deleted partitions.
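
The planning rule in the change log can be illustrated with a minimal sketch. This is not the actual CleanPlanner code: `SketchCleanPlan`, `SketchCleanPlanner`, and the map-based inputs are hypothetical stand-ins, and the only behavior taken from the description is that a partition with no valid file groups is added to the list of partitions to be deleted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a clean plan; this is NOT Hudi's HoodieCleanPlan.
final class SketchCleanPlan {
    final Map<String, List<String>> filesToDeleteByPartition = new LinkedHashMap<>();
    final List<String> partitionsToBeDeleted = new ArrayList<>();
}

// Hypothetical planner illustrating only the "no valid file groups" rule.
final class SketchCleanPlanner {
    /**
     * @param validFileGroups partition path -> file groups that remain valid
     * @param replacedFiles   partition path -> files eligible for cleaning
     */
    static SketchCleanPlan plan(Map<String, List<String>> validFileGroups,
                                Map<String, List<String>> replacedFiles) {
        SketchCleanPlan plan = new SketchCleanPlan();
        for (Map.Entry<String, List<String>> entry : replacedFiles.entrySet()) {
            String partition = entry.getKey();
            plan.filesToDeleteByPartition.put(partition, new ArrayList<>(entry.getValue()));
            List<String> remaining =
                validFileGroups.getOrDefault(partition, Collections.emptyList());
            // If there are no valid file groups left, mark the partition for deletion.
            if (remaining.isEmpty()) {
                plan.partitionsToBeDeleted.add(partition);
            }
        }
        return plan;
    }
}
```

In this sketch the plan carries both the files to clean and the partitions to drop, so the executor does not need to re-derive which partitions became empty.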

Verify this pull request

This change added tests and can be verified as follows:

  • Added tests to TestHoodieBackedMetadata
  • Added tests to TestAlterTableDropPartition

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@XuQianJin-Stars XuQianJin-Stars force-pushed the HUDI-3135 branch 2 times, most recently from aa50fdf to 9bfe088 Compare January 2, 2022 04:11
@xushiyan xushiyan requested a review from nsivabalan January 2, 2022 04:30
Member

@xushiyan xushiyan left a comment

@XuQianJin-Stars thanks for the patch. Question: without fully understanding the bug, from the look of it, this is a fix on metadata table, but will the problem persist if metadata is disabled? cc @nsivabalan please have a look when you got a chance.

@XuQianJin-Stars
Contributor Author

XuQianJin-Stars commented Jan 2, 2022

@XuQianJin-Stars thanks for the patch. Question: without fully understanding the bug, from the look of it, this is a fix on metadata table, but will the problem persist if metadata is disabled? cc @nsivabalan please have a look when you got a chance.

  1. add two partitions dt='2021-10-01', dt='2021-10-02'
  2. drop one partition dt='2021-10-01'
  3. show partitions command
    The query result: dt='2021-10-01', dt='2021-10-02'
    The expected result is: dt='2021-10-02'

@nsivabalan nsivabalan added the priority:critical production down; pipelines stalled; Need help asap. label Jan 4, 2022
Contributor

@nsivabalan nsivabalan left a comment

LGTM. Can you add a UT to TestHoodieBackedMetadata to test DeletePartitions?

@nsivabalan nsivabalan changed the title [HUDI-3135] Fix Show Partitions Command's Result after drop partition [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql Jan 4, 2022
@nsivabalan nsivabalan self-assigned this Jan 6, 2022
@nsivabalan
Contributor

@XuQianJin-Stars : may I know when you plan to address the feedback. We are looking to get this in for 0.10.1. would appreciate if you can find time to address them sooner. Let me know.

@XuQianJin-Stars
Contributor Author

@XuQianJin-Stars : may I know when you plan to address the feedback. We are looking to get this in for 0.10.1. would appreciate if you can find time to address them sooner. Let me know.

well, sorry for late reply. I will update as soon as possible

@nsivabalan
Contributor

thanks! no probs.

@XuQianJin-Stars
Contributor Author

@hudi-bot run azure

Member

@xushiyan xushiyan left a comment

Overall strategy looks good. Had a question about roll back.

Comment on lines 89 to 90
if (table.getMetaClient().getFs().exists(fullPartitionPath)) {
table.getMetaClient().getFs().delete(fullPartitionPath, true);
Member

tricky thing is: can we even roll back purge delete partitions? shall we make an exception for making delete partition irreversible?

Contributor Author

@XuQianJin-Stars XuQianJin-Stars Jan 8, 2022

tricky thing is: can we even roll back purge delete partitions? shall we make an exception for making delete partition irreversible?

Is this purge operation really irreversible, or should we move the soft-deleted data to a specific location for easy rollback? I would like to deal with this thoroughly when I implement hidden partitions as the next step.

Contributor

Yeah, I was thinking about the same thing. From the current code, it looks like it will be irreversible. Probably we should let the cleaner clean up deleted partitions when it comes around to cleaning up replaced files, and delete partitions once all of their file groups have been cleaned up.
Or we can delete partition paths during clean action execution, just before transitioning clean.inflight to clean.completed.

table.getActiveTimeline().transitionCleanInflightToComplete(inflightInstant,

Maybe I will sync up with Raymond on this and see how we can go about it.
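
The second option above (delete partition paths during clean execution, before the clean is marked completed) can be sketched as follows. This is a toy model under stated assumptions, not Hudi's CleanActionExecutor: `SketchCleanExecution`, its `State` enum, the in-memory `storage` set, and the `failAfter` crash switch are all hypothetical. It only illustrates why the ordering helps: if execution fails partway, the clean stays inflight and a retry can safely repeat the deletes.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model of a clean execution; none of these names are Hudi APIs.
final class SketchCleanExecution {
    enum State { INFLIGHT, COMPLETED }

    // In-memory stand-in for storage: each entry is "<partition>/<file>".
    final Set<String> storage = new TreeSet<>();
    State state = State.INFLIGHT;

    /**
     * Deletes every partition in the plan, then marks the clean completed.
     *
     * @param failAfter number of partitions to delete before simulating a
     *                  crash; pass -1 to run without a failure
     */
    void run(List<String> partitionsToBeDeleted, int failAfter) {
        int done = 0;
        for (String partition : partitionsToBeDeleted) {
            if (done == failAfter) {
                throw new IllegalStateException("simulated crash; clean stays inflight");
            }
            // Idempotent: deleting an already-deleted partition is a no-op.
            storage.removeIf(path -> path.startsWith(partition + "/"));
            done++;
        }
        // Only reached when every partition in the plan has been deleted.
        state = State.COMPLETED;
    }
}
```

Because the state only flips to completed after all deletes succeed, a completed clean implies the planned partitions are gone, and an inflight clean can simply be re-run.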

@nsivabalan
Contributor

I would really love to get this in for 0.10.1, but let's be cautious with such irreversible operations, and we don't have a good precedent to follow either. Let's not rush. I will remove it from 0.10.1, but let's try to brainstorm and get closure on this.

@nsivabalan nsivabalan added priority:major degraded perf; unable to move forward; potential bugs and removed priority:critical production down; pipelines stalled; Need help asap. labels Jan 10, 2022
Contributor

@nsivabalan nsivabalan left a comment

Hi folks,
here is my take.
We definitely can't delete the partitions within SparkDeletePartitionCommitActionExecutor, because if the operation partially fails, we don't know whether the partition is deleted or not. Otherwise, we would need to call out explicitly that this operation may or may not succeed, and that users have to keep calling it until delete_partition succeeds.

Alternatively, we can mark all file groups as deleted
and let the cleaner take care of deleting the partitions for which all file groups are replaced/deleted.
We may have to evolve HoodieCleanMetadata for this, which should be fine.
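
A minimal sketch of the proposal above, assuming a toy in-memory table: `SketchTable` and its methods are hypothetical and are not Hudi APIs. The point it illustrates is the split between the logical delete (file groups marked as replaced, nothing removed from storage) and the later physical cleanup by the cleaner.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy in-memory table; none of these names are Hudi APIs.
final class SketchTable {
    // partition path -> file group ids physically present
    final Map<String, List<String>> fileGroups = new TreeMap<>();
    // file group ids logically deleted (replaced) but not yet cleaned
    final Set<String> replaced = new HashSet<>();

    void write(String partition, String fileGroupId) {
        fileGroups.computeIfAbsent(partition, p -> new ArrayList<>()).add(fileGroupId);
    }

    // delete_partition: a logical operation only; nothing is removed from storage.
    void deletePartition(String partition) {
        replaced.addAll(fileGroups.getOrDefault(partition, new ArrayList<>()));
    }

    // Readers skip replaced file groups, so a deleted partition serves no data.
    List<String> read(String partition) {
        List<String> visible = new ArrayList<>();
        for (String fileGroup : fileGroups.getOrDefault(partition, new ArrayList<>())) {
            if (!replaced.contains(fileGroup)) {
                visible.add(fileGroup);
            }
        }
        return visible;
    }

    // Partition listing reflects what is physically present.
    List<String> listPartitions() {
        return new ArrayList<>(fileGroups.keySet());
    }

    // Cleaner: physically drop replaced file groups, then partitions left empty.
    void clean() {
        for (List<String> groups : fileGroups.values()) {
            groups.removeIf(replaced::contains);
        }
        fileGroups.values().removeIf(List::isEmpty);
        replaced.clear();
    }
}
```

In this model, a dropped partition still shows up in `listPartitions()` until `clean()` runs, while `read()` returns nothing for it as soon as the delete succeeds.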

@zhangyue19921010
Contributor

zhangyue19921010 commented Jan 23, 2022

Hi @nsivabalan, I agree with your opinion that we should let the cleaner delete files and partitions. My one concern is how to handle a delete partition action when the async cleaner and the metadata table are both enabled, where the async cleaner starts and finishes before the replace commit completes:

  1. The async cleaner finishes deleting old replaced files.
  2. The current replace commit completes.
  3. The metadata table is synced ==> partitions are deleted in the metadata table.

We may get different results between getAllPartitions from the metadata table and the number of physical partitions.
We could also enforce a strict limit that delete partition only works with the sync cleaner, but I am not sure whether that is possible.
Or perhaps this kind of out-of-sync state causes no damage.

Updated: we could let the cleaner sync the metadata table and delete the partitions in it. That could solve the consistency issue.

@nsivabalan
Contributor

@zhangyue19921010 : you are bringing up a good point. If I am not wrong, you are talking about a scenario where someone triggered delete_partition for partition X and later added new data to the same partition, is that right? If so, I see there could be some inconsistencies, especially with the async cleaner.

t10: delete partition is triggered for partition X. The async cleaner is enabled, so the actual partition deletion is not yet synced to the metadata table.
t15: new data is added to partition X, which gets synced to the metadata table right away.
t20: the async cleaner completes and deletes partition X, which gets synced to the metadata table and deletes partition X there.

Or were you talking about some other scenario?
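
The t10/t15/t20 timeline above can be replayed in a toy model to make the inconsistency concrete. Everything here is hypothetical (`SketchRaceTimeline` and its fields are not Hudi code), and the sketch assumes that the clean applied at t20 was planned against the state at t10, before the new write landed.

```java
import java.util.Set;
import java.util.TreeSet;

// Toy model of the t10/t15/t20 timeline; not Hudi code.
final class SketchRaceTimeline {
    // Stand-in for the metadata table's "all_partitions" view.
    final Set<String> metadataPartitions = new TreeSet<>();
    // Partitions in which readers can currently see data.
    final Set<String> partitionsWithLiveData = new TreeSet<>();

    // t10: delete_partition replaces all file groups; the listing is untouched for now.
    void deletePartition(String partition) {
        partitionsWithLiveData.remove(partition);
    }

    // t15: a write adds data and is synced to the metadata table right away.
    void write(String partition) {
        partitionsWithLiveData.add(partition);
        metadataPartitions.add(partition);
    }

    // t20: the async clean (assumed to be planned at t10) completes and its
    // metadata is applied, dropping the partition from the listing.
    void applyStaleClean(String partition) {
        metadataPartitions.remove(partition);
    }

    // Consistent when every partition with live data is also listed.
    boolean isConsistent() {
        return metadataPartitions.containsAll(partitionsWithLiveData);
    }
}
```

Replaying write, deletePartition, write, applyStaleClean for the same partition leaves it with live data but absent from the listing, which is the kind of mismatch discussed in this thread.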

@nsivabalan
Contributor

@XuQianJin-Stars : we are looking to get this in for 0.11. Let's see if we can address the feedback this week so that we can land it next week after review.

@nsivabalan nsivabalan added the priority:critical production down; pipelines stalled; Need help asap. label Feb 8, 2022
@XuQianJin-Stars
Contributor Author

@XuQianJin-Stars : we are looking to get this in for 0.11. Lets see if we can address feedback by this week so that we can land it by next week after review.

Well, I'll try to get this done this week.

@XuQianJin-Stars XuQianJin-Stars changed the title [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql [WIP][HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql Mar 26, 2022
@XuQianJin-Stars XuQianJin-Stars changed the title [WIP][HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql Mar 28, 2022
@nsivabalan nsivabalan changed the title [HUDI-3135] Fix Delete partitions with metadata table and fix show partitions in spark sql [HUDI-3135] Make delete partitions lazy to be executed by the cleaner Mar 30, 2022
@nsivabalan
Contributor

@XuQianJin-Stars @xushiyan : We need this patch for the async indexer feature in the metadata table, so I took over and pushed an update. I have updated the description to explain what I have done and cleaned up a few things that were not required. Please review it when you get a chance.

@XuQianJin-Stars
Contributor Author

@XuQianJin-Stars @xushiyan : We are in need of this patch for async indexer feature in metadata table. So, I took over and pushed an update. I have fixed the description to explain what I have done. Have cleaned up few things which was not required. Do review it when you get a chance.

hi @nsivabalan Thank you so much for the update, sorry for replying so late, interrupted by other things.

Member

@codope codope left a comment

LGTM. I've verified this patch together with #5169 and the metadata partition does get physically deleted. I'll address a couple of minor comments here in #5169.

@@ -350,8 +355,12 @@ public CleanPlanner(HoodieEngineContext context, HoodieTable<T, I, K, O> hoodieT
}
}
}
// if there are no valid file groups for the partition, mark it to be deleted
if (fileGroups.isEmpty()) {
Member

+1

@codope codope self-assigned this Mar 31, 2022
@nsivabalan nsivabalan merged commit 80011df into apache:master Mar 31, 2022
vingov pushed a commit to vingov/hudi that referenced this pull request Apr 3, 2022
…apache#4489)

As of now, delete partitions will ensure all file groups are deleted, but the partition as such is not deleted. So, get all partitions might be returning the deleted partitions as well. but no data will be served since all file groups are deleted. With this patch, we are fixing it. We are letting cleaner take care of deleting the partitions when all file groups pertaining to a partitions are deleted.

- Fixed the CleanPlanActionExecutor to return meta info about list of partitions to be deleted. If there are no valid file groups for a partition, clean planner will include the partition to be deleted.
- Fixed HoodieCleanPlan avro schema to include the list of partitions to be deleted
- CleanActionExecutor is fixed to delete partitions if any (as per clean plan)
- Same info is added to HoodieCleanMetadata
- Metadata table when applying clean metadata, will check for partitions to be deleted and will update the "all_partitions" record for the deleted partitions.

Co-authored-by: sivabalan <n.siva.b@gmail.com>
@Zouxxyy
Contributor

Zouxxyy commented Oct 21, 2022

@XuQianJin-Stars if the Hudi metadata table is disabled, the following bug still exists:

add two partitions dt='2021-10-01', dt='2021-10-02'
drop one partition dt='2021-10-01'
show partitions command
The query result: dt='2021-10-01', dt='2021-10-02'
The expected result is: dt='2021-10-02'

@Zouxxyy
Contributor

Zouxxyy commented Oct 21, 2022

@XuQianJin-Stars

Just set hoodie.metadata.enable = false and run TestAlterTableDropPartition; it will fail.

@nsivabalan
Contributor

@Zouxxyy : yes, physical deletion of partitions is lazy. The partition is cleaned up only when the cleaner comes around next time.
But if you try to list data for the deleted partition (2021-10-01), you should not see any data once dropping partition dt='2021-10-01' succeeds.
