-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[HUDI-3135] Make delete partitions lazy to be executed by the cleaner #4489
Conversation
aa50fdf
to
9bfe088
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@XuQianJin-Stars thanks for the patch. Question: without fully understanding the bug, from the look of it, this is a fix on metadata table, but will the problem persist if metadata is disabled? cc @nsivabalan please have a look when you got a chance.
|
9bfe088
to
5feae09
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Can you add a UT to TestHoodieBackedMetadata to test DeletePartitions.
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
Outdated
Show resolved
Hide resolved
@XuQianJin-Stars : may I know when you plan to address the feedback. We are looking to get this in for 0.10.1. would appreciate if you can find time to address them sooner. Let me know. |
well, sorry for late reply. I will update as soon as possible |
thanks! no probs. |
@hudi-bot run azure |
edcd592
to
7ca735c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall strategy looks good. Had a question about roll back.
.../main/java/org/apache/hudi/table/action/commit/SparkDeletePartitionCommitActionExecutor.java
Outdated
Show resolved
Hide resolved
.../main/java/org/apache/hudi/table/action/commit/SparkDeletePartitionCommitActionExecutor.java
Outdated
Show resolved
Hide resolved
if (table.getMetaClient().getFs().exists(fullPartitionPath)) { | ||
table.getMetaClient().getFs().delete(fullPartitionPath, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tricky thing is: can we even roll back purge delete partitions? shall we make an exception for making delete partition irreversible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tricky thing is: can we even roll back purge delete partitions? shall we make an exception for making delete partition irreversible?
Is this purge
operation really irreversible, or do we need to put this soft delete in a specific location for easy rollback? And I want to thoroughly deal with this one when I want to implement the hidden partition in the next step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I was also thinking about the same. from the current code, looks like it will be irresversible. Probably we should let cleaner clean up deleted partitions when it comes around cleaning up replaced files and delete partitions if all file groups have been cleaned up.
Or we can delete partition paths during clean action execution, just before transitioning clean.inflight to clean.completed.
Line 212 in 98ec215
table.getActiveTimeline().transitionCleanInflightToComplete(inflightInstant, |
may be, I will sync up with Raymond on this and see how we can go about it.
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
Outdated
Show resolved
Hide resolved
fa5894f
to
8896e81
Compare
I would really love to get this in for 0.10.1, but let's be cautious in such irreversible operations and we don't have good precedence to follow as well. Lets not hurry up. I will remove it from 0.10.1, but lets try to brainstorm and get a closure for this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi folks.
here is my take.
We definitely can't delete the partitions within SparkDeletePartitionCommitActionExecutor. bcoz, if operation partially fails, we don't know if partition is deleted or not. or We need to call out explicitly that this operation may or may not succeed. and users have to keep calling until delete_partition succeeds.
If not, we can mark all file groups as deleted.
and let cleaner take care of deleting the partitions for which all file groups are replaced/deleted.
But we may have to evolve the HoodieCleanMetadata may be which should be fine.
if (table.getMetaClient().getFs().exists(fullPartitionPath)) { | ||
table.getMetaClient().getFs().delete(fullPartitionPath, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I was also thinking about the same. from the current code, looks like it will be irresversible. Probably we should let cleaner clean up deleted partitions when it comes around cleaning up replaced files and delete partitions if all file groups have been cleaned up.
Or we can delete partition paths during clean action execution, just before transitioning clean.inflight to clean.completed.
Line 212 in 98ec215
table.getActiveTimeline().transitionCleanInflightToComplete(inflightInstant, |
may be, I will sync up with Raymond on this and see how we can go about it.
Hi @nsivabalan I agree with your opinion that let cleaner delete files and partitions. Just a little concern that how can we deal with the scenario that we trigger a delete partition action with a async cleaner + enable metadata table. -> Async cleaner started and finished before replaced committed :
We maybe get different result between updated: we could let cleaner sync metadata table and delete partitions in it. It could solve the consistency issue. |
@zhangyue19921010 : you are bringing up a good point. if I am not wrong, you are talking about a scenario, where someone triggered delete_partition for partition X and later added new data to the same partition is it? if yes, I see there could be some inconsistencies especially w/ async cleaner. t10: trigger delete partition of partitionX. async cleaner is enabled and so, actual delete partition is not yet synced to metadata table. Or were you talking about some other scenario ? |
@XuQianJin-Stars : we are looking to get this in for 0.11. Lets see if we can address feedback by this week so that we can land it by next week after review. |
Well, I'll try to get this done this week. |
ad5ed95
to
d7168e4
Compare
d7168e4
to
2ede26d
Compare
@XuQianJin-Stars @xushiyan : We are in need of this patch for async indexer feature in metadata table. So, I took over and pushed an update. I have fixed the description to explain what I have done. Have cleaned up few things which was not required. Do review it when you get a chance. |
hi @nsivabalan Thank you so much for the update, sorry for replying so late, interrupted by other things. |
4100259
to
227baaa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -350,8 +355,12 @@ public CleanPlanner(HoodieEngineContext context, HoodieTable<T, I, K, O> hoodieT | |||
} | |||
} | |||
} | |||
// if there are no valid file groups for the partition, mark it to be deleted | |||
if (fileGroups.isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
…apache#4489) As of now, delete partitions will ensure all file groups are deleted, but the partition as such is not deleted. So, get all partitions might be returning the deleted partitions as well. but no data will be served since all file groups are deleted. With this patch, we are fixing it. We are letting cleaner take care of deleting the partitions when all file groups pertaining to a partitions are deleted. - Fixed the CleanPlanActionExecutor to return meta info about list of partitions to be deleted. If there are no valid file groups for a partition, clean planner will include the partition to be deleted. - Fixed HoodieCleanPlan avro schema to include the list of partitions to be deleted - CleanActionExecutor is fixed to delete partitions if any (as per clean plan) - Same info is added to HoodieCleanMetadata - Metadata table when applying clean metadata, will check for partitions to be deleted and will update the "all_partitions" record for the deleted partitions. Co-authored-by: sivabalan <n.siva.b@gmail.com>
@XuQianJin-Stars if disabled hudi metadata, the following bug still exists: add two partitions dt='2021-10-01', dt='2021-10-02' |
just |
@Zouxxyy : yes, physical deletion of partitions is lazy. only when cleaner comes around next time, it will clean it up. |
What is the purpose of the pull request
As of now, delete partitions will ensure all file groups are deleted, but the partition as such is not deleted. So, get all partitions might be returning the deleted partitions as well. but no data will be served since all file groups are deleted. With this patch, we are fixing it. We are letting cleaner take care of deleting the partitions when all file groups pertaining to a partitions are deleted.
Brief change log
Verify this pull request
This change added tests and can be verified as follows:
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.