[AUDIT] Evaluate if SPARK-46957's shuffle cleanup fix is needed in the plugin #11181

Open
mythrocks opened this issue Jul 12, 2024 · 1 comment

Comments

@mythrocks
Collaborator

This concerns the shuffle cleanup fix introduced in SPARK-46957. Per the description of that issue:

This is a long-standing bug in decommission where the migrated shuffle files can't be cleaned up from the executor. Normally, the shuffle files are tracked by taskIdMapsForShuffle during the map task execution. Upon receiving the RemoveShuffle(shuffleId) request from driver, executor can clean up those shuffle files by searching taskIdMapsForShuffle. However, for the migrated shuffle files by decommission, they lose the track in the destination executor's taskIdMapsForShuffle and can't be deleted as a result.

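This needs evaluation by someone familiar with Spark shuffle, to determine whether the plugin is affected in the same way. It appears to be a missed case in which shuffle files migrated during decommission are never cleaned up on the destination executor.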
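
To make the failure mode in the quoted description concrete, here is a minimal toy model in Python. It is not Spark or plugin code. `ToyExecutor`, `receive_migrated_block`, `track_migrated`, and `leaked_files` are hypothetical names invented for illustration; only the idea of a per-shuffle tracking map (`taskIdMapsForShuffle`) and a `RemoveShuffle(shuffleId)` request comes from the description. Registering migrated blocks in the tracking map is shown as one possible remedy, and whether it matches the actual SPARK-46957 patch is an assumption.

```python
from collections import defaultdict


class ToyExecutor:
    """Toy stand-in for an executor's shuffle bookkeeping (not Spark code)."""

    def __init__(self, track_migrated):
        # Hypothetical switch: register migrated blocks in the tracking map?
        self.track_migrated = track_migrated
        # (shuffle_id, map_id) pairs standing in for shuffle files on disk.
        self.files = set()
        # Analogue of taskIdMapsForShuffle: shuffle_id -> set of map ids.
        self.task_ids_for_shuffle = defaultdict(set)

    def write_map_output(self, shuffle_id, map_id):
        # Normal path: a map task records its output as it writes it.
        self.files.add((shuffle_id, map_id))
        self.task_ids_for_shuffle[shuffle_id].add(map_id)

    def receive_migrated_block(self, shuffle_id, map_id):
        # Decommission path: the file arrives from another executor.
        self.files.add((shuffle_id, map_id))
        if self.track_migrated:
            self.task_ids_for_shuffle[shuffle_id].add(map_id)

    def remove_shuffle(self, shuffle_id):
        # Cleanup only visits map ids recorded in the tracking map.
        for map_id in self.task_ids_for_shuffle.pop(shuffle_id, set()):
            self.files.discard((shuffle_id, map_id))


def leaked_files(track_migrated):
    """Return the files left on the destination executor after cleanup."""
    dest = ToyExecutor(track_migrated)
    dest.write_map_output(shuffle_id=0, map_id=1)
    dest.receive_migrated_block(shuffle_id=0, map_id=2)
    dest.remove_shuffle(0)
    return sorted(dest.files)
```

In this model, `leaked_files(False)` leaves the migrated file `(0, 2)` behind because cleanup never learns about it, while `leaked_files(True)` leaves nothing. The question for the plugin is whether its own shuffle handling has an equivalent untracked path.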

@mythrocks mythrocks added the audit General label for audit related tasks label Jul 12, 2024
@abellina abellina self-assigned this Jul 12, 2024
@abellina
Collaborator

I'll take a look @mythrocks
