Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] A RapidsBuffer should not be spillable while it is acquired #9120

Open
abellina opened this issue Aug 28, 2023 · 0 comments
Open

[FEA] A RapidsBuffer should not be spillable while it is acquired #9120

abellina opened this issue Aug 28, 2023 · 0 comments
Assignees
Labels
feature request New feature or request performance A performance related task/issue reliability Features to improve reliability or bugs that severly impact the reliability of the plugin

Comments

@abellina
Copy link
Collaborator

The spill framework tracks spillability of buffers and batches by way of ref counting of the underlying cuDF objects (ColumnVector, HostColumnVector, and MemoryBuffer). This tracking works when we "leak" something from the spill framework (the common pattern of acquiring a SpillableColumnarBatch and returning the ColumnarBatch by increasing the ref counts), but it opens us up for race conditions between acquisition and spill.

The issue was brought up by @jlowe in a recent PR: #9098 (comment). The concern here is we could return a ColumnarBatch (host backed in this scenario) but we could be racing with the spillability callbacks. This puts us in the position where the ColumnarBatch returned is actually also getting copied to a lower tier, but not removed from the current tier (that's good, otherwise we'd have some really bad errors). The bad part is we just spilled something we shouldn't have.

The idea behind this issue is to prevent spill at "acquisition" time. All methods around the RapidsBuffer require acquireBuffer to be called before hand by the user, so we can use this mechanism to take this acquired buffer out of the spill pool until it is unacquired. Once unacquired we can revert back to the cuDF ref count approach, in fact we should likely ask the RapidsBuffer to update its spillability because it could still be not-spillable.

@abellina abellina added feature request New feature or request ? - Needs Triage Need team to review and classify performance A performance related task/issue reliability Features to improve reliability or bugs that severly impact the reliability of the plugin labels Aug 28, 2023
@abellina abellina self-assigned this Aug 28, 2023
@mattahrens mattahrens removed the ? - Needs Triage Need team to review and classify label Aug 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request performance A performance related task/issue reliability Features to improve reliability or bugs that severly impact the reliability of the plugin
Projects
None yet
Development

No branches or pull requests

2 participants