[FEA] A RapidsBuffer should not be spillable while it is acquired #9120
Labels
feature request
New feature or request
performance
A performance related task/issue
reliability
Features to improve reliability or bugs that severely impact the reliability of the plugin
The spill framework tracks spillability of buffers and batches by way of ref counting of the underlying cuDF objects (`ColumnVector`, `HostColumnVector`, and `MemoryBuffer`). This tracking works when we "leak" something from the spill framework (the common pattern of acquiring a `SpillableColumnarBatch` and returning the `ColumnarBatch` by increasing the ref counts), but it opens us up to race conditions between acquisition and spill.

The issue was brought up by @jlowe in a recent PR: #9098 (comment). The concern here is that we could return a `ColumnarBatch` (host backed in this scenario) while racing with the spillability callbacks. This puts us in the position where the `ColumnarBatch` returned is also getting copied to a lower tier, but not removed from the current tier (that's good, otherwise we'd have some really bad errors). The bad part is we just spilled something we shouldn't have.

The idea behind this issue is to prevent spill at "acquisition" time. All methods around the `RapidsBuffer` require `acquireBuffer` to be called beforehand by the user, so we can use this mechanism to take the acquired buffer out of the spill pool until it is unacquired. Once it is unacquired we can revert to the cuDF ref count approach. In fact, we should likely ask the `RapidsBuffer` to update its spillability at that point, because it could still be not-spillable.
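The proposal can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not the plugin's code: `SketchBuffer`, `SketchSpillPool`, and all of their methods are hypothetical names, and the rule "spillable when the ref count is 1" is an assumption standing in for the cuDF ref count check. The point is only that acquiring removes the buffer from the spillable set under the same lock the spill path uses, and that releasing re-evaluates spillability instead of assuming it.

```python
import threading


class SketchBuffer:
    """Hypothetical stand-in for a RapidsBuffer (not the plugin's class)."""

    def __init__(self, buffer_id, ref_count=1):
        self.buffer_id = buffer_id
        self.ref_count = ref_count  # stands in for the cuDF ref count
        self.acquired = 0           # number of outstanding acquisitions

    def ref_count_spillable(self):
        # Assumption: spillable only when the framework holds the sole reference.
        return self.ref_count == 1


class SketchSpillPool:
    """Hypothetical spill pool that hides acquired buffers from spill."""

    def __init__(self):
        self._lock = threading.Lock()
        self._buffers = {}
        self._spillable = set()

    def _update(self, buf):
        # Caller must hold self._lock.
        if buf.acquired == 0 and buf.ref_count_spillable():
            self._spillable.add(buf.buffer_id)
        else:
            self._spillable.discard(buf.buffer_id)

    def add(self, buf):
        with self._lock:
            self._buffers[buf.buffer_id] = buf
            self._update(buf)

    def acquire(self, buffer_id):
        # Taking the buffer out of the spillable set happens under the same
        # lock as candidate selection, so spill cannot pick it concurrently.
        with self._lock:
            buf = self._buffers[buffer_id]
            buf.acquired += 1
            self._spillable.discard(buffer_id)
            return buf

    def release(self, buf):
        with self._lock:
            if buf.acquired <= 0:
                raise ValueError("release without a matching acquire")
            buf.acquired -= 1
            # Re-check: a leaked batch may still hold a reference.
            self._update(buf)

    def update_spillability(self, buf):
        # Stand-in for the ref count callback firing after a reference closes.
        with self._lock:
            self._update(buf)

    def spill_candidates(self):
        with self._lock:
            return sorted(self._spillable)


pool = SketchSpillPool()
pool.add(SketchBuffer("b1"))
buf = pool.acquire("b1")   # "b1" is no longer a spill candidate
buf.ref_count += 1         # caller "leaks" a batch by taking a reference
pool.release(buf)          # still not spillable: ref count is 2
buf.ref_count -= 1         # caller closes the leaked batch
pool.update_spillability(buf)  # now spillable again
```

Counting acquisitions rather than using a boolean lets several callers hold the same buffer at once; the buffer only becomes eligible again when the last one releases it and the ref count check also passes.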