colexec: add more redundancy to releasing disk resources #81562
As a couple of recently found issues showed, making sure that all disk resources are released can be tricky, since disk-backed operators can form large graphs with multiple external operators supporting a single operation. This commit makes the release of disk resources more bullet-proof by auditing all users of the vectorized disk queues to make sure they are added to `OpWithMetaInfo.ToClose`, whose entries are closed on the flow cleanup. Since `Close` can be safely called multiple times, this adds some redundancy, erring on the side of caution.

In particular, the following changes are made:
- external distinct and external hash aggregators are explicitly added to the `ToClose` slice. They should already be closed by the `diskSpillerBase`, but it doesn't hurt to close them explicitly.
- the window aggregator operator has been refactored so that it doesn't throw an error in its `Close` method. With the previous version it was possible to panic during the `Close` execution and possibly leak some resources.
- signatures of the constructor methods have been adjusted to return `ClosableOperator` to make the need for closing more explicit.
- each router output is now a `Closer`, and the consumer of each output is now responsible for closing it. Again, I'm pretty sure that each output will have been closed by the time the consumer explicitly tries to close it, yet there is no harm in closing it twice.

An additional minor cleanup is the removal of the usage of an embedded context in a couple of `Close` implementations, given that the function takes it as an argument.

Release note: None
Reviewed 28 of 28 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @michae2)
Reviewed 21 of 28 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)
TFTRs!
bors r+
Build succeeded.