sql: pool some allocations during flow setup #42809
Previously, the last processor ('headProc') would be removed from f.processors before flow.startInternal to be run in the same goroutine as the one that is doing the setup. However, this was problematic because if that processor implements 'Releasable' interface, it will not be returned to the pool on the flow clean up. Now this is fixed. Release note: None
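The problem described in this commit message can be reproduced in isolation. The sketch below is a simplified model, not the actual flowinfra code: the `processor` interface, `pooledProc`, `flow.run`, and `countReleased` are hypothetical names, and `Releasable` is a local stand-in for the interface the commit mentions. It shows that dropping the head processor from the tracked slice before running it means cleanup never sees it.

```go
package main

import (
	"fmt"
	"sync"
)

// Releasable is a local stand-in for the interface named in the commit message.
type Releasable interface {
	Release()
}

// processor is a hypothetical minimal processor interface.
type processor interface {
	Run()
}

// pooledProc is a hypothetical processor that must be released after use.
type pooledProc struct {
	released bool
}

func (p *pooledProc) Run()     {}
func (p *pooledProc) Release() { p.released = true }

// flow is a simplified model of a flow that tracks its processors.
type flow struct {
	processors []processor
}

// run starts all but the last processor in goroutines and runs the last one
// (the head) in the calling goroutine. With removeHead set, the head is also
// dropped from f.processors, which models the old behavior.
func (f *flow) run(removeHead bool) {
	last := len(f.processors) - 1
	head, others := f.processors[last], f.processors[:last]
	if removeHead {
		f.processors = others
	}
	var wg sync.WaitGroup
	for _, p := range others {
		wg.Add(1)
		go func(p processor) {
			defer wg.Done()
			p.Run()
		}(p)
	}
	head.Run()
	wg.Wait()
}

// cleanup releases every processor the flow still tracks.
func (f *flow) cleanup() {
	for _, p := range f.processors {
		if r, ok := p.(Releasable); ok {
			r.Release()
		}
	}
}

// countReleased runs a three-processor flow and reports how many of the
// processors were released by cleanup.
func countReleased(removeHead bool) int {
	procs := []*pooledProc{{}, {}, {}}
	f := &flow{}
	for _, p := range procs {
		f.processors = append(f.processors, p)
	}
	f.run(removeHead)
	f.cleanup()
	n := 0
	for _, p := range procs {
		if p.released {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(countReleased(true))  // head dropped before cleanup: one processor is missed
	fmt.Println(countReleased(false)) // head stays tracked: all processors are released
}
```

In this model the fix amounts to leaving the head processor in the slice while still running it on the setup goroutine.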
74be757 to a3bc4f8
Thanks for looking into this so rapidly! I’m hopeful it will move the needle on a benchmark, but we’ll see 🤷♀️
I’m afk for the next few hours, but here’s a passing question.
pkg/sql/flowinfra/flow.go (outdated)
      // isVectorized indicates whether it is a vectorized flow.
      isVectorized bool
    - // processors contains a subset of the processors in the flow - the ones that
    + // Processors contains a subset of the processors in the flow - the ones that
Rather than exporting all of this stuff, would it be better to call into FlowBase.Cleanup() from the various flow implementations which embed it, and still have those implementations implement a Cleanup method?
Yes, you're right. At first I thought that cleaning up the memory monitoring infrastructure in vectorizedFlows would force us to copy-paste most of the logic, but rearranging the order of things slightly does the job. Done.
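The suggestion above (keep the base fields unexported and have each flow's Cleanup delegate to the embedded FlowBase.Cleanup) might look roughly like the sketch below. It is a toy model, not the code from the PR: the field names, the `steps` log, and `cleanupSteps` are hypothetical, and the ordering (vectorized-specific cleanup first, shared cleanup second) is an assumption, since the thread only says the order was rearranged.

```go
package main

import "fmt"

// FlowBase is a simplified stand-in for the shared base struct. Its fields
// stay unexported; only Cleanup is part of its surface.
type FlowBase struct {
	processors []string
	steps      []string // hypothetical log used to observe the call order
}

// Cleanup performs the cleanup shared by all flow implementations.
func (f *FlowBase) Cleanup() {
	f.processors = nil
	f.steps = append(f.steps, "FlowBase.Cleanup")
}

// vectorizedFlow embeds FlowBase and adds state of its own.
type vectorizedFlow struct {
	FlowBase
	memMonitors []string // hypothetical stand-in for memory monitoring state
}

// Cleanup handles the vectorized-specific state and then delegates to the
// embedded FlowBase. The order shown here is an assumption.
func (f *vectorizedFlow) Cleanup() {
	f.memMonitors = nil
	f.steps = append(f.steps, "vectorizedFlow: memory monitoring cleaned up")
	f.FlowBase.Cleanup()
}

// cleanupSteps builds a flow, cleans it up, and returns the recorded steps.
func cleanupSteps() []string {
	f := &vectorizedFlow{
		FlowBase:    FlowBase{processors: []string{"p1"}},
		memMonitors: []string{"m1"},
	}
	f.Cleanup()
	return f.steps
}

func main() {
	for _, s := range cleanupSteps() {
		fmt.Println(s)
	}
}
```

Because `vectorizedFlow` defines its own Cleanup, that method shadows the promoted one, and the explicit `f.FlowBase.Cleanup()` call is what reaches the shared logic.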
a3bc4f8 to 330c752
Thanks for jumping on this so quickly! Looking forward to seeing if it shows up in the top-level benchmarks. I can look into the table descriptors. It seems like there ought to be something straightforward we can do.
Reviewed 1 of 1 files at r1, 5 of 6 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @yuzefovich)
pkg/sql/colflow/vectorized_flow.go, line 125 at r2 (raw file):
    f.bufferingMemAccounts = nil
    f.bufferingMemMonitors = nil
    vectorizedFlowPool.Put(f)
I'd prefer *f = vectorizedFlow{}.
Is there a reason why you're holding on to the reference to the FlowBase?
pkg/sql/rowflow/row_based_flow.go, line 399 at r2 (raw file):
    // Release releases this rowBasedFlow back to the pool.
    func (f *rowBasedFlow) Release() {
        f.localStreams = nil
I'd prefer *f = rowBasedFlow{}.
Is there a reason why you're holding on to the reference to the FlowBase?
Previously, new structs for rowBasedFlow and vectorizedFlow would be allocated upon creation. This commit creates pools for both of them. flowinfra.Releasable interface is moved into execinfra package because now components from rowflow, rowexec, and colflow packages implement that. In order to actually be able to release the flow structs, I needed to create separate Cleanup methods (which still share most of the logic) which allows for removal of vectorized memory monitoring logic from the shared FlowCtx. Release note: None
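As a rough illustration of the pooling described in this commit message, the sketch below pairs a `sync.Pool` with a `Release` method. It is a minimal model, not the code from the PR: the struct fields, `newRowBasedFlow`, and the local `Releasable` declaration are assumptions made for the example; only `sync.Pool` and its `Get`/`Put` methods are real standard library API.

```go
package main

import (
	"fmt"
	"sync"
)

// Releasable is a local stand-in for the interface that the commit moves
// into the execinfra package.
type Releasable interface {
	Release()
}

// rowBasedFlow is a simplified stand-in; the fields are hypothetical.
type rowBasedFlow struct {
	id           int
	localStreams map[int]string
}

// rowBasedFlowPool hands out reusable flow structs instead of allocating a
// new one for every flow.
var rowBasedFlowPool = sync.Pool{
	New: func() interface{} { return &rowBasedFlow{} },
}

// newRowBasedFlow takes a struct from the pool and initializes it.
func newRowBasedFlow(id int) *rowBasedFlow {
	f := rowBasedFlowPool.Get().(*rowBasedFlow)
	f.id = id
	return f
}

// Release resets the flow and returns it to the pool. The caller must not
// use f afterwards.
func (f *rowBasedFlow) Release() {
	*f = rowBasedFlow{}
	rowBasedFlowPool.Put(f)
}

// Compile-time check that *rowBasedFlow satisfies Releasable.
var _ Releasable = (*rowBasedFlow)(nil)

func main() {
	f := newRowBasedFlow(1)
	f.localStreams = map[int]string{0: "stream"}
	f.Release()

	// Whether g reuses the struct released above is up to sync.Pool; either
	// way it starts out with no leftover streams.
	g := newRowBasedFlow(2)
	fmt.Println(g.id, len(g.localStreams))
}
```

Note that `sync.Pool` makes no promise that a released struct is the one handed out next, so correctness should never depend on reuse, only on the struct being clean when it is reused.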
330c752 to 834e866
Thanks for pointing out this issue!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jordanlewis)
pkg/sql/colflow/vectorized_flow.go, line 125 at r2 (raw file):
Previously, ajwerner wrote…
I'd prefer *f = vectorizedFlow{}.
Is there a reason why you're holding on to the reference to the FlowBase?
No reason, I forgot about such zeroing out, thanks. Done.
pkg/sql/rowflow/row_based_flow.go, line 399 at r2 (raw file):
Previously, ajwerner wrote…
I'd prefer *f = rowBasedFlow{}.
Is there a reason why you're holding on to the reference to the FlowBase?
Done.
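The difference between the two reset styles discussed in this thread can be seen in a small, self-contained example. This is only a sketch: the field types are hypothetical, and embedding FlowBase by pointer is an assumption based on the reviewer's mention of "the reference to the FlowBase".

```go
package main

import "fmt"

// FlowBase is a simplified stand-in for the shared base struct.
type FlowBase struct {
	name string
}

// vectorizedFlow embeds FlowBase by pointer (an assumption) and carries two
// hypothetical slices named after the fields in the review snippet.
type vectorizedFlow struct {
	*FlowBase
	bufferingMemAccounts []string
	bufferingMemMonitors []string
}

// resetFieldByField clears only the fields someone remembered to list, so
// the embedded *FlowBase stays reachable from the pooled struct.
func (f *vectorizedFlow) resetFieldByField() {
	f.bufferingMemAccounts = nil
	f.bufferingMemMonitors = nil
}

// resetWhole overwrites the struct with its zero value, which clears every
// field, including ones added later.
func (f *vectorizedFlow) resetWhole() {
	*f = vectorizedFlow{}
}

func newFlow() *vectorizedFlow {
	return &vectorizedFlow{
		FlowBase:             &FlowBase{name: "flow"},
		bufferingMemAccounts: []string{"acc"},
		bufferingMemMonitors: []string{"mon"},
	}
}

func main() {
	a := newFlow()
	a.resetFieldByField()
	fmt.Println(a.FlowBase != nil) // the stale reference is retained

	b := newFlow()
	b.resetWhole()
	fmt.Println(b.FlowBase != nil) // nothing is retained
}
```

Assigning the zero value also keeps the reset correct as the struct grows, since no per-field list has to be maintained.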
Reviewed 2 of 2 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis)
TFTR!

bors r+
42809: sql: pool some allocations during flow setup r=yuzefovich a=yuzefovich

**flowinfra: slightly tweak the setup of processors in the flow**

Previously, the last processor ('headProc') would be removed from f.processors before flow.startInternal to be run in the same goroutine as the one that is doing the setup. However, this was problematic because if that processor implements 'Releasable' interface, it will not be returned to the pool on the flow clean up. Now this is fixed.

Addresses: #42770.

Release note: None

**sql: pool flow allocations**

Previously, new structs for rowBasedFlow and vectorizedFlow would be allocated upon creation. This commit creates pools for both of them. flowinfra.Releasable interface is moved into execinfra package because now components from rowflow, rowexec, and colflow packages implement that.

In order to actually be able to release the flow structs, I needed to create separate Cleanup methods (which still share most of the logic) which allows for removal of vectorized memory monitoring logic from the shared FlowCtx.

Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Build succeeded
Not too shabby! I ran it with a higher concurrency the first time and it didn't show up as well:
I'm going to chalk it up as a real win.