Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: pool some allocations during flow setup #42809

Merged
merged 2 commits into from
Nov 27, 2019

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Nov 26, 2019

flowinfra: slightly tweak the setup of processors in the flow

Previously, the last processor ('headProc') would be removed from
f.processors before flow.startInternal to be run in the same goroutine
as the one that is doing the setup. However, this was problematic
because if that processor implements 'Releasable' interface, it will not
be returned to the pool on the flow clean up. Now this is fixed.

Addresses: #42770.

Release note: None

sql: pool flow allocations

sql: pool flow allocations

Previously, new structs for rowBasedFlow and vectorizedFlow would be
allocated upon creation. This commit creates pools for both of them.
flowinfra.Releasable interface is moved into execinfra package because
now components from rowflow, rowexec, and colflow packages implement
that.

In order to actually be able to release the flow structs, I needed to
create separate Cleanup methods (which still share most of the logic)
which allows for removal of vectorized memory monitoring logic from
the shared FlowCtx.

Release note: None

Previously, the last processor ('headProc') would be removed from
f.processors before flow.startInternal to be run in the same goroutine
as the one that is doing the setup. However, this was problematic
because if that processor implements 'Releasable' interface, it will not
be returned to the pool on the flow clean up. Now this is fixed.

Release note: None
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@yuzefovich yuzefovich changed the title flowinfra: slightly tweak the setup of processors in the flow sql: pool some allocations during flow setup Nov 26, 2019
@yuzefovich yuzefovich force-pushed the release-head-proc branch 2 times, most recently from 74be757 to a3bc4f8 Compare November 27, 2019 00:14
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this so rapidly! I’m hopeful it will move a needle on a benchmark but we’ll see 🤷‍♀️

I’m afk for the next few hours but here’s a passing question.

// isVectorized indicates whether it is a vectorized flow.
isVectorized bool
// processors contains a subset of the processors in the flow - the ones that
// Processors contains a subset of the processors in the flow - the ones that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than exporting all of this stuff would it be better to call into FlowBase.Cleanup() from the various flow implementations which embed it and still have those implementations implement a Cleanup method?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. At first I thought that cleaning up of memory monitoring infrastructure in vectorizedFlows forces us to actually copy-paste most of the logic, but rearranging the order of things slightly does the job. Done.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for jumping on this so quickly! Looking forward to seeing if it show up in the top-level benchmarks. I can look into the table descriptors. It seems like there ought to be something straight-forward we can do.

Reviewed 1 of 1 files at r1, 5 of 6 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @yuzefovich)


pkg/sql/colflow/vectorized_flow.go, line 125 at r2 (raw file):

	f.bufferingMemAccounts = nil
	f.bufferingMemMonitors = nil
	vectorizedFlowPool.Put(f)

I'd prefer*f = vectorizedFlow{}

Is there a reason while you're holding on to the reference to the FlowBase?


pkg/sql/rowflow/row_based_flow.go, line 399 at r2 (raw file):

// Release releases this rowBasedFlow back to the pool.
func (f *rowBasedFlow) Release() {
	f.localStreams = nil

I'd prefer *f = rowBasedFlow{}.

Is there a reason while you're holding on to the reference to the FlowBase?

Previously, new structs for rowBasedFlow and vectorizedFlow would be
allocated upon creation. This commit creates pools for both of them.
flowinfra.Releasable interface is moved into execinfra package because
now components from rowflow, rowexec, and colflow packages implement
that.

In order to actually be able to release the flow structs, I needed to
create separate Cleanup methods (which still share most of the logic)
which allows for removal of vectorized memory monitoring logic from
the shared FlowCtx.

Release note: None
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out this issue!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jordanlewis)


pkg/sql/colflow/vectorized_flow.go, line 125 at r2 (raw file):

Previously, ajwerner wrote…

I'd prefer*f = vectorizedFlow{}

Is there a reason while you're holding on to the reference to the FlowBase?

No reason, I forgot about such zeroing out, thanks. Done.


pkg/sql/rowflow/row_based_flow.go, line 399 at r2 (raw file):

Previously, ajwerner wrote…

I'd prefer *f = rowBasedFlow{}.

Is there a reason while you're holding on to the reference to the FlowBase?

Done.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis)

@yuzefovich
Copy link
Member Author

TFTR!

bors r+

craig bot pushed a commit that referenced this pull request Nov 27, 2019
42809: sql: pool some allocations during flow setup r=yuzefovich a=yuzefovich

**flowinfra: slightly tweak the setup of processors in the flow**

Previously, the last processor ('headProc') would be removed from
f.processors before flow.startInternal to be run in the same goroutine
as the one that is doing the setup. However, this was problematic
because if that processor implements 'Releasable' interface, it will not
be returned to the pool on the flow clean up. Now this is fixed.

Addresses: #42770.

Release note: None

**sql: pool flow allocations**

sql: pool flow allocations

Previously, new structs for rowBasedFlow and vectorizedFlow would be
allocated upon creation. This commit creates pools for both of them.
flowinfra.Releasable interface is moved into execinfra package because
now components from rowflow, rowexec, and colflow packages implement
that.

In order to actually be able to release the flow structs, I needed to
create separate Cleanup methods (which still share most of the logic)
which allows for removal of vectorized memory monitoring logic from
the shared FlowCtx.

Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Nov 27, 2019

Build succeeded

@craig craig bot merged commit 834e866 into cockroachdb:master Nov 27, 2019
@ajwerner
Copy link
Contributor

name             old ops/s   new ops/s   delta
KV95-throughput  14.3k ± 2%  14.9k ± 2%  +4.08%  (p=0.008 n=5+5)

name             old ms/s    new ms/s    delta
KV95-P50          9.10 ± 3%   8.90 ± 0%    ~     (p=0.444 n=5+5)
KV95-Avg          2.72 ± 4%   2.54 ± 2%  -6.62%  (p=0.048 n=5+5)

Not too shabby! I ran it with a higher concurrency the first time and it didn't show up as well:

name             old ops/s   new ops/s   delta
KV95-throughput  15.0k ± 2%  15.6k ± 4%    ~     (p=0.056 n=5+5)

name             old ms/s    new ms/s    delta
KV95-P50          11.5 ± 5%   11.3 ± 7%    ~     (p=0.492 n=5+5)
KV95-Avg          3.40 ± 0%   3.30 ± 0%  -2.94%  (p=0.029 n=4+4)

I'm going to chalk it up as a real win.

@yuzefovich yuzefovich deleted the release-head-proc branch November 28, 2019 04:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants