storage: increase the load-based splitting QPS threshold from 250 to 2500 #39687

@ajwerner ajwerner commented Aug 15, 2019

Recent improvements in write efficiency and batching led to a re-evaluation
of the reasons for load-based splitting. Intuitively, fewer splits ought to
offer increased batching opportunities, while more splits ought to offer
increased concurrency. Beyond a certain number of splits, it is not obvious that
increased concurrency will translate effectively into increased parallelism.

Experimental evidence shows that the right threshold for load-based splitting
is now closer to 2500 than 250. It also shows that over-splitting can have
negative effects on latency and throughput.

Load-based splitting also remains important for the opportunity it
provides to balance load. Load balancing, however, is not currently part of the
splitting heuristic.

The second commit in the PR adds roachtests which do not perform any manual splits.

Release note (performance improvement): Adjust load-based splitting QPS
threshold to avoid over-splitting.
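
To make the effect of the threshold change concrete, here is a minimal sketch in Go. It is not the actual CockroachDB split heuristic: the `splitDecider` type, the `shouldSplit` and `countSplitCandidates` functions, the strict `>` comparison, and the per-range QPS figures are all hypothetical and chosen only for illustration. It shows how raising the threshold from 250 to 2500 shrinks the set of ranges that would be considered for a load-based split.

```go
package main

import "fmt"

// splitDecider is a hypothetical, simplified stand-in for a load-based
// split heuristic. It only compares a range's QPS against a threshold.
type splitDecider struct {
	qpsThreshold float64
}

// shouldSplit reports whether a range's measured QPS exceeds the threshold.
// The strict comparison is an assumption made for this sketch.
func (d splitDecider) shouldSplit(rangeQPS float64) bool {
	return rangeQPS > d.qpsThreshold
}

// countSplitCandidates returns how many ranges exceed the decider's threshold.
func countSplitCandidates(d splitDecider, perRangeQPS []float64) int {
	n := 0
	for _, qps := range perRangeQPS {
		if d.shouldSplit(qps) {
			n++
		}
	}
	return n
}

func main() {
	// Made-up per-range QPS values, not measurements from this PR.
	load := []float64{100, 300, 1200, 2600, 4000}

	oldDecider := splitDecider{qpsThreshold: 250}
	newDecider := splitDecider{qpsThreshold: 2500}

	fmt.Println("candidates at 250 QPS: ", countSplitCandidates(oldDecider, load))
	fmt.Println("candidates at 2500 QPS:", countSplitCandidates(newDecider, load))
}
```

With this made-up load, four of the five ranges exceed the old threshold but only two exceed the new one, which is the kind of reduction in splitting the change aims for.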

Before this PR we always ran our KV benchmarks with 1000 manual splits.
This is almost certainly too many. We preserve this value for the historical
continuity of the tests but add new configurations which do no manual
splitting.

Release note: None
@ajwerner ajwerner force-pushed the ajwerner/change-load-based-split-qps-threshold branch from b5f8d1b to 0099761 Compare August 15, 2019 14:20
@ajwerner ajwerner marked this pull request as ready for review August 15, 2019 14:23
@ajwerner
Contributor Author

TFTR!

I ran a quick benchmark of kv0 with different QPS thresholds and here's what I got:

name                                                           ops/s
Cockroach-concurrency=1024-qps_thresh=250-reads=0-throughput   18.1k ± 1%
Cockroach-concurrency=1024-qps_thresh=2500-reads=0-throughput  20.7k ± 2%
Cockroach-concurrency=1024-qps_thresh=5000-reads=0-throughput  19.9k ± 1%

Seems to support that we've picked a good number.
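
For readers who want the relative differences spelled out, the following Go sketch computes them from the mean throughput values in the table above. The `relativeChangePct` helper is a hypothetical name introduced here; the calculation uses only the reported means and ignores the ± variation, so treat the results as rough.

```go
package main

import "fmt"

// relativeChangePct returns the percentage change of other relative to base.
// It is a hypothetical helper written for this illustration.
func relativeChangePct(base, other float64) float64 {
	return (other - base) / base * 100
}

func main() {
	// Mean ops/s from the benchmark table, in thousands.
	const (
		thresh250  = 18.1
		thresh2500 = 20.7
		thresh5000 = 19.9
	)

	fmt.Printf("2500 vs 250:  %+.1f%%\n", relativeChangePct(thresh250, thresh2500))
	fmt.Printf("5000 vs 250:  %+.1f%%\n", relativeChangePct(thresh250, thresh5000))
	fmt.Printf("5000 vs 2500: %+.1f%%\n", relativeChangePct(thresh2500, thresh5000))
}
```

On these means, the 2500 threshold is roughly 14% faster than 250, and 5000 is a few percent slower than 2500.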

bors r+

craig bot pushed a commit that referenced this pull request Aug 16, 2019
39687: storage: increase the load-based splitting QPS threshold from 250 to 2500 r=ajwerner a=ajwerner

Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>
craig bot commented Aug 16, 2019

Build succeeded

@craig craig bot merged commit 0099761 into cockroachdb:master Aug 16, 2019