kvserver: introduce cpu rebalancing #95152

Closed
wants to merge 5 commits into from

kvoli
Collaborator

@kvoli kvoli commented Jan 12, 2023

This series of commits introduces using CPU as a replacement for QPS in rebalancing and splitting.

The results when using CPU in comparison to QPS can be found here (internal).

CPU load based rebalancing and splitting is not enabled by default in this PR.

resolves: #95380
resolves: #95377

@cockroach-teamcity
Member

This change is Reviewable

@kvoli kvoli force-pushed the 230110.cpu-store-rebalancing branch 11 times, most recently from 9be528b to 0743a60 Compare January 13, 2023 19:30
@kvoli kvoli changed the title kvserver: instrument store cpu based rebalancing kvserver: [wip] instrument store cpu based rebalancing Jan 13, 2023
@kvoli kvoli force-pushed the 230110.cpu-store-rebalancing branch 6 times, most recently from 9da25a6 to b32c697 Compare January 17, 2023 21:22
@kvoli kvoli changed the title kvserver: [wip] instrument store cpu based rebalancing kvserver: [wip] cpu rebalancing Jan 20, 2023
@kvoli kvoli changed the title kvserver: [wip] cpu rebalancing kvserver: [wip] add store cpu rebalancing Jan 20, 2023
@kvoli kvoli force-pushed the 230110.cpu-store-rebalancing branch 9 times, most recently from e77b07f to 9d8af22 Compare January 24, 2023 17:34
@kvoli kvoli force-pushed the 230110.cpu-store-rebalancing branch 3 times, most recently from 96eb4e4 to dd3b25f Compare January 24, 2023 23:54
@kvoli kvoli self-assigned this Jan 24, 2023
@kvoli kvoli changed the title kvserver: [wip] add store cpu rebalancing kvserver: add store cpu rebalancing Jan 24, 2023
@kvoli kvoli changed the title kvserver: add store cpu rebalancing kvserver: introduce cpu rebalancing Jan 24, 2023
@kvoli kvoli force-pushed the 230110.cpu-store-rebalancing branch 9 times, most recently from 9d76d35 to ded17a7 Compare January 26, 2023 20:08
Previously `Queries` was hardcoded as the dimension to use when creating
test ranges for use in store rebalancer tests. This patch enables
passing in any `dimension`.

Release note: None
Previously, when estimating the impact a lease transfer would have, we
would not differentiate between rebalance/transfer. This commit adds a
utility method `TransferImpact` to `RangeUsageInfo` which is now used
when making estimations about lease transfers.

Currently, this is identical to `Load`, which was previously used
instead for transfers.
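
As a minimal sketch of the relationship described above (not the actual kvserver code): the `LoadDimension` and `Load` types and the field names on `RangeUsageInfo` are hypothetical stand-ins; only the method names `Load` and `TransferImpact` come from this commit message. Routing transfer estimates through their own method presumably leaves room for the two to diverge later.

```go
package main

import "fmt"

// LoadDimension identifies one load signal. This type and its values are
// hypothetical stand-ins for illustration.
type LoadDimension int

const (
	Queries LoadDimension = iota
	CPU
)

// Load holds one value per dimension (a hypothetical simplification).
type Load map[LoadDimension]float64

// RangeUsageInfo carries per-range usage. The field names are assumed.
type RangeUsageInfo struct {
	QueriesPerSecond  float64
	CPUNanosPerSecond float64
}

// Load is the estimated load moved when a replica is rebalanced.
func (r RangeUsageInfo) Load() Load {
	return Load{Queries: r.QueriesPerSecond, CPU: r.CPUNanosPerSecond}
}

// TransferImpact is the estimated load moved by a lease transfer. In this
// sketch it simply delegates to Load, matching "identical to Load" above.
func (r RangeUsageInfo) TransferImpact() Load {
	return r.Load()
}

func main() {
	r := RangeUsageInfo{QueriesPerSecond: 500, CPUNanosPerSecond: 2e8}
	// Both estimates agree on every dimension.
	fmt.Println(r.Load()[CPU] == r.TransferImpact()[CPU])
	fmt.Println(r.Load()[Queries] == r.TransferImpact()[Queries])
}
```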

Release note: None
This patch allows the store rebalancer to use CPU in place of QPS when
balancing load on a cluster. This patch adds `cpu` as an option with the
cluster setting:

`kv.allocator.load_based_rebalancing.objective`

When set to `cpu` rather than `qps`, the store rebalancer performs a
mostly identical function, but balances the sum of all replicas' cpu
time on each store rather than qps. The default remains `qps`.

Similar to QPS, a rebalance threshold controls how far above or below
the mean store CPU a store must be before it is considered imbalanced,
i.e. overfull or underfull respectively:

`kv.allocator.cpu_rebalance_threshold`: 0.1
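
To illustrate how a threshold of 0.1 could partition stores, here is a rough sketch. It assumes the threshold is applied as a fraction of the mean store CPU; the real allocator may apply additional bounds. The names `classify`, `mean`, and `storeStatus` are hypothetical and do not come from the patch.

```go
package main

import "fmt"

// storeStatus is a hypothetical label for a store relative to the mean.
type storeStatus string

const (
	underfull storeStatus = "underfull"
	balanced  storeStatus = "balanced"
	overfull  storeStatus = "overfull"
)

// mean returns the arithmetic mean of vals, or 0 for an empty slice.
func mean(vals []float64) float64 {
	if len(vals) == 0 {
		return 0
	}
	sum := 0.0
	for _, v := range vals {
		sum += v
	}
	return sum / float64(len(vals))
}

// classify compares a store's CPU against the mean, assuming the threshold
// is a fraction of the mean (an assumption made for this sketch).
func classify(storeCPU, meanCPU, threshold float64) storeStatus {
	switch {
	case storeCPU > meanCPU*(1+threshold):
		return overfull
	case storeCPU < meanCPU*(1-threshold):
		return underfull
	default:
		return balanced
	}
}

func main() {
	// CPU time per second on each store, in milliseconds (made-up values).
	stores := []float64{100, 200, 300}
	m := mean(stores)
	for i, cpu := range stores {
		fmt.Printf("store %d: %s\n", i+1, classify(cpu, m, 0.1))
	}
}
```

With these made-up values the mean is 200, so the balanced band is roughly 180 to 220 and the three stores land in underfull, balanced, and overfull respectively.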

To handle mixed-version clusters during an upgrade, and architectures
that do not support the cpu sampling method, a rebalance objective
manager is introduced in `rebalance_objective.go`. The manager
mediates access to the rebalance objective and overwrites it in cases
where the objective set in the cluster setting cannot be supported.
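
The override can be sketched roughly as follows. This is an illustration only: `Objective`, `resolveObjective`, and the two boolean inputs are hypothetical names, and falling back to `qps` is an assumption based on `qps` being the default.

```go
package main

import "fmt"

// Objective is a hypothetical type for the rebalance objective setting.
type Objective string

const (
	QPS Objective = "qps"
	CPU Objective = "cpu"
)

// resolveObjective returns the objective to use given the cluster setting.
// It assumes that an unsupported `cpu` objective is overridden to `qps`.
func resolveObjective(setting Objective, clusterVersionOK, cpuSamplingOK bool) Objective {
	if setting == CPU && (!clusterVersionOK || !cpuSamplingOK) {
		return QPS
	}
	return setting
}

func main() {
	// Mixed-version cluster: cpu is requested but cannot be honored yet.
	fmt.Println(resolveObjective(CPU, false, true))
	// Fully upgraded cluster on a supported architecture.
	fmt.Println(resolveObjective(CPU, true, true))
}
```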

resolves: cockroachdb#95380

Release note (ops change): Add option to balance cpu time (cpu)
instead of queries per second (qps) among stores in a cluster. This is
done by setting `kv.allocator.load_based_rebalancing.objective='cpu'`.
`kv.allocator.cpu_rebalance_threshold` is also added, similar to
`kv.allocator.qps_rebalance_threshold` to control the target range for
store cpu above and below the cluster mean.
This patch removes the deprecated 'lastSplitQPS' value throughout the
split/merge code. This field was deprecated in 22.1 in favor of
`maxSplitQPS` and stopped being consulted in 22.2.

Now only `maxSplitQPS` is consulted and set in `RangeStatsResponse`.

Release note: None
This commit adds the ability to perform load based splitting with replica
cpu usage rather than queries per second. Load based splitting now will
use either cpu or qps for deciding split points, depending on the
cluster setting `kv.allocator.load_based_rebalancing.objective`.

When set to `qps`, qps is used in deciding split points and when
splitting should occur; similarly, `cpu` means that request cpu against
the leaseholder replica is used to decide split points.

The split threshold when using `cpu` is the cluster setting
`kv.range_split.load_cpu_threshold`, which defaults to `250ms` of cpu
time per second, i.e. a replica consistently using a quarter of one
processor.

The merge queue uses the load based splitter to make decisions on
whether to merge two adjacent ranges due to low load. This commit also
updates the merge queue to be consistent with the load based splitter
signal. When switching between `qps` and `cpu`, the load based splitter
for every replica is reset to avoid spurious results.
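
One way to picture the reset on an objective switch is the sketch below. The `splitter` type and its `record` method are hypothetical and far simpler than the real load based splitter; the point is only that samples gathered under one objective are discarded when the objective changes, since qps and cpu values are not comparable.

```go
package main

import "fmt"

// splitter is a hypothetical, simplified per-replica load tracker.
type splitter struct {
	objective string
	samples   []float64
}

// record stores a load sample. If the objective differs from the one the
// existing samples were gathered under, the samples are dropped first.
func (s *splitter) record(objective string, v float64) {
	if objective != s.objective {
		s.objective = objective
		s.samples = s.samples[:0]
	}
	s.samples = append(s.samples, v)
}

func main() {
	s := &splitter{objective: "qps"}
	s.record("qps", 1200)
	s.record("qps", 1350)
	// Switching to cpu discards the qps samples before recording.
	s.record("cpu", 0.3)
	fmt.Println(s.objective, len(s.samples))
}
```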

resolves: cockroachdb#95377

Release note (ops change): Load based splitter now supports using request
cpu usage to split ranges. This is introduced with the previous cluster
setting `kv.allocator.load_based_rebalancing.objective`, which when set
to `cpu`, will use request cpu usage. The threshold above which
CPU usage of a range is considered for splitting is defined in the
cluster setting `kv.range_split.load_cpu_threshold`, which has a default
value of `250ms`.
@kvoli kvoli force-pushed the 230110.cpu-store-rebalancing branch from ded17a7 to 05cbe42 Compare January 27, 2023 17:47
@kvoli
Collaborator Author

kvoli commented Jan 27, 2023

separated into #96127 and #96128

@kvoli kvoli closed this Jan 27, 2023
Development

Successfully merging this pull request may close these issues.

kvserver: store cpu rebalancing
kvserver: cpu based load splitting
2 participants