kvserver: introduce cpu rebalancing #95152
Closed
kvoli force-pushed the 230110.cpu-store-rebalancing branch 11 times, most recently from 9be528b to 0743a60 (January 13, 2023 19:30)
kvoli changed the title from "kvserver: instrument store cpu based rebalancing" to "kvserver: [wip] instrument store cpu based rebalancing" (Jan 13, 2023)
kvoli force-pushed the 230110.cpu-store-rebalancing branch 6 times, most recently from 9da25a6 to b32c697 (January 17, 2023 21:22)
kvoli changed the title from "kvserver: [wip] instrument store cpu based rebalancing" to "kvserver: [wip] cpu rebalancing" (Jan 20, 2023)
kvoli changed the title from "kvserver: [wip] cpu rebalancing" to "kvserver: [wip] add store cpu rebalancing" (Jan 20, 2023)
kvoli force-pushed the 230110.cpu-store-rebalancing branch 9 times, most recently from e77b07f to 9d8af22 (January 24, 2023 17:34)
kvoli force-pushed the 230110.cpu-store-rebalancing branch 3 times, most recently from 96eb4e4 to dd3b25f (January 24, 2023 23:54)
kvoli changed the title from "kvserver: [wip] add store cpu rebalancing" to "kvserver: add store cpu rebalancing" (Jan 24, 2023)
kvoli changed the title from "kvserver: add store cpu rebalancing" to "kvserver: introduce cpu rebalancing" (Jan 24, 2023)
kvoli force-pushed the 230110.cpu-store-rebalancing branch 9 times, most recently from 9d76d35 to ded17a7 (January 26, 2023 20:08)
Previously, `Queries` was hardcoded as the dimension used when creating test ranges for store rebalancer tests. This patch enables passing in any `dimension`.
Release note: None
Previously, when estimating the impact a lease transfer would have, we did not differentiate between a rebalance and a transfer. This commit adds a utility method `TransferImpact` to `RangeUsageInfo`, which is now used when making estimations about lease transfers. Currently, this is identical to `Load`, which was previously used for transfers.
Release note: None
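To illustrate why a separate `TransferImpact` hook is useful even while it is identical to `Load`, here is a minimal sketch. Everything in it is hypothetical: the `load` and `rangeUsageInfo` types, their fields, and `estimateAfterTransfer` only borrow the method names from the commit message and are not the actual kvserver definitions.

```go
package main

import "fmt"

// load is a hypothetical two-dimensional load vector used only in this sketch.
type load struct {
	qps float64 // queries per second
	cpu float64 // CPU milliseconds consumed per second
}

// rangeUsageInfo is a simplified stand-in for the real RangeUsageInfo.
type rangeUsageInfo struct {
	qps, cpu float64
}

// Load returns the full load of the range, as used for replica rebalancing.
func (r rangeUsageInfo) Load() load { return load{qps: r.qps, cpu: r.cpu} }

// TransferImpact returns the load expected to move with a lease transfer.
// As in the commit message, it is currently identical to Load; having a
// separate method lets the two estimates diverge later without touching callers.
func (r rangeUsageInfo) TransferImpact() load { return r.Load() }

// estimateAfterTransfer is a hypothetical helper that estimates source and
// target store load after the lease for r moves from source to target.
func estimateAfterTransfer(source, target load, r rangeUsageInfo) (load, load) {
	impact := r.TransferImpact()
	source.qps -= impact.qps
	source.cpu -= impact.cpu
	target.qps += impact.qps
	target.cpu += impact.cpu
	return source, target
}

func main() {
	r := rangeUsageInfo{qps: 200, cpu: 50}
	src, dst := estimateAfterTransfer(load{qps: 1000, cpu: 400}, load{qps: 600, cpu: 250}, r)
	fmt.Printf("source after: %+v\ntarget after: %+v\n", src, dst)
}
```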
This patch allows the store rebalancer to use CPU in place of QPS when balancing load on a cluster. It adds `cpu` as an option for the cluster setting `kv.allocator.load_based_rebalancing.objective`.
When the setting is `cpu` rather than `qps`, the store rebalancer performs a mostly identical function, but targets balancing the sum of all replicas' CPU time on each store rather than QPS. The default remains `qps`.
Similar to QPS, a rebalance threshold controls the range above and below the mean store CPU outside of which a store is considered imbalanced, either overfull or underfull respectively: `kv.allocator.cpu_rebalance_threshold`: 0.1
To handle mixed versions during an upgrade, and architectures that do not support the CPU sampling method, a rebalance objective manager is introduced in `rebalance_objective.go`. The manager mediates access to the rebalance objective and overwrites it when the objective set in the cluster setting cannot be supported.
resolves: cockroachdb#95380
Release note (ops change): Add option to balance cpu time (cpu) instead of queries per second (qps) among stores in a cluster. This is done by setting `kv.allocator.load_based_rebalancing.objective='cpu'`. `kv.allocator.cpu_rebalance_threshold` is also added, similar to `kv.allocator.qps_rebalance_threshold`, to control the target range for store cpu above and below the cluster mean.
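The two mechanisms described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `rebalance_objective.go`: the function names, the choice to fall back to `qps`, and the interpretation of the threshold as a fraction of the mean are all assumptions made for the example.

```go
package main

import "fmt"

// objective is the load dimension the store rebalancer balances on.
type objective string

const (
	qps objective = "qps"
	cpu objective = "cpu"
)

// resolveObjective is a hypothetical stand-in for the rebalance objective
// manager. Assumption: when cpu is requested but the cluster is mid-upgrade
// or CPU sampling is unsupported, the objective is overwritten with qps.
func resolveObjective(setting objective, allNodesUpgraded, cpuSamplingSupported bool) objective {
	if setting == cpu && (!allNodesUpgraded || !cpuSamplingSupported) {
		return qps
	}
	return setting
}

// meanOf returns the arithmetic mean of vals, or 0 for an empty slice.
func meanOf(vals []float64) float64 {
	if len(vals) == 0 {
		return 0
	}
	sum := 0.0
	for _, v := range vals {
		sum += v
	}
	return sum / float64(len(vals))
}

// classify labels a store relative to the cluster mean. Assumption: the
// threshold (0.1 by default for CPU) is a fraction of the mean, so a store is
// overfull above mean*(1+threshold) and underfull below mean*(1-threshold).
func classify(storeLoad, mean, threshold float64) string {
	switch {
	case storeLoad > mean*(1+threshold):
		return "overfull"
	case storeLoad < mean*(1-threshold):
		return "underfull"
	default:
		return "balanced"
	}
}

func main() {
	obj := resolveObjective(cpu, true, true)
	// Illustrative per-store CPU usage in milliseconds per second.
	cpuPerStore := []float64{900, 1000, 1100, 1400}
	m := meanOf(cpuPerStore)
	for i, c := range cpuPerStore {
		fmt.Printf("objective=%s s%d: %s\n", obj, i+1, classify(c, m, 0.1))
	}
}
```

With a mean of 1100 and a threshold of 0.1, the balanced band in this sketch is 990 to 1210, so only the first and last stores would be candidates for rebalancing.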
This patch removes the deprecated `lastSplitQPS` value throughout the split/merge code. This field was deprecated in 22.1 in favor of `maxSplitQPS` and stopped being consulted in 22.2. Now only `maxSplitQPS` is consulted and set in `RangeStatsResponse`.
Release note: None
This commit adds the ability to perform load based splitting with replica cpu usage rather than queries per second. Load based splitting will now use either cpu or qps for deciding split points, depending on the cluster setting `kv.allocator.load_based_rebalancing.objective`.
When set to `qps`, qps is used to decide split points and when splitting should occur; similarly, `cpu` means that request cpu against the leaseholder replica is used to decide split points.
The split threshold when using `cpu` is the cluster setting `kv.range_split.load_cpu_threshold`, which defaults to `250ms` of cpu time per second, i.e. a replica consistently using 1/4 of a processor.
The merge queue uses the load based splitter to decide whether to merge two adjacent ranges due to low load. This commit also updates the merge queue to be consistent with the load based splitter signal. When switching between `qps` and `cpu`, the load based splitter for every replica is reset to avoid spurious results.
resolves: cockroachdb#95377
Release note (ops change): Load based splitter now supports using request cpu usage to split ranges. This is introduced with the previous cluster setting `kv.allocator.load_based_rebalancing.objective`, which when set to `cpu`, will use request cpu usage. The threshold above which CPU usage of a range is considered for splitting is defined in the cluster setting `kv.range_split.load_cpu_threshold`, which has a default value of `250ms`.
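A rough sketch of the split decision described here, assuming a simplified model: the `replicaLoad` and `splitter` types and the `shouldConsiderSplit` helper are hypothetical, and the QPS threshold value is an assumption chosen for the example. Only the `250ms` CPU default and the reset-on-switch behavior come from the commit message.

```go
package main

import (
	"fmt"
	"time"
)

type objective string

const (
	qps objective = "qps"
	cpu objective = "cpu"
)

const (
	// Default of kv.range_split.load_cpu_threshold per the commit message:
	// 250ms of CPU time per second, i.e. a quarter of one processor.
	cpuSplitThreshold = 250 * time.Millisecond
	// Assumed QPS threshold, used only to make the sketch complete.
	qpsSplitThreshold = 2500.0
)

// replicaLoad is a hypothetical snapshot of a leaseholder replica's load.
type replicaLoad struct {
	queriesPerSecond float64
	cpuPerSecond     time.Duration
}

// shouldConsiderSplit reports whether the load exceeds the threshold for the
// active objective.
func shouldConsiderSplit(obj objective, l replicaLoad) bool {
	if obj == cpu {
		return l.cpuPerSecond > cpuSplitThreshold
	}
	return l.queriesPerSecond > qpsSplitThreshold
}

// splitter is a hypothetical per-replica splitter holding recorded samples.
type splitter struct {
	obj     objective
	samples []float64
}

// setObjective discards recorded samples when the objective changes, since
// qps samples and cpu samples are not comparable.
func (s *splitter) setObjective(obj objective) {
	if s.obj != obj {
		s.obj = obj
		s.samples = nil
	}
}

func main() {
	l := replicaLoad{queriesPerSecond: 400, cpuPerSecond: 300 * time.Millisecond}
	fmt.Println("split under qps:", shouldConsiderSplit(qps, l))
	fmt.Println("split under cpu:", shouldConsiderSplit(cpu, l))

	s := &splitter{obj: qps, samples: []float64{380, 410}}
	s.setObjective(cpu)
	fmt.Println("samples after switch:", len(s.samples))
}
```

The example replica handles few queries but each is expensive, which is the case where a CPU-based signal triggers a split that a QPS-based signal would miss.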
kvoli force-pushed the 230110.cpu-store-rebalancing branch from ded17a7 to 05cbe42 (January 27, 2023 17:47)
This series of commits introduces using CPU as a replacement for QPS in rebalancing and splitting.
The results when using CPU in comparison to QPS can be found here (internal).
CPU load based rebalancing and splitting is not enabled by default in this PR.
resolves: #95380
resolves: #95377