Implementing Spread Minimizing Token Generator #5855
Merged
What this PR does:
This PR implements the Spread Minimizing Token Generator algorithm, based on grafana/dskit#321 by GL (thank you), with a slightly different implementation.
The algorithm implemented in dskit creates tokens in "order", so all tokens must be created with that algorithm, and existing clusters require a reshard.
The proposal in this PR is slightly different: tokens are created based on the state of the ring at the moment an ingester registers, so we get similar benefits as soon as new ingesters join the ring, with no need to reshard all existing ingesters.
The algorithm is basically as follows:
When registering a new ingester, we first:
ED
) for each new token.ED
.ED
One edge case to be aware of: if multiple ingesters are scaled up at the same time, we could end up generating multiple tokens close together. To avoid this, when multiple ingesters are being scaled up simultaneously, we use the new strategy on only one of them. This seems to be working well so far, but if we encounter problems with this approach, we can force the ingesters to join the ring one by one.
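To make "creating tokens based on the current state of the ring" concrete, here is a minimal Go sketch. It assumes a simplified rule that is not taken from this PR's code: find the widest range between adjacent tokens and place the new token at its midpoint. `splitLargestGap` is a hypothetical helper, and the sketch ignores zones and per-ingester ownership.

```go
package main

import "sort"

// splitLargestGap is a hypothetical helper (not part of Cortex or dskit).
// Given the tokens currently in the ring, it returns a new token placed at
// the midpoint of the widest range between two adjacent tokens.
func splitLargestGap(tokens []uint32) uint32 {
	if len(tokens) == 0 {
		return 0 // empty ring: any position works, pick 0
	}

	// Work on a sorted copy so the caller's slice is left untouched.
	sorted := append([]uint32(nil), tokens...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })

	var bestStart uint32
	var bestWidth uint64
	for i, start := range sorted {
		end := sorted[(i+1)%len(sorted)]
		// uint32 subtraction wraps around, which also measures the
		// range that crosses from the last token back to the first.
		width := uint64(end - start)
		if len(sorted) == 1 {
			width = 1 << 32 // a single token owns the whole ring
		}
		if width > bestWidth {
			bestStart, bestWidth = start, width
		}
	}

	// The addition wraps in uint32 arithmetic, so the result stays on the ring.
	return bestStart + uint32(bestWidth/2)
}
```

Because the position depends only on the tokens present when the function is called, two ingesters calling it against the same ring snapshot would pick the same token, which is the kind of collision the edge case above describes.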
If we decide to merge this PR, I recommend leaving this feature as experimental for now and, therefore, hidden from the documentation.
Which issue(s) this PR fixes:
Fixes #
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]