Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rulers: Add support to persist tokens in rulers #5987

Merged

Conversation

rapphil
Copy link
Contributor

@rapphil rapphil commented May 30, 2024

What this PR does:

Allow rulers to reuse tokens across restarts. This is important for us because when you are running a fleet with a large numbers of rulers that are updated one at a time, you get a lot of churn if the tokens are not persistent across restarts: rule groups that were owned by a ruler might change ownership several times because every time they find a new owner, this new owner is also restarted. By preserving the tokens the churn is reduced because after the restart the rules will return to the original ruler that they belong to.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Raphael Silva <rapphil@gmail.com>
@rapphil rapphil force-pushed the rapphil-add-persistent-token-rulers branch from 0329b52 to c8fd360 Compare May 30, 2024 05:39
Signed-off-by: Raphael Silva <rapphil@gmail.com>
Copy link
Member

@friedrichg friedrichg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one minor nit

@@ -75,6 +76,7 @@ func (cfg *RingConfig) RegisterFlags(f *flag.FlagSet) {
f.DurationVar(&cfg.FinalSleep, "ruler.ring.final-sleep", 0*time.Second, "The sleep seconds when ruler is shutting down. Need to be close to or larger than KV Store information propagation delay")
f.IntVar(&cfg.ReplicationFactor, "ruler.ring.replication-factor", 1, "EXPERIMENTAL: The replication factor to use when loading rule groups for API HA.")
f.BoolVar(&cfg.ZoneAwarenessEnabled, "ruler.ring.zone-awareness-enabled", false, "EXPERIMENTAL: True to enable zone-awareness and load rule groups across different availability zones for API HA.")
f.StringVar(&cfg.TokensFilePath, "ruler.ring.tokens-file-path", "", "EXPERIMENTAL: File path where tokens are stored. If empty, tokens are not stored at shutdown and restored at startup.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the review.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why we need to mark this feature as experimental anymore. Thought the initial feature was added long time ago.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yeya24 are you referring to this new flag for rulers or something else? if the former, this feature does not exist in rulers and therefore should be experimental before stabilization?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For other components. I think it might be the same for Ruler.
But I am okay to mark this as experimental for now. We can clean it up later in #5922

Signed-off-by: Raphael Silva <rapphil@gmail.com>
@rapphil rapphil requested a review from friedrichg May 30, 2024 22:48
@friedrichg friedrichg dismissed their stale review May 31, 2024 11:19

done, but the lint still has to pass

Signed-off-by: Raphael Silva <rapphil@gmail.com>
@rapphil
Copy link
Contributor Author

rapphil commented May 31, 2024

@friedrichg done! thanks!

Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Just a nit about the changelog.

CHANGELOG.md Outdated Show resolved Hide resolved
rapphil added 2 commits May 31, 2024 18:26
Signed-off-by: Raphael Silva <rapphil@gmail.com>
Signed-off-by: Raphael Silva <rapphil@gmail.com>
@alanprot
Copy link
Member

LGTM as well.

@yeya24 yeya24 merged commit 7e07d24 into cortexproject:master May 31, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants