-
Notifications
You must be signed in to change notification settings - Fork 803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve GetRules performance #5805
Conversation
b77ed02
to
28bb394
Compare
pkg/ruler/manager.go
Outdated
@@ -184,6 +198,29 @@ func (r *DefaultMultiTenantManager) syncRulesToManager(ctx context.Context, user | |||
} | |||
} | |||
|
|||
func (r *DefaultMultiTenantManager) getRulesManager(user string, ctx context.Context) (RulesManager, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we only do one thing in this method to get manager from the map rather than creating a new one?
Create manager and make it up and running should be a separate function.
Can you please update the PR description and mention what this pr is trying to do? |
28bb394
to
d66adf6
Compare
d66adf6
to
28e46cc
Compare
…entirety of sync rules Signed-off-by: Anand Rajagopal <anrajag@amazon.com>
28e46cc
to
1161d31
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Signed-off-by: Anand Rajagopal <anrajag@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
…entirety of sync rules (cortexproject#5805) Signed-off-by: Anand Rajagopal <anrajag@amazon.com>
What this PR does:
Currently the manager's SyncRuleGroups and GetRules methods share the same lock. This means that if SyncRuleGroups becomes slow then GetRules will have to wait a long time to acquire the lock.
SyncRuleGroups can become slow when rule groups are updated with slow running rules because the rule group will wait for currently evaluating rule to finish before it stops
This PR fixes this problem by:
userManagerMtx
for it's intended purpose which is to protect theuserManagers
mapuseManagerMtx
to RWMutex so that exclusive lock is acquired only when a new manager is created or clean up occurs at the end ofSyncRuleGroups
. By doing this, the duration of exclusive lock is reduced and the scope is reducedsyncRuleMtx
to ensureSyncRuleGroups
is executed sequentiallyrules/manager.Update
is called. This is because each Prometheus rule manager can still be responsible for large number of rule groups and if many of those rule groups have slow running rules, then ListRules will be slow. The cached rule groups will be removed as soon as update is completeWhich issue(s) this PR fixes:
Fixes #5745
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]