Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove user from state key metric value #5453

Conversation

qinxx108
Copy link
Contributor

@qinxx108 qinxx108 commented Jul 13, 2023

What this PR does:
The metric helper will remove the userID label and aggregate the metric, however we have a key label value in the state replicator which cause a memory leak on alert manager since the key label value include the userId which is not getting removed. This PR will remove the userId from the key

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@qinxx108 qinxx108 force-pushed the remove-user-from-state-key-metric-value branch from 83dd411 to 0a9d0f4 Compare July 13, 2023 05:33
@@ -168,19 +168,19 @@ func newAlertmanagerMetrics() *alertmanagerMetrics {
partialMerges: prometheus.NewDesc(
"cortex_alertmanager_partial_state_merges_total",
"Number of times we have received a partial state to merge for a key.",
[]string{"user"}, nil),
[]string{"user", "key"}, nil),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will be the value of the key label? I am okay to add this as seems this is what we have in AM so I am not worried about cardinality issue. Just out of curiosity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with @qinxx108 offline, this won't increase cardinality.

Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@qinxx108 qinxx108 force-pushed the remove-user-from-state-key-metric-value branch from 0a9d0f4 to bdc940f Compare July 13, 2023 16:39
Copy link
Contributor

@alvinlin123 alvinlin123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline too, I think it would be cause less confusion to the reader of the metrics and code if we use the label name "type" instead of "key" for alertmanager state replication related metrics.

s.partialStateMergesFailed.WithLabelValues(key)
s.stateReplicationTotal.WithLabelValues(key)
s.stateReplicationFailed.WithLabelValues(key)
s.partialStateMergesTotal.WithLabelValues(getKeyWithoutUser(key))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to do keyWithoutUser = getKeyWithoutUser(key) then use the new variable to pass into WithLabelValues


// getKeyWithoutUser used for trim the userID out of the metric label value.
func getKeyWithoutUser(key string) string {
if strings.IndexByte(key, ':') < 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why call IndexByte twice? We can save the result in a variable right?

@qinxx108 qinxx108 force-pushed the remove-user-from-state-key-metric-value branch from bdc940f to 278f184 Compare July 13, 2023 17:52
alanprot and others added 7 commits July 13, 2023 16:25
* Implementing Bucket index sync status

Signed-off-by: Alan Protasio <alanprot@gmail.com>

* fixing bug when returning from cache

Signed-off-by: Alan Protasio <alanprot@gmail.com>

* Addressing some comments

Signed-off-by: Alan Protasio <alanprot@gmail.com>

* Changelog

Signed-off-by: Alan Protasio <alanprot@gmail.com>

---------

Signed-off-by: Alan Protasio <alanprot@gmail.com>
Signed-off-by: Yijie Qin <qinyijie@amazon.com>
Signed-off-by: Yijie Qin <qinyijie@amazon.com>
Signed-off-by: Yijie Qin <qinyijie@amazon.com>
Signed-off-by: Yijie Qin <qinyijie@amazon.com>
Signed-off-by: Yijie Qin <qinyijie@amazon.com>
Signed-off-by: Yijie Qin <qinyijie@amazon.com>
Signed-off-by: Yijie Qin <qinyijie@amazon.com>
@qinxx108 qinxx108 force-pushed the remove-user-from-state-key-metric-value branch from 25e9b6f to d6f1b06 Compare July 13, 2023 20:25
qinxx108 added 3 commits July 13, 2023 16:35
Signed-off-by: Yijie Qin <qinyijie@amazon.com>
Signed-off-by: Yijie Qin <qinyijie@amazon.com>
Signed-off-by: Yijie Qin <qinyijie@amazon.com>
@alvinlin123 alvinlin123 enabled auto-merge (squash) July 13, 2023 21:27
@alvinlin123 alvinlin123 merged commit b3cab52 into cortexproject:master Jul 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants