Add a limits config option to allow for dropping of the cluster label #1726
I find it easier to review PRs like this when you split the pure-refactoring part from the behaviour-changing part, into separate commits. Otherwise I have to figure out your intention on every line and that takes longer.
I believe the code works, but I found it confusing. Probably I don't understand the overall design.
Given that "ha-tracker.drop-cluster-label" doesn't seem to be tied to HA, could the feature be done as a general label-removal mechanism? This would give us a portion of #771.
docs/arguments.md
Outdated
@@ -144,11 +144,13 @@ prefix these flags with `distributor.ha-tracker.`

 ### HA Tracker

-HA tracking has two of it's own flags:
+HA tracking has three of it's own flags:
-HA tracking has three of it's own flags:
+HA tracking has three of its own flags:
-// Remove the replica label from a slice of LabelPairs if it exists.
-func removeReplicaLabel(replica string, labels *[]client.LabelAdapter) {
+// Remove the label labelname from a slice of LabelPairs if it exists.
+func removeLabel(labelName string, labels *[]client.LabelAdapter) {
This function is very similar to `labelPairs.removeBlanks()`, so they could be merged.
This one preserves order, which I don't think is necessary.
It might be a good idea to comment why this code does it.
I'll have a look at the `labelPairs` function in the morning.
`labelPairs` is nested within the `ingester` package. If we're going to merge the two functions, should that exist in a new package under `util`?
And the reason for preserving order is only that I've generally seen that be the case within Prometheus itself. For example, within remote write, when processing a sample through external labels and relabel rules, we preserve the sorted order of the labels.
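To make the ordering trade-off in this thread concrete, here is a minimal sketch of the two removal strategies. It assumes a local `LabelAdapter` struct as a hypothetical stand-in for `client.LabelAdapter`, and `removeLabelUnordered` is an illustrative name that does not appear in the PR. It is not the code under review and makes no claim about how `labelPairs.removeBlanks()` is implemented.

```go
package main

// LabelAdapter is a hypothetical stand-in for Cortex's client.LabelAdapter,
// reduced to the two fields this sketch needs.
type LabelAdapter struct {
	Name  string
	Value string
}

// removeLabel deletes the first label named labelName from *labels, if present.
// append shifts the tail left by one position, so the remaining labels keep
// their relative order (for example, the sorted order Prometheus maintains).
func removeLabel(labelName string, labels *[]LabelAdapter) {
	for i := 0; i < len(*labels); i++ {
		if (*labels)[i].Name == labelName {
			*labels = append((*labels)[:i], (*labels)[i+1:]...)
			return
		}
	}
}

// removeLabelUnordered (illustrative name) is the cheaper alternative: it
// overwrites the match with the last element and truncates the slice. This
// avoids shifting the tail but does not preserve label order.
func removeLabelUnordered(labelName string, labels *[]LabelAdapter) {
	l := *labels
	for i := range l {
		if l[i].Name == labelName {
			l[i] = l[len(l)-1]
			*labels = l[:len(l)-1]
			return
		}
	}
}
```

Both variants modify the slice in place through the pointer. The ordered version costs a copy of the tail per removal, which is likely negligible for the handful of labels on a typical series.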
Signed-off-by: Callum Styan <callumstyan@gmail.com>
testability. Signed-off-by: Callum Styan <callumstyan@gmail.com>
in Distributor. Signed-off-by: Callum Styan <callumstyan@gmail.com>
@bboreham thanks for the review! I've split out the refactoring into another commit. I also changed the
Given that it isn't tied to HA, could the feature be done as a general label-removal mechanism? This would give us a portion of #771.
I moved the cluster label dropping inside the check for
Given that it doesn't cost me anything, yes I would prefer a generic "for tenant X, everywhere you see label Y, drop it".
@cstyan Any updates with this PR? It's much appreciated, we're waiting on this to enable HA writing to Cortex.
getting close
specific. Signed-off-by: Callum Styan <callumstyan@gmail.com>
lgtm
pkg/distributor/distributor.go
Outdated
@@ -280,6 +281,48 @@ func (d *Distributor) checkSample(ctx context.Context, userID, cluster, replica
	return true, nil
}

// Validates a single series from a write request. Will remove HA labels if necessary.
// Takes a pointer for a partial error so that we can get partial errors, errors during validation
Can you update the comment to remove the reference to the pointer?
pkg/util/validation/limits.go
Outdated
// HAClusterLabel returns the cluster label to look for when deciding whether to accept a sample from a Prometheus HA replica.
func (o *Overrides) HAClusterLabel(userID string) string {
	return o.overridesManager.GetLimits(userID).(*Limits).HAClusterLabel
// DropLabels returns whether the cluster label should be dropped when ingesting HA samples for the user.
Returns the list of labels to be dropped?
pkg/distributor/distributor.go
Outdated
if removeReplica {
	removeLabel(d.limits.HAReplicaLabel(userID), &ts.Labels)
}
for _, s := range d.limits.DropLabels(userID) {
`s` is a weird variable name in this instance, maybe call it `label`/`labelName`?
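For illustration, the loop under review could look roughly like this with the suggested `labelName` variable. This is only a sketch under stated assumptions: the `limits` map type, the local `LabelAdapter` struct, and the `dropConfiguredLabels` helper are hypothetical stand-ins for `d.limits`, `client.LabelAdapter`, and the surrounding distributor code. They are not Cortex's actual API.

```go
package main

// LabelAdapter is a hypothetical stand-in for Cortex's client.LabelAdapter.
type LabelAdapter struct {
	Name  string
	Value string
}

// limits is a hypothetical stand-in for the per-tenant overrides: it maps a
// user ID to the label names that should be dropped for that user.
type limits map[string][]string

// DropLabels returns the list of label names to drop for userID
// (nil when nothing is configured for that user).
func (l limits) DropLabels(userID string) []string {
	return l[userID]
}

// removeLabel deletes the first label named labelName, preserving the order
// of the remaining labels.
func removeLabel(labelName string, labels *[]LabelAdapter) {
	for i := 0; i < len(*labels); i++ {
		if (*labels)[i].Name == labelName {
			*labels = append((*labels)[:i], (*labels)[i+1:]...)
			return
		}
	}
}

// dropConfiguredLabels (hypothetical helper) applies the generic rule
// "for tenant X, everywhere you see label Y, drop it" to one series.
func dropConfiguredLabels(l limits, userID string, labels *[]LabelAdapter) {
	for _, labelName := range l.DropLabels(userID) {
		removeLabel(labelName, labels)
	}
}
```

Because the drop list is looked up per user, a tenant with no configured entry gets a nil slice and its series pass through unchanged.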
empty labels after dropping labels Signed-off-by: Callum Styan <callumstyan@gmail.com>
Thank you @cstyan!
Fixes #1724
Adds an optional flag per user to drop the cluster label, in addition to dropping the replica label, when ingesting HA samples. Should only be used if that user has their own set of more than two labels that they use to uniquely identify Prometheus replicas.
There's a bit of a refactor here as well: `distributor.Push` was getting pretty long, so I moved the internals of the validation loop into a function, `validateSeries`. This also had the side effect of making that portion of the code testable. I've only added tests for the label removal bits. I assume the actual validation functions `ValidateLabels` and `ValidateSamples` are tested already.
Signed-off-by: Callum Styan <callumstyan@gmail.com>
cc @tomwilkie @gouthamve