-
Notifications
You must be signed in to change notification settings - Fork 803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Prometheus rule query offset #6085
Add support for Prometheus rule query offset #6085
Conversation
6302760
to
36bfb13
Compare
There is already a per tenant limit called |
Yes that will be a duplicate after this PR. Can I simply remove the usages of that limit and mention it as deprecated in the changelog? |
Yes I think that works. Please also mark the flag as deprecated. You can find some examples in the codebase. |
pkg/ruler/rulespb/rules.proto
Outdated
google.protobuf.Duration queryOffset = 11 | ||
[(gogoproto.nullable) = false, (gogoproto.stdduration) = true]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to use this branch to set a global rule query offset, and that setting isn't being honored in my local setup. In my cortex config file, I'm using:
limits:
ruler_query_offset: 1m
I think the issue is that this queryOffset
field needs to be nullable. Since it's non-nullable, it's being initialized asDuration(0)
in compat.go
, and the ruler only falls back to the global offset if the field is actually nil, which you can see here:
This change seems to fix it:
diff --git a/pkg/ruler/ruler.go b/pkg/ruler/ruler.go
index d53f78506..8fec45acb 100644
--- a/pkg/ruler/ruler.go
+++ b/pkg/ruler/ruler.go
@@ -910,6 +910,10 @@ func (r *Ruler) getLocalRules(userID string, rulesRequest RulesRequest, includeB
}
}
interval := group.Interval()
+ var queryOffset *time.Duration
+ if offset := group.QueryOffset(); offset != 0 {
+ queryOffset = &offset
+ }
groupDesc := &GroupStateDesc{
Group: &rulespb.RuleGroupDesc{
@@ -918,7 +922,7 @@ func (r *Ruler) getLocalRules(userID string, rulesRequest RulesRequest, includeB
Interval: interval,
User: userID,
Limit: int64(group.Limit()),
- QueryOffset: group.QueryOffset(),
+ QueryOffset: queryOffset,
},
EvaluationTimestamp: group.GetLastEvaluation(),
diff --git a/pkg/ruler/rulespb/compat.go b/pkg/ruler/rulespb/compat.go
index 4f916d09c..7526062ad 100644
--- a/pkg/ruler/rulespb/compat.go
+++ b/pkg/ruler/rulespb/compat.go
@@ -13,9 +13,10 @@ import (
// ToProto transforms a formatted prometheus rulegroup to a rule group protobuf
func ToProto(user string, namespace string, rl rulefmt.RuleGroup) *RuleGroupDesc {
- queryOffset := time.Duration(0)
+ var queryOffset *time.Duration
if rl.QueryOffset != nil {
- queryOffset = time.Duration(*rl.QueryOffset)
+ offset := time.Duration(*rl.QueryOffset)
+ queryOffset = &offset
}
rg := RuleGroupDesc{
Name: rl.Name,
@@ -48,13 +49,17 @@ func formattedRuleToProto(rls []rulefmt.RuleNode) []*RuleDesc {
// FromProto generates a rulefmt RuleGroup
func FromProto(rg *RuleGroupDesc) rulefmt.RuleGroup {
- queryOffset := model.Duration(rg.QueryOffset)
+ var queryOffset *model.Duration
+ if rg.QueryOffset != nil {
+ offset := model.Duration(*rg.QueryOffset)
+ queryOffset = &offset
+ }
formattedRuleGroup := rulefmt.RuleGroup{
Name: rg.GetName(),
Interval: model.Duration(rg.Interval),
Rules: make([]rulefmt.RuleNode, len(rg.GetRules())),
Limit: int(rg.Limit),
- QueryOffset: &queryOffset,
+ QueryOffset: queryOffset,
}
for i, rl := range rg.GetRules() {
diff --git a/pkg/ruler/rulespb/rules.proto b/pkg/ruler/rulespb/rules.proto
index 189c050eb..ed4e98a76 100644
--- a/pkg/ruler/rulespb/rules.proto
+++ b/pkg/ruler/rulespb/rules.proto
@@ -29,7 +29,7 @@ message RuleGroupDesc {
repeated google.protobuf.Any options = 9;
int64 limit =10;
google.protobuf.Duration queryOffset = 11
- [(gogoproto.nullable) = false, (gogoproto.stdduration) = true];
+ [(gogoproto.nullable) = true, (gogoproto.stdduration) = true];
}
// RuleDesc is a proto representation of a Prometheus Rule
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a related discussion of this in the original prometheus PR, here: prometheus/prometheus#14061 (comment)
Signed-off-by: Mustafain Ali Khan <mustalik@amazon.com>
36bfb13
to
820170a
Compare
Signed-off-by: Mustafain Ali Khan <mustalik@amazon.com>
820170a
to
feba91a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. This LGTM!
LGTM! |
@mustafain117 Let's make sure that we create a follow up PR to deprecate the old configuration. |
Hey folks, thanks for putting this change together. I'm still seeing the issue that I mentioned here #6085 (comment) on the version of this PR that merged to master. As far as I can tell, all rules are being initialized with a hardcoded query offset of 0, and that prevents them from falling back to the global rule query offset. I can dig into it and try to submit a follow-up PR, but at least wanted to give you all a heads up. |
This is a follow-up to cortexproject#6085, which added support for setting the `queryOffset` field on individual recording rules, as well as a global per-tenant ruler query offset that should be used when no individual rule offset is set. It turns out that compatability code to convert from a protobuf RuleGroup to a prometheus RuleGroup that's consumed by the ruler was coercing null value query offsets to explicit 0s, which meant that no rules would ever fallback to the global-offset. This PR fixes that issue, and cleans up handling of the query offset in a few other ruler files. Signed-off-by: Kevin Ingelman <ki@buoyant.io>
This is a follow-up to cortexproject#6085, which added support for setting the `query_offset` field on individual recording rule groups, as well as a per-tenant `ruler_query_offset` limit that should be used when no individual recording rule group offset is set. It turns out that compatibility code to convert from a protobuf RuleGroup to a prometheus RuleGroup was coercing null-value query offsets to explicit 0s, which meant that no rule groups would ever fall back to the per-tenant offset. This PR fixes that issue, and it cleans up handling of the query offset in a few other ruler files. Signed-off-by: Kevin Ingelman <ki@buoyant.io>
I created #6131 with a fix. |
* Fix rule group query offset issue This is a follow-up to #6085, which added support for setting the `query_offset` field on individual recording rule groups, as well as a per-tenant `ruler_query_offset` limit that should be used when no individual recording rule group offset is set. It turns out that compatibility code to convert from a protobuf RuleGroup to a prometheus RuleGroup was coercing null-value query offsets to explicit 0s, which meant that no rule groups would ever fall back to the per-tenant offset. This PR fixes that issue, and it cleans up handling of the query offset in a few other ruler files. Signed-off-by: Kevin Ingelman <ki@buoyant.io> * Revert change to rules API response Signed-off-by: Kevin Ingelman <ki@buoyant.io> --------- Signed-off-by: Kevin Ingelman <ki@buoyant.io>
What this PR does:
This PR adds support for
query_offset
field on RuleGroup and globalrule_query_offset
configuration in Cortex Ruler.This feature was added in Prometheus 2.53.0
Which issue(s) this PR fixes:
Fixes #6055
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]