-
Notifications
You must be signed in to change notification settings - Fork 810
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix query offset issue on individual rule groups #6131
Conversation
This is a follow-up to cortexproject#6085, which added support for setting the `query_offset` field on individual recording rule groups, as well as a per-tenant `ruler_query_offset` limit that should be used when no individual recording rule group offset is set. It turns out that compatibility code to convert from a protobuf RuleGroup to a prometheus RuleGroup was coercing null-value query offsets to explicit 0s, which meant that no rule groups would ever fall back to the per-tenant offset. This PR fixes that issue, and it cleans up handling of the query offset in a few other ruler files. Signed-off-by: Kevin Ingelman <ki@buoyant.io>
Thanks for testing this change actively and the PR @klingerf. I think we shouldn't add https://github.com/prometheus/prometheus/blob/main/web/api/v1/api.go#L1347 Good catch of the |
Signed-off-by: Kevin Ingelman <ki@buoyant.io>
@yeya24 Great, thanks for reviewing. I've backed out the API response change. |
QueryOffset: group.QueryOffset, | ||
}, | ||
// We are keeping default value for EvaluationTimestamp and EvaluationDuration since the backup is not evaluating |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yeya24 I also wasn't sure about this change, but added the new field for consistency. That said, the comment below this says:
We are keeping default value for EvaluationTimestamp and EvaluationDuration since the backup is not evaluating
So maybe it deserves to be treated like EvaluationTimestamp and EvaluationDuration for the purpose of backups? If that's the case I can revert this change and update the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, the backup stores the rule inmemory to serve List rules API requests only. Since query offset is not part of the response then we probably don't need to store that. I am also fine to store it incase it is used in the future.
@rajagopalanand or @rapphil please correct me if I was wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we also use the same path to reply to rpc requests which will export the entire obj. This would add it to the rpc response.
https://github.com/cortexproject/cortex/blob/master/pkg/ruler/ruler.go#L1230
I am not sure how well used is the rpc api, but maybe we want to maintain both the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the additional info. So it sounds like this change is fine to leave as is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we also use the same path to reply to rpc requests which will export the entire obj
This RPC seems only being used when listing rules, which doesn't need this field.
But let's leave it as is. There is another field used as response https://github.com/cortexproject/cortex/blob/master/pkg/ruler/api.go#L185 so we are good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for the contribution!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
This is a follow-up to #6085, which added support for setting the
query_offset
field on individual recording rule groups, as well as a per-tenantruler_query_offset
limit that should be used when no individual recording rule group offset is set.It turns out that compatibility code to convert from a protobuf RuleGroup to a prometheus RuleGroup was coercing null-value query offsets to explicit 0s, which meant that no rule groups would ever fall back to the per-tenant offset.
This PR fixes that issue, and it cleans up handling of the query offset in a few other ruler files.
Testing
To test this locally, I started cortex with a 1m per-tenant
ruler_query_offset
limit:Then I created some recording rules that did not have their
query_offset
field set:By contrast, for the version of cortex that's on master right now, with the same setup, I see:
Having
query_offset
set in that API response means that the ruler never falls back to the per-tenant offset, as intended.Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]