Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix support for keep_firing_for field in alert rules #5823

Merged
merged 3 commits into from
Mar 23, 2024

Conversation

mustafain117
Copy link
Contributor

What this PR does:
This field was added in prometheus prometheus/prometheus#11827
And was pulled into Cortex 1.15.0 but currently does not work, this PR fixes this.
Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • [N/A] Documentation added
  • [N/A] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Mustafain Ali Khan <mustalik@amazon.com>
@mustafain117 mustafain117 marked this pull request as ready for review March 20, 2024 20:58
Copy link
Contributor

@emanlodovice emanlodovice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this is nice catch.

@mustafain117 I wonder if we can also update the integration test to make sure this feature really works? https://github.com/cortexproject/cortex/blob/master/integration/ruler_test.go
Unit tests probably not the best place to capture that.

pkg/ruler/rulespb/compat_test.go Show resolved Hide resolved
@pull-request-size pull-request-size bot added size/L and removed size/M labels Mar 21, 2024
pkg/ruler/api.go Outdated
Annotations labels.Labels `json:"annotations"`
State string `json:"state"`
ActiveAt *time.Time `json:"activeAt"`
KeepFiringSince *time.Time `json:"keepFiringSince"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To match prometheus I think this needs to be json:"keepFiringSince,omitempty

Copy link
Contributor Author

@mustafain117 mustafain117 Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks for catching this. Updated.

I also verified the API response matches prometheus.

keepFiringSince is only included in the response when expected:

Response when alert is not resolved/active (keep firing for value not used):

"groups": [
 {
        "name": "keep-firing2",
        "file": "rg-keep-firing",
        "rules": [
          {
            "state": "firing",
            "name": "alerting_rule_keepfiring60mins2",
            "query": "absent(recording_rule)",
            "duration": 0,
            "keepFiringFor": 3600,
            "labels": {},
            "annotations": {},
            "alerts": [
              {
                "labels": {
                  "alertname": "alerting_rule_keepfiring60mins2"
                },
                "annotations": {},
                "state": "firing",
                "activeAt": "2024-03-21T08:22:03.514363333Z",
                "value": "1e+00"
              }
            ],
            "health": "ok",
            "lastError": "",
            "type": "alerting",
            "lastEvaluation": "2024-03-21T08:26:03.515540067Z",
            "evaluationTime": 0.002836834
          },
      ...] 

Response when alert expression is resolved but keep firing for value is being used:

"groups": [
      {
        "name": "keep-firing2",
        "file": "rg-keep-firing",
        "rules": [
          {
            "state": "firing",
            "name": "alerting_rule_keepfiring60mins2",
            "query": "absent(recording_rule)",
            "duration": 0,
            "keepFiringFor": 3600,
            "labels": {},
            "annotations": {},
            "alerts": [
              {
                "labels": {
                  "alertname": "alerting_rule_keepfiring60mins2"
                },
                "annotations": {},
                "state": "firing",
                "activeAt": "2024-03-21T08:22:03.514363333Z",
                "keepFiringSince": "2024-03-21T08:27:03.514363333Z",
                "value": "1e+00"
              }
            ],
            "health": "ok",
            "lastError": "",
            "type": "alerting",
            "lastEvaluation": "2024-03-21T08:29:03.515574705Z",
            "evaluationTime": 0.002663176
          },
          ...]

Signed-off-by: Mustafain Ali Khan <mustalik@amazon.com>

ruleDesc := desc.Rules[0]

assert.Equal(t, "test_rule", ruleDesc.Alert)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also verify the keep firing for in alerts field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an integration test that validates the KeepFiringSince field in alerts

@mustafain117
Copy link
Contributor Author

Thanks this is nice catch.

@mustafain117 I wonder if we can also update the integration test to make sure this feature really works? https://github.com/cortexproject/cortex/blob/master/integration/ruler_test.go Unit tests probably not the best place to capture that.

Added an integration test for this feature

Signed-off-by: Mustafain Ali Khan <mustalik@amazon.com>
Copy link
Contributor

@emanlodovice emanlodovice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mustafain117 mustafain117 requested a review from yeya24 March 22, 2024 20:01
Value string `json:"value"`
}

func alertRuleWithKeepFiringFor(groupName string, ruleName string, expression string, keepFiring model.Duration) rulefmt.RuleGroup {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to see we don't duplicate code for this since it is almost the same functionality for ruleGroupWithRule.
But it is not a blocker. We can address later to use builder pattern.

Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work.

@yeya24 yeya24 merged commit 5a7ba2f into cortexproject:master Mar 23, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants