[Alerting] Gracefully restore failed rules from pre-7.11 #117593
Comments
Pinging @elastic/kibana-alerting-services (Team:Alerting Services)
This currently happens at the end of the rule execution here. By returning a
++ I think this will be sufficient to make the rule run again and update the task's
After speaking with @kobelb, it feels better to fix this if the change turns out to be small. It can be common for pre-7.11 rules to fail continuously, and people have gotten into the habit of disabling and re-enabling them as a fix. Adding to 8.1/8.2 plans.
In this issue we have determined that for rules created prior to 7.11, the associated task manager document does not contain the `schedule` field:

Example pre-7.11 rule task doc

When rules are running normally and Kibana is upgraded to 7.11+, the task manager doc is updated with the `schedule` field after the next normal execution.

However, if a rule has reached its `maxAttempts` value of `3` when Kibana is upgraded to 7.11+, the task manager `updateByQuery` script will mark the rule as `failed` because it has no `schedule` and the number of attempts has reached the limit. We want to make sure these rules continue running, so we propose to do 2 things to mitigate:

1. For tasks where the `schedule` field is missing, reset `attempts` to `0` and `status` to `idle`. This should ensure that task manager can start claiming these tasks again.
2. Check whether the task document has a `schedule` field. If it does not, update the task document to include it. This should ensure that the alerting rule task will not reach this state again.
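The two proposed mitigations can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Kibana implementation: the `TaskDoc` shape, the function names `restoreFailedTask` and `ensureSchedule`, the `{ interval }` layout of `schedule`, and the choice to only reset tasks whose status is `failed` are all hypothetical.

```typescript
// Hypothetical, simplified shape of a task manager document.
// This is NOT the real Kibana type; it only carries the fields discussed above.
interface TaskDoc {
  id: string;
  taskType: string;
  status: "idle" | "claiming" | "running" | "failed";
  attempts: number;
  // Assumed layout; absent on task docs for rules created before 7.11.
  schedule?: { interval: string };
}

// Mitigation 1 (sketch): a failed task with no schedule gets its attempts
// reset to 0 and its status set back to idle, so it can be claimed again.
// Restricting this to status === "failed" is an assumption of this sketch.
function restoreFailedTask(task: TaskDoc): TaskDoc {
  if (task.schedule === undefined && task.status === "failed") {
    return { ...task, attempts: 0, status: "idle" };
  }
  return task;
}

// Mitigation 2 (sketch): if the task doc has no schedule, backfill it from
// the rule's interval so the task does not end up in this state again.
function ensureSchedule(task: TaskDoc, ruleInterval: string): TaskDoc {
  if (task.schedule !== undefined) {
    return task;
  }
  return { ...task, schedule: { interval: ruleInterval } };
}
```

For example, a pre-7.11 task doc with `status: "failed"` and `attempts: 3` would come out of `restoreFailedTask` as `idle` with `attempts: 0`, and `ensureSchedule(task, "1m")` would then attach the missing `schedule`. Both functions return a new object instead of mutating the input, which keeps the sketch easy to test in isolation.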