[Alerting] Add telemetry to count number of `failed` or `unrecognized` rule tasks
#122985
Labels
Feature:Alerting/RulesFramework
Issues related to the Alerting Rules Framework
Team:ResponseOps
Label for the ResponseOps team (formerly the Cases and Alerting teams)
As part of #119650 we've identified the current scenarios where a rule task might become `failed` or `unrecognized` (causing the rule to silently stop being claimed) and determined that these cases are very limited. However, we should add telemetry to track the counts of rule tasks in these states so that we can proactively identify when a bug or regression causes these states to occur more frequently.
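
As a rough illustration of the kind of count this telemetry could report, here is a minimal sketch. It assumes task documents have already been fetched into memory and that rule tasks can be recognized by an `alerting:` task type prefix. The `TaskDoc` shape, the prefix constant, and the function name are hypothetical and not taken from the actual codebase. A real collector would more likely run an aggregation against the task index than iterate over documents.

```typescript
// Hypothetical, simplified task document shape (illustration only).
type TaskStatus = "idle" | "claiming" | "running" | "failed" | "unrecognized";

interface TaskDoc {
  taskType: string;
  status: TaskStatus;
}

interface StateCounts {
  failed: number;
  unrecognized: number;
}

interface RuleTaskTelemetry extends StateCounts {
  // Per-task-type breakdown, to help spot which rule type regressed.
  byTaskType: Record<string, StateCounts>;
}

// Assumption: rule tasks are identified by this task type prefix.
const RULE_TASK_PREFIX = "alerting:";

function countFailedAndUnrecognizedRuleTasks(tasks: TaskDoc[]): RuleTaskTelemetry {
  const telemetry: RuleTaskTelemetry = { failed: 0, unrecognized: 0, byTaskType: {} };

  for (const task of tasks) {
    // Skip tasks that do not belong to rules.
    if (!task.taskType.startsWith(RULE_TASK_PREFIX)) continue;

    // Only the two states that cause a rule to silently stop being claimed.
    const status = task.status;
    if (status !== "failed" && status !== "unrecognized") continue;

    telemetry[status] += 1;

    const perType = telemetry.byTaskType[task.taskType] ?? { failed: 0, unrecognized: 0 };
    perType[status] += 1;
    telemetry.byTaskType[task.taskType] = perType;
  }

  return telemetry;
}

// Made-up sample data, for illustration only.
const sampleTasks: TaskDoc[] = [
  { taskType: "alerting:example-rule-a", status: "failed" },
  { taskType: "alerting:example-rule-a", status: "idle" },
  { taskType: "alerting:example-rule-b", status: "unrecognized" },
  { taskType: "actions:example-connector", status: "failed" },
];

const sampleTelemetry = countFailedAndUnrecognizedRuleTasks(sampleTasks);
console.log(JSON.stringify(sampleTelemetry));
```

Keeping a per-task-type breakdown alongside the totals is a design choice in this sketch: a sudden rise in one rule type would point more directly at the regression than the overall count alone.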