[Monitoring] Re-evaluate how `alertInstanceId` is used in rules
#109151
Labels: bug, Team:Monitoring
When we first created Kibana rules in stack monitoring, we were under the assumption that we could create unique `alertInstanceIds` to maintain separate throttle periods for a single alert firing under separate circumstances (such as a disk usage alert firing on unique throttle periods based on the list of nodes affected). This does not work well with the fact that `alertInstanceIds` that do not schedule actions are forgotten.

Recent work changed how these work and actually helped address this by ensuring we always create the same `alertInstanceIds` each time the rule runs, but it looks like we might have issues if we aren't always scheduling actions for these (which looks to be the case).

I'm opening this issue because this wasn't understood when these rules were first created, and folks on the @elastic/stack-monitoring team might want to reconsider how these rules are designed as a result. Feel free to close if this is now understood and handled appropriately.
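To make the concern concrete, here is a minimal toy model of the behavior described above. It is not the Kibana alerting framework's API: `runRule`, `instanceIdFor`, the state shape, and the 10-minute throttle are all hypothetical, and the only behavior it assumes is the one stated in this issue, namely that an instance id which does not schedule actions in a run is forgotten.

```typescript
// Toy model only. Every name here is hypothetical; this is not Kibana code.
type InstanceState = { lastNotifiedAt: number };
type RuleState = Map<string, InstanceState>;

// Hypothetical id scheme: one instance id per distinct set of affected nodes.
function instanceIdFor(nodes: string[]): string {
  return [...nodes].sort().join(",");
}

// Runs the rule once. Only ids present in `firingIds` survive into the next
// state, mimicking "instances that do not schedule actions are forgotten".
function runRule(
  prev: RuleState,
  firingIds: string[],
  now: number,
  throttleMs: number
): { state: RuleState; notified: string[] } {
  const state: RuleState = new Map();
  const notified: string[] = [];
  for (const id of firingIds) {
    const seen = prev.get(id);
    if (seen !== undefined && now - seen.lastNotifiedAt < throttleMs) {
      // Still inside the throttle period: keep the state, skip the notification.
      state.set(id, seen);
    } else {
      state.set(id, { lastNotifiedAt: now });
      notified.push(id);
    }
  }
  return { state, notified };
}

const THROTTLE_MS = 10 * 60 * 1000; // assumed 10-minute throttle
const MINUTE = 60 * 1000;

// Minute 0: only node-a is affected, so a notification is sent.
const first = runRule(new Map(), [instanceIdFor(["node-a"])], 0, THROTTLE_MS);

// Minute 1: node-b joins, which yields a new id; the "node-a" id is forgotten.
const second = runRule(
  first.state,
  [instanceIdFor(["node-a", "node-b"])],
  1 * MINUTE,
  THROTTLE_MS
);

// Minute 2: back to node-a only. Its throttle state is gone, so it notifies
// again even though only 2 of the 10 throttle minutes have passed.
const third = runRule(second.state, [instanceIdFor(["node-a"])], 2 * MINUTE, THROTTLE_MS);

// Contrast: an id that fires on every run keeps its throttle state.
const stable = runRule(first.state, [instanceIdFor(["node-a"])], 1 * MINUTE, THROTTLE_MS);

console.log(first.notified, second.notified, third.notified, stable.notified);
```

In this sketch the throttle only holds for ids that appear on every run; as soon as an id skips a run, its throttle history is lost, which is the situation this issue asks the team to re-evaluate.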