Alert instances view to display data from event log instead of current instances #57446

mikecote · 2020-02-12T13:31:10Z

We currently display the current alert instances now that #56842 got merged. This issue is the following step which would be to display from history with a start and stop column showing the duration of each instance.

elasticmachine · 2020-02-12T13:31:12Z

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

pmuellr · 2020-04-16T16:27:00Z

Seems like a nice "extension" to the current UI would be to show all the alert instances seen over some interval like a day (fixed for now, customizable later) and their most recent start/stop/durations.

Here's the current view:

Changes would be that you would see more instances - every instance that scheduled actions within the last 24 hours. And we'd start seeing inactive instances, not just active ones.

This will likely end up making us add some filtering/sorting - you might want to sort by instance/status/start/duration, and you might want to filter by instance and status, as a separate issue/PR ...

We'll have a semantic issue if we find an instance that was resolved within the time period we query over, but not the start of that instance. For these we're back into an "unknown" state, though we do know the minimum amount of time it has been active (the resolved time - the start time of the query we run) - duration could be something like "at least two hours" or "over two hours", kinda thing.
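To make the "at least two hours" idea concrete, here is a minimal sketch of how a lower-bound duration could be computed and labelled. Everything in it is an assumption for illustration: `InstanceSpan`, `estimateDuration` and `formatDuration` are hypothetical names, not part of the event log or the alerting plugin.

```typescript
// Hypothetical shape for illustration only; not the event log's actual schema.
interface InstanceSpan {
  instanceId: string;
  start?: Date; // undefined when the start event fell outside the query window
  resolved?: Date; // undefined while the instance is still active
}

interface DurationEstimate {
  ms: number;
  isLowerBound: boolean; // true when the real start time is unknown
}

// When the start was not seen, fall back to the start of the query window,
// which yields the minimum time the instance must have been active.
function estimateDuration(span: InstanceSpan, queryStart: Date, now: Date): DurationEstimate {
  const begin = span.start !== undefined ? span.start : queryStart;
  const end = span.resolved !== undefined ? span.resolved : now;
  return {
    ms: Math.max(0, end.getTime() - begin.getTime()),
    isLowerBound: span.start === undefined,
  };
}

// Render the estimate, prefixing "at least" when it is only a lower bound.
function formatDuration(estimate: DurationEstimate): string {
  const totalMinutes = Math.floor(estimate.ms / 60000);
  const hours = Math.floor(totalMinutes / 60);
  const text = hours > 0 ? `${hours}h ${totalMinutes % 60}m` : `${totalMinutes}m`;
  return estimate.isLowerBound ? `at least ${text}` : text;
}
```

Keeping `isLowerBound` separate from the number would let the UI decide on the wording ("at least", "over", or an icon) without re-deriving it.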

mikecote · 2020-05-09T15:04:05Z

@mdefazio to provide latest mockup of what the alert instances look like in the alert details page once data is pulled from event log.

pmuellr · 2020-05-11T16:22:48Z

I'd be happy to pair with someone on this; I'm very familiar with the event log :-)

gmmorris · 2020-05-13T10:07:23Z

I took a quick look at the current code to get a mental model of the pieces we need to put together:

We currently have a component called AlertInstancesRoute that wraps the AlertInstances and has one job - which is to load the state of the alert and pass it into the AlertInstances when it's mounted. I think it would make sense to start by adding an event-log api to with_bulk_alert_api_operations.
Once we have a new api in with_bulk_alert_api_operations we can use it in AlertInstancesRoute to fetch the default events and pass a function through to AlertInstances that will allow us to refresh the events that are passed to the AlertInstances whenever the filtering props are changed.

Hopefully by then we'll have a fresh design and we can change the table in AlertInstances to render the events rather than the state.

One problem we need to address is that we would, presumably, calculate the duration by looking back along the api response at how long an instance persists across execution cycles. The problem here is that if the time window specified by the user dictates how many events we fetch from the API then we can only ever evaluate duration as far back as that time window. This means that if, for example, the user specifies they want to see a time frame of "last 15 minutes" then the longest duration any single instance could have is 15 minutes, even if it has been going off for an hour.
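As a rough sketch of that look-back, assuming the API returns per-instance events inside the requested window: the event shape and the action names (`new-instance`, `active-instance`, `resolved-instance`) below are assumptions made for this example, and `summarizeInstances` is a hypothetical helper, not existing code.

```typescript
// Hypothetical event shape; the real event log documents may differ.
type InstanceAction = "new-instance" | "active-instance" | "resolved-instance";

interface InstanceEvent {
  instanceId: string;
  action: InstanceAction;
  timestamp: number; // epoch milliseconds
}

interface InstanceSummary {
  instanceId: string;
  status: "active" | "inactive";
  start: number; // observed start, or the window start when truncated
  end?: number; // set once the instance is resolved
  startsBeforeWindow: boolean; // true when no start event was seen in the window
}

// Walk the events in time order and keep the most recent occurrence per instance.
function summarizeInstances(events: InstanceEvent[], windowStart: number): InstanceSummary[] {
  const sorted = [...events].sort((a, b) => a.timestamp - b.timestamp);
  const summaries = new Map<string, InstanceSummary>();
  for (const event of sorted) {
    if (event.action === "new-instance") {
      // A start event inside the window: the start time is known exactly.
      summaries.set(event.instanceId, {
        instanceId: event.instanceId,
        status: "active",
        start: event.timestamp,
        startsBeforeWindow: false,
      });
      continue;
    }
    // Active or resolved with no start seen: the instance reaches back to
    // (and possibly beyond) the edge of the window.
    const existing = summaries.get(event.instanceId);
    const current: InstanceSummary =
      existing !== undefined
        ? existing
        : {
            instanceId: event.instanceId,
            status: "active",
            start: windowStart,
            startsBeforeWindow: true,
          };
    if (event.action === "resolved-instance") {
      current.status = "inactive";
      current.end = event.timestamp;
    }
    summaries.set(event.instanceId, current);
  }
  return Array.from(summaries.values());
}
```

With a "last 15 minutes" window, an instance that has been firing for an hour would come back with `startsBeforeWindow: true` and `start` equal to the window start, so the table can flag the duration as truncated rather than presenting 15 minutes as exact.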

@mdefazio We need to think about how we might want to express that a specific instance runs all the way back to the edge of the time frame... possibly exceeding it. 🤔

mdefazio · 2020-05-13T12:51:08Z

Sorry I'm a bit late on this. Here's the current-ish mockup. (Not showing Andrea's updates to the top section). Let me know what makes sense to show in the table and we can update the mockup.


arisonl · 2020-06-23T19:54:23Z

Should the status values correspond to the states offered by the alert (e.g. ok, warning, minor, major etc.) rather than active/inactive?
What does Duration show if the selected period from the time picker includes multiple occurrences (start-end) of the same instance (e.g. long selected period)? Are we only showing the last one? Conversely, what does Duration show if the period includes only a partial occurrence of an instance (e.g. short selected period)? Should we have an End field next to Start as well?

arisonl · 2020-06-24T10:07:36Z

Notes on the chart: #56280 (comment)

resolves elastic#57446

Adds a new API (AlertClient and HTTP endpoint) `getAlertStatus()` which returns data calculated from the event log.

The data returned in this PR is fairly minimal - just enough to replace the current instance details view data. In the future, we can add the following sorts of things:

- alert execution errors
- counts of alert execution
- sequences of active instance time spans
- if we can get the alert SO into the action execution, we could also provide action execution errors


mikecote added the Feature:Alerting and Team:ResponseOps labels Feb 12, 2020

mikecote changed the title ~~Alert instances view to display data from history instead of current instances~~ Alert instances view to display data from event log instead of current instances Apr 16, 2020

pmuellr mentioned this issue Apr 22, 2020

[Metrics Alerts][discuss] Alert History #58295

Closed

pmuellr self-assigned this May 12, 2020

pmuellr mentioned this issue May 13, 2020

[Alerting] use event log to populate alert details view #66351

Closed

7 tasks

pmuellr added a commit to pmuellr/kibana that referenced this issue May 26, 2020

[Alerting] use event log to populate alert details view

ab7f0eb

resolves elastic#57446

This was referenced Jun 5, 2020

[EventLog] Populate alert instances view with event log data #68437

Merged

[EventLog] make use of EQL in event log query #68641

Closed

mikecote mentioned this issue Aug 11, 2020

Alerting GA #74788

Closed

36 tasks

pmuellr closed this as completed in #68437 Aug 14, 2020

pmuellr added a commit that referenced this issue Aug 14, 2020

[EventLog] Populate alert instances view with event log data (#68437)

67e28ac

resolves #57446 Adds a new API (AlertClient and HTTP endpoint) `getAlertStatus()` which returns alert data calculated from the event log.

stacey-gammon added the ReleaseStatus label Sep 17, 2020

kobelb added the needs-team label Jan 31, 2022

botelastic bot removed the needs-team label Jan 31, 2022
