From eb2f71886f2d7800f6fd19ea013a5553d39ff33f Mon Sep 17 00:00:00 2001
From: Ravi Kesarwani <64450378+ravikesarwani@users.noreply.github.com>
Date: Mon, 2 Aug 2021 12:41:45 -0400
Subject: [PATCH] Update SM doc for alert per object

Update stack monitoring doc to account for alert notification now being sent for each node, index, or cluster based on the rule type, instead of always per cluster (PR# 102544)
---
 docs/user/monitoring/kibana-alerts.asciidoc | 27 ++++++++-------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/docs/user/monitoring/kibana-alerts.asciidoc b/docs/user/monitoring/kibana-alerts.asciidoc
index 837248e0cf41d..beaae1fdb71b6 100644
--- a/docs/user/monitoring/kibana-alerts.asciidoc
+++ b/docs/user/monitoring/kibana-alerts.asciidoc
@@ -32,17 +32,15 @@ To review and modify all available rules, click *Enter setup mode* on the
 
 This rule checks for {es} nodes that run a consistently high CPU load. By
 default, the condition is set at 85% or more averaged over the last 5 minutes.
-The rule is grouped across all the nodes of the cluster by running checks on a
-schedule time of 1 minute with a re-notify interval of 1 day.
+The default rule checks on a schedule time of 1 minute with a re-notify interval of 1 day.
 
 [discrete]
 [[kibana-alerts-disk-usage-threshold]]
 == Disk usage threshold
 
 This rule checks for {es} nodes that are nearly at disk capacity. By default,
-the condition is set at 80% or more averaged over the last 5 minutes. The rule
-is grouped across all the nodes of the cluster by running checks on a schedule
-time of 1 minute with a re-notify interval of 1 day.
+the condition is set at 80% or more averaged over the last 5 minutes. The default rule
+checks on a schedule time of 1 minute with a re-notify interval of 1 day.
 
 [discrete]
 [[kibana-alerts-jvm-memory-threshold]]
@@ -50,16 +48,14 @@ time of 1 minute with a re-notify interval of 1 day.
 
 This rule checks for {es} nodes that use a high amount of JVM memory.
 By default, the condition is set at 85% or more averaged over the last 5 minutes.
-The rule is grouped across all the nodes of the cluster by running checks on a
-schedule time of 1 minute with a re-notify interval of 1 day.
+The default rule checks on a schedule time of 1 minute with a re-notify interval of 1 day.
 
 [discrete]
 [[kibana-alerts-missing-monitoring-data]]
 == Missing monitoring data
 
 This rule checks for {es} nodes that stop sending monitoring data. By default,
-the condition is set to missing for 15 minutes looking back 1 day. The rule is
-grouped across all the {es} nodes of the cluster by running checks on a schedule
+the condition is set to missing for 15 minutes looking back 1 day. The default rule checks on a schedule
 time of 1 minute with a re-notify interval of 6 hours.
 
 [discrete]
@@ -67,9 +63,8 @@ time of 1 minute with a re-notify interval of 6 hours.
 == Thread pool rejections (search/write)
 
 This rule checks for {es} nodes that experience thread pool rejections. By
-default, the condition is set at 300 or more over the last 5 minutes. The rule
-is grouped across all the nodes of the cluster by running checks on a schedule
-time of 1 minute with a re-notify interval of 1 day. Thresholds can be set
+default, the condition is set at 300 or more over the last 5 minutes. The default rule
+checks on a schedule time of 1 minute with a re-notify interval of 1 day. Thresholds can be set
 independently for `search` and `write` type rejections.
 
 [discrete]
@@ -78,8 +73,7 @@ independently for `search` and `write` type rejections.
 
 This rule checks for read exceptions on any of the replicated {es} clusters. The
 condition is met if 1 or more read exceptions are detected in the last hour. The
-rule is grouped across all replicated clusters by running checks on a schedule
-time of 1 minute with a re-notify interval of 6 hours.
+default rule checks on a schedule time of 1 minute with a re-notify interval of 6 hours.
 
 [discrete]
 [[kibana-alerts-large-shard-size]]
@@ -87,9 +81,8 @@ time of 1 minute with a re-notify interval of 6 hours.
 
 This rule checks for a large average shard size (across associated primaries) on
 any of the specified index patterns in an {es} cluster. The condition is met if
-an index's average shard size is 55gb or higher in the last 5 minutes. The rule
-is grouped across all indices that match the default pattern of `-.*` by running
-checks on a schedule time of 1 minute with a re-notify interval of 12 hours.
+an index's average shard size is 55gb or higher in the last 5 minutes. The default rule
+matches the pattern of `-.*` by running checks on a schedule time of 1 minute with a re-notify interval of 12 hours.
 
 [discrete]
 [[kibana-alerts-cluster-alerts]]