Skip to content

Commit

Permalink
Update SM doc for alert per object (#107420)
Browse files Browse the repository at this point in the history
Update stack monitoring doc to account for alert notification now being send for each node, index, or cluster based on the rule type, instead of always per cluster (PR# 102544)
  • Loading branch information
ravikesarwani authored and Ravi Kesarwani committed Aug 3, 2021
1 parent 035a35c commit 895587d
Showing 1 changed file with 10 additions and 17 deletions.
27 changes: 10 additions & 17 deletions docs/user/monitoring/kibana-alerts.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -32,44 +32,39 @@ To review and modify all available rules, click *Enter setup mode* on the

This rule checks for {es} nodes that run a consistently high CPU load. By
default, the condition is set at 85% or more averaged over the last 5 minutes.
The rule is grouped across all the nodes of the cluster by running checks on a
schedule time of 1 minute with a re-notify interval of 1 day.
The default rule checks on a schedule time of 1 minute with a re-notify interval of 1 day.

[discrete]
[[kibana-alerts-disk-usage-threshold]]
== Disk usage threshold

This rule checks for {es} nodes that are nearly at disk capacity. By default,
the condition is set at 80% or more averaged over the last 5 minutes. The rule
is grouped across all the nodes of the cluster by running checks on a schedule
time of 1 minute with a re-notify interval of 1 day.
the condition is set at 80% or more averaged over the last 5 minutes. The default rule
checks on a schedule time of 1 minute with a re-notify interval of 1 day.

[discrete]
[[kibana-alerts-jvm-memory-threshold]]
== JVM memory threshold

This rule checks for {es} nodes that use a high amount of JVM memory. By
default, the condition is set at 85% or more averaged over the last 5 minutes.
The rule is grouped across all the nodes of the cluster by running checks on a
schedule time of 1 minute with a re-notify interval of 1 day.
The default rule checks on a schedule time of 1 minute with a re-notify interval of 1 day.

[discrete]
[[kibana-alerts-missing-monitoring-data]]
== Missing monitoring data

This rule checks for {es} nodes that stop sending monitoring data. By default,
the condition is set to missing for 15 minutes looking back 1 day. The rule is
grouped across all the {es} nodes of the cluster by running checks on a schedule
the condition is set to missing for 15 minutes looking back 1 day. The default rule checks on a schedule
time of 1 minute with a re-notify interval of 6 hours.

[discrete]
[[kibana-alerts-thread-pool-rejections]]
== Thread pool rejections (search/write)

This rule checks for {es} nodes that experience thread pool rejections. By
default, the condition is set at 300 or more over the last 5 minutes. The rule
is grouped across all the nodes of the cluster by running checks on a schedule
time of 1 minute with a re-notify interval of 1 day. Thresholds can be set
default, the condition is set at 300 or more over the last 5 minutes. The default rule
checks on a schedule time of 1 minute with a re-notify interval of 1 day. Thresholds can be set
independently for `search` and `write` type rejections.

[discrete]
Expand All @@ -78,18 +73,16 @@ independently for `search` and `write` type rejections.

This rule checks for read exceptions on any of the replicated {es} clusters. The
condition is met if 1 or more read exceptions are detected in the last hour. The
rule is grouped across all replicated clusters by running checks on a schedule
time of 1 minute with a re-notify interval of 6 hours.
default rule checks on a schedule time of 1 minute with a re-notify interval of 6 hours.

[discrete]
[[kibana-alerts-large-shard-size]]
== Large shard size

This rule checks for a large average shard size (across associated primaries) on
any of the specified index patterns in an {es} cluster. The condition is met if
an index's average shard size is 55gb or higher in the last 5 minutes. The rule
is grouped across all indices that match the default pattern of `-.*` by running
checks on a schedule time of 1 minute with a re-notify interval of 12 hours.
an index's average shard size is 55gb or higher in the last 5 minutes. The default rule
matches the pattern of `-.*` by running checks on a schedule time of 1 minute with a re-notify interval of 12 hours.

[discrete]
[[kibana-alerts-cluster-alerts]]
Expand Down

0 comments on commit 895587d

Please sign in to comment.