add smartmon_sata failure as a mixin in kube-prometheus
iminfinity committed Nov 21, 2024
1 parent 5af854d commit 53dbbc7
Showing 3 changed files with 37 additions and 5 deletions.
6 changes: 1 addition & 5 deletions build/kube-prometheus/README.md
@@ -162,11 +162,7 @@ available in https://github.com/prometheus/alertmanager/blob/main/doc/alertmanag
}
```

-- To add a custom prometheus rule as a mixin, create a mixin.libsonnet file in the relevant folder under the `mixins` folder and generate the `prometheus.yaml` for it. One way to do generate a YAML file from a .libsonnet file is using a jsonnet command similar to this:
-
-```
-jsonnet -e '(import "mixin.libsonnet").prometheusAlerts' | gojsontoyaml > prometheus.yaml
-```
+- To add a custom prometheus rule as a mixin, create a mixin.libsonnet file in the relevant folder under the `mixins` folder.

- In the case when your mixin is supposed to trigger a Prometheus alert and <b>all you want is to test</b> whether it works, do this:

7 changes: 7 additions & 0 deletions build/kube-prometheus/common-template.jsonnet
@@ -85,6 +85,7 @@ local default_vars = {
     'velero',
     'whoami',
     'zfs-localpv',
+    'smartmon',
   ],
   prometheus_operator_resources: {
     limits: { memory: '80Mi' },
@@ -169,6 +170,7 @@ local default_vars = {
     'argo-cd-sync-state': true,
     rabbitmq: false,
     'monitor-prometheus-stack': false,
+    smartmon: false,
   },
   mixin_configs: {
     // Example:
@@ -245,6 +247,11 @@ local mixins = remove_nulls([
     (import 'mixins/monitoring/mixin.libsonnet'),
     vars,
   ),
+  addMixin(
+    'smartmon',
+    (import 'mixins/smartmon/mixin.libsonnet'),
+    vars,
+  ),
 ]);

local scrape_namespaces = std.uniq(std.sort(std.flattenArrays(
29 changes: 29 additions & 0 deletions build/kube-prometheus/mixins/smartmon/mixin.libsonnet
@@ -0,0 +1,29 @@
+{
+  _config+:: {
+    selector: '',
+  },
+
+  prometheusAlerts+:: {
+    groups+: [
+      {
+        name: 'SmartMon',
+        rules: [
+          {
+            alert: 'SmartMonUdmaCrcErrorCountRawValue',
+            expr: 'sum by (instance, device) (smartctl_device_attribute{attribute_name="UDMA_CRC_Error_Count", attribute_value_type="raw"} >= smartctl_device_attribute{attribute_name="UDMA_CRC_Error_Count", attribute_value_type="worst"})',
+            'for': '3h',
+            labels: {
+              severity: 'critical',
+              alert_id: 'SmartMonUdmaCrcErrorCountRawValue',
+            },
+            annotations: {
+              description: 'Disk **{{ .Labels.device }}** has disk sata failure on instance **{{ .Labels.instance }}**
+UDMA_CRC_Error_Count - The number of errors related to data transfer over the interface. A value of **{{ .Value }}** is concerning and indicates potential issues with the data cable or connections.',
+              summary: 'The device **{{ .Labels.device }}** has disk sata failures.'
+            },
+          },
+        ],
+      },
+    ],
+  },
+}
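
A note on the `expr` in this rule: in PromQL, a comparison between two instant vectors only pairs series whose label sets are identical, and the two selectors here differ in `attribute_value_type` (`raw` versus `worst`). Assuming the exporter emits both value types on otherwise identical label sets, the comparison as committed may match no series, in which case the alert would never fire. The snippet below is a minimal sketch of a variant that excludes that label from matching with `ignoring(attribute_value_type)`. The `attr` helper is hypothetical and not part of the commit; the metric name and selectors are copied from the rule.

```jsonnet
{
  // Hypothetical helper: builds the selector for one attribute_value_type.
  local attr(valueType) =
    'smartctl_device_attribute{attribute_name="UDMA_CRC_Error_Count", attribute_value_type="%s"}' % valueType,

  // Assumption: both series share every label except attribute_value_type,
  // so ignoring() that label lets the two sides pair one-to-one.
  expr: 'sum by (instance, device) (%s >= ignoring(attribute_value_type) %s)' % [attr('raw'), attr('worst')],
}
```

Evaluating this with `jsonnet` yields an object with a single `expr` string that could replace the one in the rule. Whether comparing the `raw` value against `worst` is the intended threshold is a separate question that this sketch does not address.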
