Validate email addresses #22

Open
wants to merge 17 commits into
base: main
9 changes: 9 additions & 0 deletions .fpm
@@ -0,0 +1,9 @@
-s dir
--name check_zpool
--license "Apache 2.0"
--architecture all
--depends bash
--description "Simple check-script for Cron/Zabbix/NRPE/Nagios to get the status of all zpool volumes"
--url "https://github.com/Klintrup/check_zpool"
--maintainer "Søren Klintrup <github@klintrup.dk>"
check_zpool.sh=/usr/bin/check_zpool man/check_zpool.8=/usr/share/man/man8/check_zpool.8
25 changes: 25 additions & 0 deletions .github/workflows/package.yml
@@ -0,0 +1,25 @@
on:
  pull_request:
    branches: [ "main" ]

name: package

permissions:
  contents: read

jobs:
  build-packages:
    name: Build packages
    runs-on: ubuntu-latest
    strategy:
      matrix:
        target: [deb, rpm, pkg]
    env:
      version: ${{ github.sha }}
    steps:
      - uses: actions/checkout@v4
      - name: action-fpm
        uses: Klintrup/github-action-fpm@update-action
        with:
          fpm_opts: "--verbose -t ${{ matrix.target }} -p check_zpool-${version}.${{ matrix.target }} --depends zfsutils-linux --version 1.0.0"
      - run: ls -l
44 changes: 29 additions & 15 deletions README.md
@@ -1,4 +1,4 @@
# monitor zfs from nagios/NRPE or cron on FreeBSD
# monitor zfs from Cron/Zabbix/NRPE/Nagios

[![Codacy Badge](https://app.codacy.com/project/badge/Grade/af682a2e5ff34d13b4fba76798eb37a8)](https://app.codacy.com/gh/Klintrup/check_zpool/dashboard)
[![License Apache 2.0](https://img.shields.io/github/license/Klintrup/check_zpool)](https://github.com/Klintrup/check_zpool/blob/main/LICENSE)
@@ -9,37 +9,51 @@

## Synopsis

Simple check-script for NRPE/nagios to get the status of various zpool volumes
Simple check-script for Cron/Zabbix/NRPE/Nagios to get the status of all zpool volumes
in a box, and output the failed volumes if any such exist.

## Syntax

### Direct/integrate

```bash
check_zpool.sh [email] [email]
```

If no arguments are specified, the script will assume its run for NRPE. If one
or more email addresses are specified, the script will send an email in case an
array reports an error.
If no arguments are specified, the script assumes it is being run for NRPE/Nagios/Zabbix.
If one or more email addresses are specified, the script sends an email when
a pool reports an error.

### Cron

```bash
0 6 * * * root /path/to/check_zpool.sh first.user@organisation.com second.user@organisation.com
```

This runs the script at 6 AM every day and sends an email if any zpool in the system has a status other than "online".

## Output

`tank: DEGRADED / data: rebuilding / system: ok`
`tank: FAULTED / data: degraded / system: online`

Failed/rebuilding volumes will always be first in the output string, to help
diagnose the problem when receiving the output via pager/sms.

## Output examples
### Output states

| output | description |
| ------------- | ------------------------------------------------------------------------------------------------------------------------------- |
| ok | The device is reported as ok by zpool |
| DEGRADED | The RAID volume is degraded, it's still working but without the safety of RAID, and in some cases with severe performance loss. |
| rebuilding | The RAID is rebuilding, will return to OK when done |
| unknown state | Volume is in an unknown state. Please report this as an issue on [GitHub](https://github.com/Klintrup/check_zpool/issues) |
| output | exit code | description |
| --------- | --------- | ------------------------------------------------------------------------------------------------------------------------- |
| online | 0 | The device is online and functioning normally. |
| degraded | 1 | The device is experiencing a non-fatal fault, which may be causing degraded performance. |
| FAULTED | 2 | An unrecoverable error has occurred. The device cannot be opened. |
| OFFLINE | 2 | The device has been taken offline by the administrator. |
| REMOVED | 2 | The device was physically removed while the system was running. |
| UNAVAIL | 2 | The device cannot be opened. If a top-level virtual device is unavailable, nothing in the pool can be accessed. |
| SUSPENDED | 2 | I/O to the pool has been suspended after device failures; the pool is inaccessible until connectivity is restored. |
| UNKNOWN | 3 | Volume is in an unknown state. Please report this as an issue on [GitHub](https://github.com/Klintrup/check_zpool/issues) |

## Compatibility

Should work on all versions of FreeBSD with zfs.
Compatible with all versions of FreeBSD and Linux that support ZFS.

Tested on FreeBSD 8.0-10.1
Specifically tested on FreeBSD 8.0+ and Ubuntu 22.04 LTS
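
The exit codes in the README table follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). A minimal sketch of how a caller might consume them is shown below. `status_label` and `run_check` are hypothetical helper names used only for illustration; they are not part of this PR.

```bash
#!/usr/bin/env sh
# Hypothetical helper: translate a check_zpool exit code into a Nagios-style label.
status_label() {
  case "${1}" in
    0) echo "OK" ;;
    1) echo "WARNING" ;;
    2) echo "CRITICAL" ;;
    *) echo "UNKNOWN" ;;
  esac
}

# Hypothetical wrapper: run a check command, prefix its output with the label,
# and pass the original exit code on to the caller.
run_check() {
  rc=0
  # "|| rc=$?" records a non-zero exit code without aborting under "set -e".
  output="$("$@")" || rc=$?
  echo "$(status_label "${rc}"): ${output}"
  return "${rc}"
}
```

A call such as `run_check /usr/bin/check_zpool` (the install path from `.fpm`) would print the label followed by the script's own output line and return the same exit code.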
1 change: 1 addition & 0 deletions SECURITY.md
@@ -4,6 +4,7 @@

| Version | Supported |
| ------- | --------- |
| 1.4.x | ✅ |
| 1.3.x | ✅ |
| <= 1.2 | ❌ |

185 changes: 154 additions & 31 deletions check_zpool.sh
@@ -1,63 +1,186 @@
#!/bin/sh
# NRPE check for zpool
#!/usr/bin/env sh
# zpool monitoring script
# Written by: Søren Klintrup <github at klintrup.dk>
# Get your copy from https://github.com/Klintrup/check_zpool

set -u
set -e

PATH="/sbin:/bin:/usr/sbin:/usr/bin"
unset ERRORSTRING
unset OKSTRING
unset ERR
ERR="0"
ERRORSTRING=""
WARNINGSTRING=""
OKSTRING=""

if [ -x "/sbin/zpool" ]; then
DEVICES="$(zpool list -H -o name)"
else
ERRORSTRING="zpool binary does not exist on system"
ERR=3
fi
# Function to validate email addresses
#
# This function takes one or more email addresses as input and validates them
# against a regular expression pattern. It checks if the email addresses are
# in the correct format: <username>@<domain>.<tld>
#
# Parameters:
# One or more email addresses to validate
#
# Example usage:
# validate_email "john.doe@example.com" "jane.smith@example.com"
#
# Returns:
# - Valid email addresses are written to stdout
# - Invalid email addresses are written to stderr along with an error message
validate_email() {
  for _check_zpool__validate_email__input in "${@}"; do
    if ! echo "${_check_zpool__validate_email__input}" | grep -qE '^[a-zA-Z0-9._%+-]{1,}@[a-zA-Z0-9.-]{1,}\.[a-zA-Z]{2,}$'; then
      echo "${_check_zpool__validate_email__input} is not a valid email address" >&2
    else
      echo "${_check_zpool__validate_email__input}"
    fi
  done
  unset _check_zpool__validate_email__input
}

for DEVICE in ${DEVICES}; do
# Function to set the error code
#
# This function sets the error code based on the current error code and the new error code provided.
# It validates the new error code and assigns the corresponding numeric value to it.
#
# Parameters:
# The current error code
# The new error code to be set. can be either numeric or a string
#
# Exit Codes:
# 0 - ok
# 1 - warning
# 2 - error
# 3 - unknown
set_error_code() {
  current_error_code="${1}"
  new_error_code="${2}"
  if [ -z "${current_error_code}" ] || [ -z "${new_error_code}" ]; then
    echo "No error code or new error code given" >&2
    exit 1
  fi
  case "${new_error_code}" in
    ok)
      new_error_code=0
      ;;
    warning)
      new_error_code=1
      ;;
    error)
      new_error_code=2
      ;;
    unknown)
      new_error_code=3
      ;;
    [0-3])
      ;;
    *)
      echo "Invalid error code: ${new_error_code}" >&2
      exit 1
      ;;
  esac

  # Precedence: error (2) > warning (1) > unknown (3) > ok (0).
  if [ "${new_error_code}" -eq 3 ]; then
    # unknown only replaces ok
    if [ "${current_error_code}" -eq 0 ]; then
      echo "${new_error_code}"
    else
      echo "${current_error_code}"
    fi
  elif [ "${current_error_code}" -eq 3 ] && [ "${new_error_code}" -ne 0 ]; then
    # a warning or error replaces an earlier unknown, regardless of pool order
    echo "${new_error_code}"
  elif [ "${current_error_code}" -lt "${new_error_code}" ]; then
    echo "${new_error_code}"
  else
    echo "${current_error_code}"
  fi
}


# Prints the name of every zpool on the system, one per line.
# The function is called inside a command substitution, so it cannot set ERR or
# ERRORSTRING for the caller; a missing binary is reported on stderr and through
# the return code instead.
get_zpool_devices() {
  if [ -x "/sbin/zpool" ]; then
    zpool list -H -o name
  else
    echo "zpool binary does not exist on system" >&2
    return 3
  fi
}

# Checks the health status of each device in a ZFS pool.
# Iterates over each device obtained from the `get_zpool_devices` function,
# and retrieves the health status using the `zpool list` command.
# The health status is then evaluated and categorized into different states,
# such as "unknown", "faulted", "offline", "suspended", "removed", "unavail",
# "degraded", and "online".
# Depending on the health status, the script sets an error code and appends
# the device name to the corresponding error or warning string.
# The final error and warning strings are stored in the variables `ERRORSTRING`
# and `WARNINGSTRING`, respectively.
# The error code is stored in the `ERR` variable.
for DEVICE in $(get_zpool_devices); do
DEVICESTRING="$(zpool list -H -o health "${DEVICE}")"
if [ "$(echo "${DEVICESTRING}" | tr '[:upper:]' '[:lower:]' | sed -Ee 's/.*(degraded|faulted|offline|online|removed|unavail).*/\1/')" = "" ]; then
ERRORSTRING="${ERRORSTRING} / ${DEVICE}: unknown state"
if ! [ "${ERR}" = 2 ]; then ERR=3; fi
if [ "$(echo "${DEVICESTRING}" | tr '[:upper:]' '[:lower:]' | sed -Ee 's/.*(degraded|faulted|suspended|offline|online|removed|unavail).*/\1/')" = "" ]; then
ERRORSTRING="${ERRORSTRING} / ${DEVICE}: UNKNOWN"
ERR=$(set_error_code "${ERR}" "unknown")
else
case $(echo "${DEVICESTRING}" | tr '[:upper:]' '[:lower:]' | sed -Ee 's/.*(degraded|faulted|offline|online|removed|unavail).*/\1/') in
degraded)
ERR=2
ERRORSTRING="${ERRORSTRING} / ${DEVICE}: DEGRADED"
;;
case $(echo "${DEVICESTRING}" | tr '[:upper:]' '[:lower:]' | sed -Ee 's/.*(degraded|faulted|suspended|offline|online|removed|unavail).*/\1/') in
faulted)
ERR=2
ERR=$(set_error_code "${ERR}" "error")
ERRORSTRING="${ERRORSTRING} / ${DEVICE}: FAULTED"
;;
offline)
ERR=2
ERR=$(set_error_code "${ERR}" "error")
ERRORSTRING="${ERRORSTRING} / ${DEVICE}: OFFLINE"
;;
suspended)
ERR=$(set_error_code "${ERR}" "error")
ERRORSTRING="${ERRORSTRING} / ${DEVICE}: SUSPENDED"
;;
removed)
ERR=2
ERR=$(set_error_code "${ERR}" "error")
ERRORSTRING="${ERRORSTRING} / ${DEVICE}: REMOVED"
;;
unavail)
ERR=2
ERR=$(set_error_code "${ERR}" "error")
ERRORSTRING="${ERRORSTRING} / ${DEVICE}: UNAVAIL"
;;
degraded)
ERR=$(set_error_code "${ERR}" "warning")
WARNINGSTRING="${WARNINGSTRING} / ${DEVICE}: degraded"
;;
online)
ERR=$(set_error_code "${ERR}" "ok")
OKSTRING="${OKSTRING} / ${DEVICE}: online"
;;
esac
fi
done
if [ "${1}" ]; then
if [ "${ERRORSTRING}" ]; then
echo "${ERRORSTRING} ${OKSTRING}" | sed s/"^\/ "// | mail -s "$(hostname -s): ${0} reports errors" -E "${*}"

# Checks the status of zpool volumes on the current host and sends an email notification if there are any errors.
# If the script is called with an argument, it assumes that it is being run from cron, and sends an email notification to the specified recipients.
# If the script is called without an argument, it assumes that it is being run from a monitoring system, prints the status of the zpool volumes to the console, and exits with the corresponding error code.
if [ "${#}" -ge "1" ]; then
if [ "${ERRORSTRING}" ] || [ "${WARNINGSTRING}" ]; then
if ! command -v mail >/dev/null 2>&1; then
echo "mail command is not installed" >&2
ERR=$(set_error_code "${ERR}" "unknown")
exit "${ERR}"
fi
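# validate_email prints one address per line; stop if none of them are valid
recipients=$(validate_email "${@}")
if [ -z "${recipients}" ]; then
echo "no valid email address given" >&2
exit 3
fi
# ${recipients} is left unquoted below so each validated address becomes its own argument
# shellcheck disable=SC2086
(
echo "zpool volumes on $(hostname -s) have errors:"
echo ""
echo "${ERRORSTRING} ${WARNINGSTRING} ${OKSTRING}" | sed s/"^\/ "// | sed -E "s%\/ %\n%g"
echo ""
echo "This is an automated message, do not reply."
) | mail -s "$(hostname -s): ${0} reports errors" -E ${recipients}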
fi
else
if [ "${ERRORSTRING}" ] || [ "${OKSTRING}" ]; then
echo "${ERRORSTRING} ${OKSTRING}" | sed -E s/"^[[:blank:]]{1,}\/ "//
exit ${ERR}
if [ "${ERRORSTRING}" ] || [ "${OKSTRING}" ] || [ "${WARNINGSTRING}" ]; then
echo "${ERRORSTRING} ${WARNINGSTRING} ${OKSTRING}" | sed -E s/"^[[:blank:]]{1,}\/ "//
exit "${ERR}"
else
echo no zpool volumes found
exit 3
ERR=$(set_error_code "${ERR}" "unknown")
exit "${ERR}"
fi
fi
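
The pattern used by `validate_email` can be exercised on its own to see which inputs it accepts. The sketch below reuses the regular expression from this PR unchanged. `is_valid_email` and the sample addresses are hypothetical and only serve the illustration.

```bash
#!/usr/bin/env sh
# Same extended regular expression as validate_email in this PR.
EMAIL_REGEX='^[a-zA-Z0-9._%+-]{1,}@[a-zA-Z0-9.-]{1,}\.[a-zA-Z]{2,}$'

# Hypothetical helper: succeeds (exit 0) when the argument matches the pattern.
is_valid_email() {
  printf '%s\n' "${1}" | grep -qE "${EMAIL_REGEX}"
}

# Sample inputs: one well-formed address, one without a top-level domain,
# and one that is not an address at all.
for candidate in "first.user@organisation.com" "user@localhost" "not an address"; do
  if is_valid_email "${candidate}"; then
    echo "accepted: ${candidate}"
  else
    echo "rejected: ${candidate}"
  fi
done
```

Only the first sample is accepted. The pattern requires a dot followed by at least two letters after the `@`, so bare host names such as `user@localhost` are rejected, which may matter for hosts that deliver mail locally.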
34 changes: 34 additions & 0 deletions man/check_zpool.8
@@ -0,0 +1,34 @@
.TH check_zpool 8 "xx.yy.zz"
.SH NAME
check_zpool \- simple check-script for Cron/Zabbix/NRPE/Nagios to get the status of all zpool volumes
.SH SYNOPSIS
.B check_zpool
[email]
[email]
.SH DESCRIPTION
.B check_zpool
checks the status of all ZFS pools in a system and either prints it to stdout or sends an email to one or more recipients.
.SH OPTIONS
.TP
.BR \fBemail\fR
Set one or more email addresses to notify of any issues.
.SH USAGE
If no email address is specified, status is printed to stdout.
.SH EXIT STATUS
.TP
0
everything is normal
.TP
1
a warning status occurred
.TP
2
a critical error has occurred, data is inaccessible
.TP
3
an unknown error has happened
.SH SEE ALSO
.BR zpool (8)

.SH AUTHOR
Søren Klintrup (github@klintrup.dk)
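
Because the script reports a single exit status for all pools, the per-pool states have to be folded into one value. The sketch below shows one possible way to do that, assuming the precedence critical (2) > warning (1) > unknown (3) > ok (0). That ordering is an assumption made for illustration, and `severity_rank` and `worst_status` are hypothetical names rather than functions from this PR.

```bash
#!/usr/bin/env sh
# Hypothetical helper: rank a status code so that a larger rank is more severe.
# Assumed precedence: critical (2) > warning (1) > unknown (3) > ok (0).
severity_rank() {
  case "${1}" in
    0) echo 0 ;;
    3) echo 1 ;;
    1) echo 2 ;;
    2) echo 3 ;;
    *) echo "invalid status: ${1}" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print whichever of two status codes is more severe.
worst_status() {
  current_rank="$(severity_rank "${1}")" || return 1
  new_rank="$(severity_rank "${2}")" || return 1
  if [ "${new_rank}" -gt "${current_rank}" ]; then
    echo "${2}"
  else
    echo "${1}"
  fi
}

# Fold a sample list of per-pool statuses into a single exit status.
overall=0
for pool_status in 0 3 1 2 0; do
  overall="$(worst_status "${overall}" "${pool_status}")"
done
echo "overall exit status: ${overall}"
```

Mapping each code to a rank first keeps the comparison independent of the order in which pools are listed, so an early unknown pool cannot hide a later critical one.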