This simple Python 3 utility checks that servers are alive and responding. I use it with Twilio to text me when one of my servers go down, for whatever reason.
It's invoked using python3 heartbeat.py
, reads its configuration from a configuration file heartbeat.yaml
, and saves its state in .heartbeat.state.json
.
Example configuration file:
tests:
- http:
name: Google
url: https://www.google.com/
alerts:
- twilio:
account_sid: XXXXXXXXXXXXXX
auth_token: XXXXXXXXXXXXXX
from_number: +17775551212
to_number: +12224443131
The configuration file is divided into two sections: tests and alerts.
Tests are checks that are performed when the heartbeat
utility runs.
Any tests that fail will trigger alerts. Supported tests include http
, tcp
, and shell
.
Common test configuration settings:
name | The name of the test, used in the default messages |
down_message | Message sent when the test fails. Defaults to $name is down . |
up_message | Message sent when the test passes after failing. Defaults to $name is up . |
ignore_fail_count | Number of test failures ignored before sending an alert. Defaults to 0. |
alert_period_hours | Number of hours between successive alerts while the test is still failing. Defaults to 1.0. |
http
requests the given URL and confirms that the response status code is 200.
url | The URL to query, required. Must be HTTP or HTTPS. |
headers | Dictionary of additional headers to send. |
timeout | Stop waiting for a response after a given number of seconds. |
Example:
- tests:
- http:
name: Home Assistant
url: https://hass.local/api/
headers:
x-ha-access: XXXXXXXX
Content-Type: application/json
tcp
opens a connection to the host and port and confirms that a connection is established.
host | Host name or IP address to connect to, required. |
port | Port number to open, required. |
timeout | Seconds to wait for a connection before timing out. |
Example:
- tests:
- tcp:
name: SSH server
host: my.server.io
port: 22
shell
runs a command and confirm that the exit code is zero.
command | Command to execute using the current shell, required. |
timeout | Seconds to wait for the child process to exit. |
Example:
- tests:
- shell:
name: Internet connection
command: curl https://www.google.com
Alerts are notifications sent when tests fail. Supported alerts include twilio
and shell
.
Sends alerts from Gmail - creating an application specific password is required.
sent_from | Gmail address, required. |
gmail_password | Gmail's 3rd party password, required. |
to | Email address sending to, required. |
subject | The subject, required. |
twilio
uses the Twilio REST API to send a text message. Sign up for a free account (including a SMS number) at https://www.twilio.com/. Note that repeated alerts can rack up charges once your trial is exhausted, so be careful with your test settings.
Note that the twilio
module must be installed to use twilio alerts, using pip install twilio
.
account_sid | Twilio account SID from the dashboard, required. |
auth_token | Twilio auth token from the dashboard, required. |
from_number | Twilio phone number from the dashboard, required. |
to_number | Destination phone number, required. |
shell
runs a command using the current shell to deliver the alert.
command | Command to execute using the current shell, required. |
The alert message can be inserted into command
using $message
.
In a test's up_message
and down_message
, named values can be inserted using $<name>
.
state | Current state of the test, either passing or failing . |
fail_count | Number of consecutive failures. |
last_pass_time | Time at which the test most recently passed. |
last_fail_time | Time at which the test most recently failed. |
Additionally, any value from the test's configuration can be inserted using similar syntax.
Example:
- tests:
- http:
name: My site
url: http://test.org
down_message: $name ($url) has been down since $last_pass_time.
Create a file /etc/systemd/system/heartbeat.service
:
[Unit]
Description=Server heartbeat
Wants=heartbeat.timer
[Service]
Type=oneshot
User=xxx
WorkingDirectory=/home/xxx
ExecStart=/usr/bin/python3 /home/xxx/heartbeat/heartbeat.py
[Install]
WantedBy=multi-user.target
And another file /etc/systemd/system/heartbeat.timer
:
[Unit]
Description=Server heartbeat
Requires=heartbeat.service
[Timer]
Unit=heartbeat.service
OnCalendar=*-*-* *:*:00
[Install]
WantedBy=timers.target
And run:
sudo systemctl enable heartbeat.timer
sudo systemctl start heartbeat.timer
Run every minute via crontab -e
:
* * * * * /usr/local/bin/python3 /home/me/heartbeat.py
Storing heartbeat.yaml and the state in the home directory:
* * * * * cd /home/me/.heartbeat && /usr/local/bin/python3 heartbeat.py
Specify a custom configuration and state file:
* * * * * /usr/local/bin/python3 /home/me/heartbeat.py -c /home/me/custom.yaml -s /home/me/.heartbeat.json