Observer is a tool to monitor websites. You can configure uptime, status, and regex-based checks for websites and store the check results in a PostgreSQL database via Kafka.
- Python 3.7+
- PostgreSQL 11 (it probably works with older versions as well, but this hasn't been tested)
- Kafka 2.4
ℹ️ If you are a developer, please proceed to the Development Setup section.
The project follows a standard `setup.py`-based installation.

```
cd observer
python -m pip install --upgrade pip
pip install setuptools wheel pipenv
python setup.py install
```

or, in a single step:

```
./install.sh
```
This will install three executables into your `$PATH`:

- `observer_source`
- `observer_sink`
- `observer_fixtures`

More details on how to use these are in the sections below.
Observer has two components:
- Source, and
- Sink
Source is a tool to monitor websites and publish the monitoring results to a Kafka topic. The website checks have the following config:

```yaml
- url: https://google.com
  frequency: 3
  regex:
    - name: check_google_exists
      pattern: .*google.*
```
`url`
: The URL to monitor.

`frequency`
: How often to monitor the above URL. The value should be in seconds; only whole numbers are allowed.

`regex`
: There can be multiple regex checks. Each regex check should have a unique name and a pattern, which is used to check the content of the webpage returned by the above URL. The monitoring result will contain True/False for each regex check, denoting whether the pattern was found on the page.
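As a rough illustration of the True/False semantics described above, the sketch below evaluates a list of regex checks against a fetched page. The function name, the use of `requests`, and the result shape are assumptions made for illustration, not the tool's actual API:

```python
import re

import requests  # assumed HTTP client; the actual tool may use a different one


def run_regex_checks(url, regex_checks):
    """Fetch the page at `url` and report True/False for each configured pattern."""
    page = requests.get(url, timeout=10).text
    # Each check has a unique name and a pattern, as in checks.yml.
    return {check["name"]: re.search(check["pattern"], page) is not None
            for check in regex_checks}


# Hypothetical usage, matching the sample config above:
# run_regex_checks("https://google.com",
#                  [{"name": "check_google_exists", "pattern": ".*google.*"}])
# -> {"check_google_exists": True}
```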
There is a sample `checks.yml` file inside the `config` directory. The source also needs access to the Kafka cluster it publishes to; an example of the connection configuration is provided in the `config.ini` file inside the `config` directory.
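Putting the pieces together, the source's main loop conceptually does something like the sketch below: fetch each configured URL, evaluate the checks, publish the result, and sleep for `frequency` seconds. The topic name, message shape, and use of `kafka-python` and `requests` are illustrative assumptions, not the project's actual implementation:

```python
import json
import re
import time

import requests  # assumed HTTP client
from kafka import KafkaProducer  # kafka-python; the real tool's client library may differ

# One check, matching the sample checks.yml above.
check = {
    "url": "https://google.com",
    "frequency": 3,
    "regex": [{"name": "check_google_exists", "pattern": ".*google.*"}],
}

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",  # illustrative; real values come from config.ini
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

while True:
    response = requests.get(check["url"], timeout=10)
    result = {
        "url": check["url"],
        "status_code": response.status_code,
        "regex": {r["name"]: re.search(r["pattern"], response.text) is not None
                  for r in check["regex"]},
    }
    producer.send("observer-checks", result)  # topic name is illustrative
    time.sleep(check["frequency"])  # `frequency` is whole seconds between checks
```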
To run the source, you need to specify a directory from which the tool can read its configuration. This is provided via the environment variable `OBSERVER_CONFIG_DIR`. By default, the configs are read from the `config` directory, but these need to be overridden with actual values before running. So do the following steps:
- Create a directory to store the configuration.
- Create a `checks.yml` file and add the required checks to monitor.
- Create a `config.ini` file and fill in the parameters for connecting to Kafka and PostgreSQL.
```
export OBSERVER_CONFIG_DIR=<config_dir_you_created_above>
observer_source
```
This will start the source tool, and it will begin monitoring the websites you configured. The logs are always sent to stderr. It is recommended to run this tool and the sink tool using a process manager such as supervisor.
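For reference, the sketch below shows roughly how a tool like this resolves the `OBSERVER_CONFIG_DIR` directory and reads `config.ini`. The section and key names are hypothetical; refer to the sample `config.ini` in the repository for the real ones:

```python
import configparser
import os
from pathlib import Path

# OBSERVER_CONFIG_DIR points at the directory created above; "config" is the default.
config_dir = Path(os.environ.get("OBSERVER_CONFIG_DIR", "config"))

parser = configparser.ConfigParser()
parser.read(config_dir / "config.ini")

# Hypothetical section/key names -- check the sample config.ini for the real ones.
kafka_servers = parser.get("kafka", "bootstrap_servers", fallback=None)
db_host = parser.get("postgres", "host", fallback=None)
```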
Sink is a tool to complement the Source. It reads the monitoring results from Kafka and stores them into a PostgreSQL database. To run the Sink, follow almost the same procedure as above; you don't need to create the `checks.yml` file while configuring the Sink. If both Source and Sink are running on the same node, they can share the `config.ini` configuration. There is an option to supply the DB password via an environment variable (`OBSERVER_DB_PASSWORD`) if required.
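Conceptually, the sink's loop looks roughly like the sketch below: consume messages from Kafka and insert them into PostgreSQL. The topic, table, and column names are made up for illustration, and the choice of `kafka-python` and `psycopg2` is an assumption, not necessarily what the project uses:

```python
import json
import os

import psycopg2
from kafka import KafkaConsumer  # kafka-python

# The DB password can optionally come from the environment instead of config.ini.
db_password = os.environ.get("OBSERVER_DB_PASSWORD", "password-from-config")

conn = psycopg2.connect(host="localhost", dbname="observer",
                        user="observer", password=db_password)
consumer = KafkaConsumer("observer-checks",  # topic name is illustrative
                         bootstrap_servers="localhost:9092")

for message in consumer:
    result = json.loads(message.value)
    with conn, conn.cursor() as cur:
        # Table and columns are hypothetical; the real schema is created by observer_fixtures.
        cur.execute(
            "INSERT INTO check_results (url, status_code, payload) VALUES (%s, %s, %s)",
            (result["url"], result.get("status_code"), json.dumps(result)),
        )
```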
Before starting the Sink, you need to initialize the database. To make the DB initialization easier, the `observer_fixtures` tool is also provided.

WARNING: Run this only once, before you set up the Sink.
```
export OBSERVER_CONFIG_DIR=<config_dir_you_created_above>
# the below will do a dry run
observer_fixtures
# to run the actual initialization
observer_fixtures --live
```
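To make the dry-run/live distinction concrete, a fixtures tool of this kind typically prints the DDL it would run unless told to apply it. The sketch below is purely illustrative; the actual schema and connection handling live in `observer_fixtures`:

```python
import sys

import psycopg2

# Illustrative DDL only -- the real schema is defined by observer_fixtures.
DDL = """
CREATE TABLE IF NOT EXISTS check_results (
    id SERIAL PRIMARY KEY,
    url TEXT NOT NULL,
    checked_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    payload JSONB
);
"""

if "--live" not in sys.argv:
    # Dry run: show what would be executed without touching the database.
    print(DDL)
else:
    conn = psycopg2.connect(host="localhost", dbname="observer",
                            user="observer", password="...")
    with conn, conn.cursor() as cur:
        cur.execute(DDL)
```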
Once the database is ready, the sink can be started as:

```
export OBSERVER_CONFIG_DIR=<config_dir_you_created_above>
observer_sink
```
This project uses Pipenv for development. Check out the project, `cd` into the project directory, and run:

```
pipenv install --dev
```
To run the unit tests:

```
pipenv run pytest
```

Running the unit tests still requires internet access (to reach the websites mentioned in the check configs) and a PostgreSQL database. The configuration for these is provided via `config.TEST.ini`; the settings are self-explanatory.
There is an end-to-end integration test in `test_source_and_sink_integration.py`, which is skipped by default since it requires SSL configuration to talk to a Kafka cluster. To run it, you need to supply the CA certificate along with the access key and certificate, and configure their filenames in the config file. Again, `config.TEST.ini` has good reference values.
The project has no published binaries yet, but platform-specific wheels can be built using setuptools in the future.