
Crunchy Watch


Overview

Crunchy Watch is deprecated as of January 2019. For Kubernetes/OpenShift HA and failover, we recommend the Crunchy Postgres Operator project (https://github.com/crunchydata/postgres-operator.git).

Crunchy Watch is an application that watches a PostgreSQL primary and, when it detects a failure, performs a failover.

Failover scenarios are extensible. Sample failover scenarios are provided including:

  • trigger a failover on a random replica

  • trigger a failover on a replica using metadata labels

  • trigger a failover on a replica that is further ahead than others

Crunchy Watch is packaged as a Docker container that can run in pure Docker 1.12, Kubernetes 1.9, and OpenShift 3.11 environments.

You can also run Crunchy Watch outside of a container as a binary.

Crunchy provides a commercially supported version of this container built on RHEL 7 and Crunchy-supported PostgreSQL. Contact Crunchy for more details at https://www.crunchydata.com.

Usage

Crunchy Watch is designed to operate on multiple platforms. Therefore, it is necessary to specify the platform at startup.

$> crunchy-watch <platform>

Supported Platforms:

Name       | Value
Kubernetes | kube

Example:

$> crunchy-watch kube

Options

All options can be configured via a command-line flag or an environment variable.

Flag values will take precedence over values defined by an environment variable.

There are general options that apply across all platforms, and each platform also provides its own specific options. The details for each are provided below.
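
For example, the following two invocations are equivalent (pr-primary is the primary host name used in the examples later in this document); if both the flag and the environment variable are set, the flag value wins:

$> crunchy-watch kube --primary=pr-primary

$> CRUNCHY_WATCH_PRIMARY=pr-primary crunchy-watch kube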

General

Option                 | Environment Variable               | Default  | Description
--primary              | CRUNCHY_WATCH_PRIMARY              |          | host of the primary PostgreSQL instance
--primary-port         | CRUNCHY_WATCH_PRIMARY_PORT         | 5432     | port of the primary PostgreSQL instance
--replica              | CRUNCHY_WATCH_REPLICA              |          | host of the replica PostgreSQL instance
--replica-port         | CRUNCHY_WATCH_REPLICA_PORT         | 5432     | port of the replica PostgreSQL instance
--username             | CRUNCHY_WATCH_USERNAME             | postgres | login user to connect to PostgreSQL
--password             | CRUNCHY_WATCH_PASSWORD             |          | login user's password to connect to PostgreSQL
--database             | CRUNCHY_WATCH_DATABASE             | postgres | database to connect to
--target-type          | CRUNCHY_WATCH_TARGET_TYPE          | pod      | failover target type; can be pod or deployment
--timeout              | CRUNCHY_WATCH_TIMEOUT              | 10s      | connection timeout; valid time units are "ns", "us", "ms", "s", "m", "h"
--max-failures         | CRUNCHY_WATCH_MAX_FAILURES         | 0        | maximum number of failures before performing a failover
--healthcheck-interval | CRUNCHY_WATCH_HEALTHCHECK_INTERVAL | 10s      | interval between health checks; valid time units are "ns", "us", "ms", "s", "m", "h"
--failover-wait        | CRUNCHY_WATCH_FAILOVER_WAIT        | 50s      | time to wait for failover to process; valid time units are "ns", "us", "ms", "s", "m", "h"
--pre-hook             | CRUNCHY_WATCH_PRE_HOOK             |          | failover hook to execute before processing failover
--post-hook            | CRUNCHY_WATCH_POST_HOOK            |          | failover hook to execute after processing failover
--debug                | CRUNCHY_DEBUG                      |          | when set to true, causes debug-level messages to be output
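
As an illustrative (not prescriptive) example, a kube invocation combining several of these general options might look like the following; the host names and values are placeholders borrowed from the examples later in this document:

$> crunchy-watch kube \
     --primary=pr-primary --primary-port=5432 \
     --replica=pr-replica --replica-port=5432 \
     --username=postgres --database=postgres \
     --timeout=10s --healthcheck-interval=10s \
     --max-failures=2 --failover-wait=50s \
     --debug=true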

Kubernetes

Name                     | Environment Variable                 | Default | Description
--kube-namespace         | CRUNCHY_WATCH_KUBE_PROJECT           | default | the Kubernetes namespace
--kube-failover-strategy | CRUNCHY_WATCH_KUBE_FAILOVER_STRATEGY | default | the Kubernetes failover strategy
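
For example, to watch the demo namespace used by the examples below, with the latest strategy that appears in their log output (both values shown for illustration only):

$> crunchy-watch kube --kube-namespace=demo --kube-failover-strategy=latest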

Build

Building crunchy-watch, its supporting plugin modules, and the Docker image is accomplished using make and the provided Makefile.

Requirements for Building from Source

  • Go 1.10 or greater

  • Docker 1.12 or greater

CentOS Build Steps

These steps assume your normal userid is someuser and that you are installing on a clean, minimal CentOS 7 install.

Install Docker

sudo yum -y install docker
sudo groupadd docker
sudo systemctl enable docker
sudo systemctl start docker
sudo usermod -a -G docker someuser
newgrp docker
docker ps

Install Build Dependencies

sudo yum -y install gettext git golang

Setup Project Settings and Structure

export GOPATH=$HOME/cdev
export PATH=$PATH:$GOPATH/bin
export CCP_IMAGE_PREFIX=crunchydata
export CCP_BASEOS=centos7
export CCP_PGVERSION=11
export CCP_PG_FULLVERSION=11.1
export CCP_VERSION=2.2.0
export CCP_IMAGE_TAG=$CCP_BASEOS-$CCP_PG_FULLVERSION-$CCP_VERSION
export WATCH_CLI=kubectl
export WATCH_NAMESPACE=demo
export WATCH_ROOT=$GOPATH/src/github.com/crunchydata/crunchy-watch
export WATCH_IMAGE_PREFIX=crunchydata
export WATCH_IMAGE_TAG=centos7-2.1.1

In the case of OpenShift:

export WATCH_CLI=oc

Then, build the project structure as follows:

mkdir -p $GOPATH/src $GOPATH/bin $GOPATH/pkg
mkdir -p $GOPATH/src/github.com/crunchydata/
cd $GOPATH/src/github.com/crunchydata
git clone https://github.com/CrunchyData/crunchy-watch.git
cd crunchy-watch
git checkout master

Configure storage for the Kubernetes and OpenShift examples by setting the following environment variables:

For NFS:

export CCP_STORAGE_CAPACITY=400M
export CCP_NFS_IP=192.168.122.212
export CCP_STORAGE_MODE=ReadWriteMany
export CCP_SECURITY_CONTEXT='"supplementalGroups": [65534]'
export CCP_STORAGE_PATH=/nfsfileshare

For HostPath:

export CCP_STORAGE_CAPACITY=400M
export CCP_STORAGE_MODE=ReadWriteMany
export CCP_STORAGE_PATH=/data

Create the demo namespace:

$ kubectl create -f $WATCH_ROOT/conf/demo-namespace.json
namespace "demo" created
$ kubectl get namespace demo
NAME      STATUS    AGE
demo      Active    7s

Then set the namespace as the current location to avoid using the wrong namespace:

$ kubectl config set-context $(kubectl config current-context) --namespace=demo

Get Project Dependencies

make setup

Build from Source

make

Build the Docker Image

Note
To build the RHEL-based image, you will need the Crunchy repo keys copied into the $GOPATH/src/github.com/crunchydata/crunchy-watch directory, because the RHEL image is based on the Crunchy RPM packages.
cp CRUNCHY-GPG-KEY.public  $GOPATH/src/github.com/crunchydata/crunchy-watch
cp crunchypg*.repo $GOPATH/src/github.com/crunchydata/crunchy-watch
make docker-image

Targets

Target       | Description
all          | (default) calls the clean, resolve and build targets
build        | builds the crunchy-watch binary
modules      | builds all plugin modules
kube-module  | builds the Kubernetes plugin module
clean        | cleans all build-related artifacts, including dependencies
resolve      | resolves all build-related dependencies
docker-image | builds the Docker image; requires CCP_BASEOS, CCP_PGVERSION, CCP_PG_FULLVERSION and CCP_VERSION to be defined
setup        | downloads required tools and Docker image related dependencies
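
Tying the targets together, a typical sequence (a suggestion, not a requirement) is:

make setup         # one-time: download required tools and dependencies
make               # default target: clean, resolve and build the crunchy-watch binary
make modules       # build the plugin modules
make docker-image  # requires CCP_BASEOS, CCP_PGVERSION, CCP_PG_FULLVERSION and CCP_VERSION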

Extending Crunchy Watch

Crunchy Watch is designed with extension of its function and supported platforms in mind.

Extending by Plugin

Crunchy Watch makes use of the Go plugin package. Therefore, it is possible to build support for new platforms separately from one another.

To integrate with the plugin system the following interface must be met:

type FailoverHandler interface {
	Failover() error
	SetFlags(*flag.FlagSet)
}

Failover() is called to process the failover logic for the platform that the plugin supports.

SetFlags(*flag.FlagSet) is called immediately after the plugin is loaded. This allows the plugin to define options/flags that are unique to its operation.

As well, the plugin must be built with the -buildmode=plugin option. See an example of this in the project Makefile.
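
To make the shape of such a plugin concrete, here is a minimal, hypothetical sketch. It assumes the standard library flag package and an exported variable for the host process to look up; the symbol name, the flag name, and the lookup convention are illustrative assumptions only, so consult the bundled kube module and the Makefile for the project's actual conventions.

// exampleplatform.go - a hypothetical platform plugin, built with:
//   go build -buildmode=plugin -o exampleplatform.so exampleplatform.go
package main

import (
	"errors"
	"flag"
	"log"
)

// exampleHandler satisfies the FailoverHandler interface shown above.
type exampleHandler struct {
	target string // replica to promote, populated via SetFlags
}

// SetFlags is called immediately after the plugin is loaded and registers
// options unique to this platform (the flag name here is made up).
func (h *exampleHandler) SetFlags(fs *flag.FlagSet) {
	fs.StringVar(&h.target, "example-target", "", "replica to promote (illustrative)")
}

// Failover holds the platform-specific failover logic.
func (h *exampleHandler) Failover() error {
	if h.target == "" {
		return errors.New("no failover target configured")
	}
	log.Printf("promoting replica %q (example only)", h.target)
	return nil
}

// Handler is the symbol a host process could look up via plugin.Lookup;
// the name is hypothetical, not necessarily crunchy-watch's convention.
var Handler = &exampleHandler{}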

Extending by Hook

Crunchy Watch provides both a pre and post failover hook. These hooks will be executed in a shell environment created by the crunchy-watch process. Therefore they can be any executable or script that can be called by the user running the crunchy-watch process.

To configure the execution of these hooks, a fully qualified path to the executable or script must be provided, either with the --pre-hook or --post-hook flag or by defining the CRUNCHY_WATCH_PRE_HOOK or CRUNCHY_WATCH_POST_HOOK environment variable.

Example:

$> crunchy-watch kube --pre-hook=/tmp/watch-pre-hook

Or,

$> CRUNCHY_WATCH_PRE_HOOK=/tmp/watch-pre-hook crunchy-watch kube
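
A hook can be as simple as a small script. For instance, a hypothetical /tmp/watch-pre-hook that only records when a failover is about to be processed might look like this (the log file path is illustrative):

#!/bin/bash
# hypothetical pre-hook: append a timestamped note before the failover is processed
echo "$(date -u) crunchy-watch failover starting" >> /tmp/crunchy-watch-hooks.log

Remember to make the script executable (chmod +x /tmp/watch-pre-hook) so that the user running the crunchy-watch process can invoke it.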

Examples

Crunchy-watch depends on an RBAC policy being set up for the service account it uses. As a cluster-admin, you will need to run the examples/run-rbac.sh script once to create the necessary service account with the correct RBAC roles.

. /home/some-normal-user/.bashrc
export PATH=$PATH:/home/some-normal-user/cdev/bin
./run-rbac.sh

Then as a normal user account, you can run the crunchy watch examples.

There are two primary examples provided for using crunchy-watch. Both examples work in Kubernetes and OpenShift environments. To run them, set the WATCH_CLI environment variable to oc for OpenShift or kubectl for Kubernetes.

The first example has crunchy-watch watching two pods: a primary pod and a replica pod. Failover is performed on the primary pod.

To run the pod example, first start up the sample pods:

cd examples/sample-pods
./run.sh

To run crunchy-watch for watching this set of pods, run:

cd examples/kube
./run.sh

To trigger a failover of the primary Pod to the replica Pod, enter the following:

$WATCH_CLI delete pod pr-primary
$WATCH_CLI logs watch --follow

To verify, watch the logs for the following:

ERRO[2018-09-06T13:38:50Z] Could not reach 'pr-primary' (Attempt: 1)
INFO[2018-09-06T13:38:50Z] Executing pre-hook: /hooks/watch-pre-hook
INFO[2018-09-06T13:38:50Z] Processing Failover: Strategy - latest
INFO[2018-09-06T13:38:50Z] Deleting existing primary...
INFO[2018-09-06T13:38:50Z] Deleted old primary
INFO[2018-09-06T13:38:50Z] Choosing failover replica...
INFO[2018-09-06T13:38:50Z] Chose failover target (pr-replica)
INFO[2018-09-06T13:38:50Z] Promoting failover replica...
DEBU[2018-09-06T13:38:50Z] executing cmd: [/opt/cpm/bin/promote.sh] on pod pr-replica in namespace demo container: postgres
INFO[2018-09-06T13:38:50Z] Relabeling failover replica...
DEBU[2018-09-06T13:38:50Z] label: name
DEBU[2018-09-06T13:38:50Z] label: replicatype
INFO[2018-09-06T13:38:50Z] Executing post-hook: /hooks/watch-post-hook
INFO[2018-09-06T13:39:00Z] Health Checking: 'pr-primary'

To clean up the example:

cd $WATCH_ROOT/examples/sample-pods
./cleanup.sh
cd $WATCH_ROOT/examples/kube
./cleanup.sh

The second example demonstrates failover of a Deployment. The sample Deployments used in the example are started as follows:

cd $WATCH_ROOT/examples/sample-deployments
./run.sh

Run the crunchy-watch Deployment example as follows:

cd $WATCH_ROOT/examples/kube-deployments
./run.sh

To trigger a failover of the primary Deployment to the replica Deployment, enter the following:

$WATCH_CLI delete deploy watchprimary
$WATCH_CLI logs watch --follow

To verify, watch the logs for the following:

INFO[2018-09-06T15:13:12Z] Health Checking: 'watchprimary'
ERRO[2018-09-06T15:13:22Z] dial tcp 10.99.3.81:5432: i/o timeout
ERRO[2018-09-06T15:13:22Z] Could not reach 'watchprimary' (Attempt: 1)
INFO[2018-09-06T15:13:22Z] Executing pre-hook: /hooks/watch-pre-hook
INFO[2018-09-06T15:13:22Z] Processing Failover: Strategy - latest
INFO[2018-09-06T15:13:22Z] Deleting existing primary...
INFO[2018-09-06T15:13:22Z] deleting deployment
WARN[2018-09-06T15:13:22Z] deployments.extensions "watchprimary" not found
INFO[2018-09-06T15:13:22Z] Deleted old primary
INFO[2018-09-06T15:13:22Z] Choosing failover replica...
INFO[2018-09-06T15:13:22Z] Chose failover target (watchreplica-56c48c7f4b-68fcb)
INFO[2018-09-06T15:13:22Z] Promoting failover replica...
DEBU[2018-09-06T15:13:22Z] executing cmd: [/opt/cpm/bin/promote.sh] on pod watchreplica-56c48c7f4b-68fcb in namespace demo container: postgres
INFO[2018-09-06T15:13:22Z] Relabeling failover replica...
.
.
.
INFO[2018-09-06T15:14:28Z] Health Checking: 'watchprimary'
INFO[2018-09-06T15:14:28Z] Successfully reached 'watchprimary'

To clean up the example:

cd $WATCH_ROOT/examples/sample-deployments
./cleanup.sh
cd $WATCH_ROOT/examples/kube-deployments
./cleanup.sh

The examples on OpenShift require the pg-watcher Service Account to have special privileges; see the run.sh script for the 'oc adm' commands required to grant those privileges. Customize these privileges for your local requirements.