healthgroup
is a tool to check the health of services with an auxiliary group of health checks.
If one of the checks in a group fails, healthgroup returns the 503
status code.
Use cases where healthgroup
can be helpful are up to your imagination :) The typical use case is a situation where you want to make an HTTP(S) Load Balancer aware of a service's health that is not placed directly in the backend, and between the LB and the service is a hop, e.g. proxy.
- healthgroup
Usage of healthgroup:
--config string config file (default is $HOME/.healthgroup.yaml)
--configmap string ConfigMap with the configuration file, e.g. namespace/configmap
--in-cluster use in-cluster config. Use always in a case when the app is running on a Kubernetes cluster
--kubeconfig string absolute path to the kubeconfig file (default "/Users/tczekajlo/.kube/config")
healthgroup --config ./test/testdata/config.yaml
[...]
{"level":"info","ts":1681571453.370432,"caller":"healthcheck/healthcheck.go:133","msg":"external health check","request_id":"131d2a2d-8e75-4926-8776-fe2e0470ce31","url":"https://google.com","status":200}
{"level":"info","ts":1681571453.3706791,"caller":"logger/logger.go:70","msg":"request","request_id":"131d2a2d-8e75-4926-8776-fe2e0470ce31","duration_ms":319,"status":200,"url":"/health/kubernetes/default/kubernetes","agent":"curl/7.86.0","ip":"127.0.0.1","method":"GET"}
In this section, you can learn how to configure healthgroup
. The configuration is read in the following order:
- environment variables
- configuration file
The environment variables always take precedence over the configuration file.
Variable | Description |
---|---|
HG_CONSUL_TIMEOUT |
Timeout specifies a time limit for requests made to the Consul server. A Timeout of zero means no timeout. |
HG_CONSUL_ADDRESS |
This is the address of the Consul server. |
HG_CONSUL_SCHEME |
This is the URI scheme for the Consul server. |
HG_CONSUL_TOKEN |
This is the API access token required when access control lists (ACLs) are enabled. |
HG_CONSUL_INSECURE_SKIP_VERIFY |
This is a boolean value (default true ) to specify SSL certificate verification; setting this value to false is not recommended for production use. |
HG_CONSUL_KEY_FILE |
Path to a client key file to use for TLS. |
HG_CONSUL_CERT_FILE |
Path to a client cert file to use for TLS. |
HG_CONSUL_CA_FILE |
Path to a CA file to use for TLS when communicating with Consul. |
HG_LOG_LEVEL |
Defines log level. |
HG_CONSUL_ENABLED |
Defines if requests to Consul should be enabled. |
HG_KUBERNETES_ENABLED |
Defines if request to Kubernetes should be enabled. |
HG_CONCURRENCY |
Defines how many health checks can be executed in parallel. |
HG_SERVER_PORT |
Defines a port on which to listen to. |
HG_SERVER_ADDRESS |
Defines a bind address of the server. |
In this section you can find a configuration file with default values. The configuration file is read from the $HOME/.healthgroup.yaml
location by default. You can use the --config
flag to define a path to the configuration file.
Under the configs/config.yaml
path, you can find an example of the configuration file with all parameters.
# example
server:
address: 127.0.0.1
port: 8080
concurrency: 5
kubernetes:
enabled: true
httpHealthCheck:
- timeout: 3s
type: https
host: google.com
Parameter | Description | Type | Default |
---|---|---|---|
server.port |
Defines a port on which to listen to | int |
8080 |
server.idleTimeout |
The maximum amount of time to wait for the next request (when keep-alive is enabled) | string |
5s |
server.address |
Settings bind address | string |
0.0.0.0 |
kubernetes.enabled |
Defines if Kubernetes discovery service should be enabled | bool |
true |
httpHealthCheck |
Defines auxiliary HTTP(S) health checks | httpHealthCheck[] |
[] |
consul.token |
The API access token | string |
"" |
consul.timeout |
Timeout specifies a time limit for requests made to the Consul server. A Timeout of zero means no timeout, e.g. 2s , 30s , 1h |
string |
2s |
consul.scheme |
The URI scheme for the Consul server (available: http | https ) |
string |
http |
consul.keyFile |
Path to a client key file to use for TLS | string |
"" |
consul.insecureSkipVerify |
Specify SSL certificate verification | bool |
false |
consul.enabled |
Defines if Consul discovery service should be enabled | bool |
false |
consul.certFile |
Path to a client cert file to use for TLS | string |
"" |
consul.caFile |
Path to a CA file to use for TLS when communicating with Consul | string |
"" |
consul.address |
The address of the Consul server | string |
127.0.0.1:8500 |
concurrency |
Defines how many health checks can be executed in parallel per request | int |
5 |
Health check parameter | Description | Type | Default |
---|---|---|---|
host |
Address of the target to which the probe should connect | string |
"" |
insecureSkipVerify |
Whether to verify SSL certificate | bool |
false |
namespace |
The namespace name that the check should be group with | string |
"" |
port |
Port of the target host | int |
null |
requestPath |
The request path to which the probe should connect | string |
/ |
service |
The service name that the check should be group with | string |
"" |
timeout |
Timeout specifies a time limit for requests made to the Consul server. A Timeout of zero means no timeout | string |
0s |
type |
Type of the check, available: http , https , http2 |
string |
http |
discovery |
Specifies a discovery service for which the check should be executed, available: consul , kubernetes |
string |
"" |
It's possible to read the configuration file directly from a Kubernetes ConfigMap. The configs/configmap-healthgroup.yaml
file shows an example of ConfigMap that includes the configuration file for healgroup. The configuration file has to be passed under the config.yaml
key.
Using the --configmap
flag you can define which ConfigMap should be used. The flag value has the following syntax: namespace/configmap_name
.
By default, all auxiliary checks are executed along with the main one (every time when the /health/*
endpoint is called). If you want to assign a particular check to a given service, namespace, or discovery, you can do it by defining the service
or/and namespace
, or/and discovery
parameters in the health check specification.
For instance, if you'd like to check health of the https://google.com
every time when you check the health of the mytest
Kubernetes service that is located in the staging
namespace. It's what the configuration would look like:
httpHealthCheck:
- type: https
host: google.com
namespace: staging
service: mytest
discovery: kubernetes
If one of the grouping parameters is omitted, it works like a wildcard.
httpHealthCheck:
# Execute the health check if the service is a Kubernetes service,
# located in the staging namespace, and the name of the service is `mytest`.
- type: https
host: google.com
namespace: staging
service: mytest
discovery: kubernetes
# Execute the health check if the service name is `mytest`.
- type: https
host: example.com
service: mytest
# Execute the health check every time.
- type: https
host: github.com
Below you can find a list of endpoints supported by healthgroup
.
The Kubernetes endpoint checks the status of a Kubernetes service. If the Kubernetes services don't have any healthy endpoints then the endpoint returns the 503
status code.
If any of the auxiliary health checks failed, the endpoint returns the 503
status code.
Method | Path | Produces |
---|---|---|
GET |
/health/kubernetes/:namespace/:service |
application/json |
namespace
(string: <required>)
- Specifies the name of the namespace where the Kubernetes service is located.service
(string: <required>
- Specifies the name of the Kubernetes service.
curl -i http://localhost:8080/health/kubernetes/default/kubernetes
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 53
X-Request-Id: 88ab63e5-5cbb-49eb-9840-e855e3fab1d6
{
"success": true,
"message": "all health checks passed"
}
The Consul endpoint checks the status of a Consul service. If the Consul services return zero healthy instances then the endpoint returns the 503
status code.
If any of the auxiliary health checks failed, the endpoint returns the 503
status code.
Method | Path | Produces |
---|---|---|
GET |
/health/consul/:service |
application/json |
GET |
/health/consul/:namespace/:service |
application/json |
namespace
(string: <optional>)
- Specifies the name of the namespace where the Consul service is located. The parameter works only with Consul Enterprise.service
(string: <required>)
- Specifies the name of the Consul service.
tag
(string: "")
- Specifies the tag to filter the list of instances for a given service.
curl -i http://127.0.0.1:8080/health/consul/redis
HTTP/1.1 503 Service Unavailable
Content-Type: application/json
Content-Length: 52
X-Request-Id: 84ab63e5-5cbb-49eb-9840-e855e3fab1d6
{
"success": false,
"message": "Service is not healthy"
}
Run linting
make lint
Run tests
make test
Whenever you need help regarding the available actions, use the make help
command.