Commit

feat: add alertmanager sink (#107)
* feat: alertmanager sink

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>

* feat: alertmanager sink

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>

* refactor: reduce VR update error verbosity

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>

* feat: alertmanager sink

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>

* docs: add screenshot to README.md

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>

* docs: clarify detail & failure annotations

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>

* feat: add basic auth support

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>

* feat: add TLS support

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>

* refactor: remove unused var from sink.Configure; add unit tests

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>

* test: increase coverage

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>

* test: increase coverage

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>

* refactor: misc. tidying

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>

* docs: update README

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>

* chore: log removal of path from alertmanager endpoint

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>

---------

Signed-off-by: Tyler Gillson <tyler.gillson@gmail.com>
TylerGillson committed Nov 15, 2023
1 parent 36ce4a1 commit 855e70e
Showing 13 changed files with 442 additions and 12 deletions.
2 changes: 1 addition & 1 deletion .vscode/launch.json
@@ -31,6 +31,6 @@
// "KUBECONFIG": "",
"KUBEBUILDER_ASSETS": "/Users/tylergillson/spectrocloud/repos/oss/spectrocloud-labs/validation/validator/bin/k8s/1.27.1-darwin-arm64"
}
},
},
]
}
80 changes: 76 additions & 4 deletions README.md
@@ -23,22 +23,94 @@ Plugins:

## Installation
```bash
helm repo add validator https://spectrocloud-labs.github.io/validator/
helm repo add validator https://spectrocloud-labs.github.io/validator
helm repo update
helm install validator validator/validator -n validator --create-namespace
```

## Sinks
Validator can be configured to emit updates to various event sinks whenever a `ValidationResult` is created or updated. See configuration details below for each supported sink.

### Alertmanager
Integrate with the Alertmanager API to emit alerts to all [supported Alertmanager receivers](https://prometheus.io/docs/alerting/latest/configuration/#receiver-integration-settings), including generic webhooks. The only required configuration is an Alertmanager endpoint. HTTP basic authentication and TLS are also supported. See [values.yaml](https://github.com/spectrocloud-labs/validator/blob/main/chart/validator/values.yaml) for configuration details.

#### Sample Output
![Screen Shot 2023-11-15 at 10 42 20 AM](https://github.com/spectrocloud-labs/validator/assets/1795270/ce958b8e-96d7-4f5e-8efc-80e2fc2b0b4d)

#### Setup
1. Install Alertmanager in your cluster (if it isn't installed already)
2. Configure Alertmanager alert content. Alerts can be formatted/customized via the following labels and annotations:

Labels
- alertname
- plugin
- validation_result
- expected_results

Annotations
- state
- validation_rule
- validation_type
- message
- status
- detail
  - pipe-delimited array of detail messages, see sample config for parsing example
- failure (also pipe-delimited)
- last_validation_time

Example Alertmanager ConfigMap used to produce the sample output above:
```yaml
apiVersion: v1
data:
  alertmanager.yml: |
    global:
      slack_api_url: https://slack.com/api/chat.postMessage
    receivers:
    - name: default-receiver
      slack_configs:
      - channel: <channel-id>
        text: |-
          {{ range .Alerts.Firing -}}
          *Validation Result: {{ .Labels.validation_result }}/{{ .Labels.expected_results }}*
          {{ range $k, $v := .Annotations }}
          {{- if $v }}*{{ $k | title }}*:
          {{- if match "\\|" $v }}
          - {{ reReplaceAll "\\|" "\n- " $v -}}
          {{- else }}
          {{- printf " %s" $v -}}
          {{- end }}
          {{- end }}
          {{ end }}
          {{ end }}
        title: "{{ (index .Alerts 0).Labels.plugin }}: {{ (index .Alerts 0).Labels.alertname }}\n"
        http_config:
          authorization:
            credentials: xoxb--<bot>-<token>
        send_resolved: false
    route:
      group_interval: 10s
      group_wait: 10s
      receiver: default-receiver
      repeat_interval: 1h
    templates:
    - /etc/alertmanager/*.tmpl
kind: ConfigMap
metadata:
  name: alertmanager
  namespace: alertmanager
```
3. Install validator and/or upgrade your validator Helm release, configuring `values.sink` accordingly.
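
The pipe-delimited `detail` and `failure` annotations listed above can also be split outside of Alertmanager templates, for example in a generic webhook receiver. The following is a minimal Go sketch, assuming individual messages never contain a literal `|`; `splitAnnotation` and the sample value are hypothetical and not part of validator:

```go
package main

import (
	"fmt"
	"strings"
)

// splitAnnotation splits a pipe-delimited annotation value (such as
// "detail" or "failure") into individual messages, dropping empty entries.
// Hypothetical helper for illustration; it is not part of validator.
func splitAnnotation(v string) []string {
	if v == "" {
		return nil
	}
	parts := strings.Split(v, "|")
	out := make([]string, 0, len(parts))
	for _, p := range parts {
		if p = strings.TrimSpace(p); p != "" {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// Made-up annotation value, for illustration only.
	detail := "first detail message|second detail message"
	for _, msg := range splitAnnotation(detail) {
		fmt.Println("-", msg)
	}
}
```

If a message could itself contain `|`, this approach would split it incorrectly, so a consumer may need a different delimiter strategy in that case.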

### Slack

#### Sample Output
<img width="704" alt="Screen Shot 2023-11-10 at 4 30 12 PM" src="https://github.com/spectrocloud-labs/validator/assets/1795270/c011143a-4d4b-4299-b88b-699188f4bda2">
<img width="700" alt="Screen Shot 2023-11-10 at 4 18 22 PM" src="https://github.com/spectrocloud-labs/validator/assets/1795270/9f2c4ab7-34d6-496a-9f60-68655a7ee3d6">

#### Setup

1. Go to https://api.slack.com/apps and click **Create New App**, then select **From scratch**. Pick an App Name and Slack Workspace, then click **Create App**.

<img src="https://github.com/spectrocloud-labs/validator/assets/1795270/58cbb5a0-12a4-4a83-a0dd-20ae87a8105d" width="500">
@@ -53,8 +125,8 @@ Validator can be configured to emit updates to various event sinks whenever a `V

4. Install validator and/or upgrade your validator Helm release, configuring `values.sink` accordingly.

## Getting Started
You’ll need a Kubernetes cluster to run against. You can use [KIND](https://sigs.k8s.io/kind) to get a local cluster for testing, or run against a remote cluster.
## Development
You’ll need a Kubernetes cluster to run against. You can use [kind](https://sigs.k8s.io/kind) to get a local cluster for testing, or run against a remote cluster.
**Note:** Your controller will automatically use the current context in your kubeconfig file (i.e. whatever cluster `kubectl cluster-info` shows).

### Running on the cluster
2 changes: 1 addition & 1 deletion api/v1alpha1/validatorconfig_types.go
@@ -28,7 +28,7 @@ type ValidatorConfigSpec struct {
}

type Sink struct {
// +kubebuilder:validation:Enum=slack
// +kubebuilder:validation:Enum=alertmanager;slack
Type string `json:"type"`
// Name of a K8s secret containing configuration details for the sink
SecretName string `json:"secretName"`
@@ -65,6 +65,7 @@ spec:
type: string
type:
enum:
- alertmanager
- slack
type: string
required:
7 changes: 6 additions & 1 deletion chart/validator/templates/sink-secret.yaml
@@ -4,7 +4,12 @@ kind: Secret
metadata:
name: {{ required ".Values.sink.secretName is required!" .Values.sink.secretName }}
stringData:
{{- if eq .Values.sink.type "slack" }}
{{- if eq .Values.sink.type "alertmanager" }}
endpoint: {{ required ".Values.sink.endpoint is required!" .Values.sink.endpoint }}
caCert: {{ .Values.sink.caCert }}
username: {{ .Values.sink.username }}
password: {{ .Values.sink.password }}
{{- else if eq .Values.sink.type "slack" }}
apiToken: {{ required ".Values.sink.apiToken is required!" .Values.sink.apiToken }}
channelId: {{ required ".Values.sink.channelId is required!" .Values.sink.channelId }}
{{- end }}
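
The `endpoint` value stored in this secret is normalized by the sink's `Configure` method in `internal/sinks/alertmanager.go`: a scheme and host are required, any path is dropped, and `/api/v2/alerts` is appended. A standalone Go sketch of that behavior follows; `normalizeEndpoint` is a hypothetical name used only for illustration, and, like the code in this commit, it does not strip query strings:

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// normalizeEndpoint mirrors the endpoint handling described above.
// Hypothetical helper; the commit performs these steps inline in Configure.
func normalizeEndpoint(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("failed to parse endpoint: %w", err)
	}
	if u.Scheme == "" || u.Host == "" {
		return "", errors.New("endpoint scheme and host are required")
	}
	// Drop any user-supplied path so the alerts API path can be appended.
	u.Path = ""
	return u.String() + "/api/v2/alerts", nil
}

func main() {
	for _, raw := range []string{
		"http://alertmanager.alertmanager.svc.cluster.local:9093/some/path",
		"localhost", // no scheme or host: rejected
	} {
		endpoint, err := normalizeEndpoint(raw)
		fmt.Printf("%q -> %q (err: %v)\n", raw, endpoint, err)
	}
}
```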
11 changes: 10 additions & 1 deletion chart/validator/values.yaml
@@ -54,10 +54,19 @@ metricsService:

# Optional sink configuration
sink: {}
# type: alertmanager
# secretName: alertmanager-sink-secret
# endpoint: "http://alertmanager.alertmanager.svc.cluster.local:9093"
# caCert: "" # (TLS CA certificate, optional)
# username: "" # (HTTP basic auth, optional)
# password: "" # (HTTP basic auth, optional)

# OR
# type: slack
# secretName: "slack-secret"
# secretName: slack-sink-secret
# apiToken: ""
# channelId: ""

# By default, a secret will be created. Leave the above fields blank and specify 'createSecret: false' to use an existing secret.
# WARNING: the existing secret must match the format used in sink-secret.yaml
# createSecret: true
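
The optional `caCert` value above is expected to be a PEM-encoded CA certificate. `x509.CertPool.AppendCertsFromPEM` returns `false` when no certificate could be parsed, so a caller may want to check that result. The sketch below shows one way to do so. It is an assumption-laden illustration rather than the code in this commit: `tlsConfigFromPEM` is a hypothetical helper, and it starts from an empty pool instead of the system pool, so only the supplied CA would be trusted.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// tlsConfigFromPEM builds a TLS config that trusts the given PEM bundle.
// Hypothetical helper for illustration; not part of validator.
func tlsConfigFromPEM(caCert []byte) (*tls.Config, error) {
	cfg := &tls.Config{MinVersion: tls.VersionTLS12}
	if len(caCert) == 0 {
		// A nil RootCAs makes crypto/tls use the host's root CA set.
		return cfg, nil
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caCert) {
		return nil, errors.New("caCert contains no valid PEM certificates")
	}
	cfg.RootCAs = pool
	return cfg, nil
}

func main() {
	// Deliberately invalid input to show the error path.
	_, err := tlsConfigFromPEM([]byte("not a certificate"))
	fmt.Println("error:", err)
}
```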
@@ -65,6 +65,7 @@ spec:
type: string
type:
enum:
- alertmanager
- slack
type: string
required:
4 changes: 2 additions & 2 deletions internal/controller/validationresult_controller.go
@@ -132,7 +132,7 @@ func (r *ValidationResultReconciler) Reconcile(ctx context.Context, req ctrl.Req
sinkConfig = sinkSecret.Data
}

if err := sink.Configure(*r.SinkClient, *vc, sinkConfig); err != nil {
if err := sink.Configure(*r.SinkClient, sinkConfig); err != nil {
r.Log.Error(err, "failed to configure sink")
return ctrl.Result{}, err
}
@@ -177,7 +177,7 @@ func (r *ValidationResultReconciler) updateStatus(ctx context.Context) error {
vr.Status.SinkState = sinkState

if err := r.Status().Update(context.Background(), vr); err != nil {
r.Log.V(0).Error(err, "failed to update ValidationResult status")
r.Log.V(1).Info("warning: failed to update ValidationResult status", "error", err.Error())
return err
}

161 changes: 161 additions & 0 deletions internal/sinks/alertmanager.go
@@ -0,0 +1,161 @@
package sinks

import (
	"bytes"
	"crypto/tls"
	"crypto/x509"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strconv"
	"strings"

	"github.com/go-logr/logr"
	"github.com/pkg/errors"

	"github.com/spectrocloud-labs/validator/api/v1alpha1"
)

type AlertmanagerSink struct {
	client Client
	log    logr.Logger

	endpoint string
	username string
	password string
}

type Alert struct {
	Annotations map[string]string `json:"annotations"`
	Labels      map[string]string `json:"labels"`
}

var (
	InvalidEndpoint  = errors.New("invalid Alertmanager config: endpoint scheme and host are required")
	EndpointRequired = errors.New("invalid Alertmanager config: endpoint required")
)

func (s *AlertmanagerSink) Configure(c Client, config map[string][]byte) error {
	// endpoint
	endpoint, ok := config["endpoint"]
	if !ok {
		return EndpointRequired
	}
	u, err := url.Parse(string(endpoint))
	if err != nil {
		return errors.Wrap(err, "invalid Alertmanager config: failed to parse endpoint")
	}
	if u.Scheme == "" || u.Host == "" {
		return InvalidEndpoint
	}
	if u.Path != "" {
		s.log.V(1).Info("stripping path from Alertmanager endpoint", "path", u.Path)
		u.Path = ""
	}
	s.endpoint = fmt.Sprintf("%s/api/v2/alerts", u.String())

	// basic auth
	s.username = string(config["username"])
	s.password = string(config["password"])

	// tls
	var caCertPool *x509.CertPool
	var insecureSkipVerify bool

	insecure, ok := config["insecureSkipVerify"]
	if ok {
		insecureSkipVerify, err = strconv.ParseBool(string(insecure))
		if err != nil {
			return errors.Wrap(err, "invalid Alertmanager config: failed to parse insecureSkipVerify")
		}
	}
	caCert, ok := config["caCert"]
	if ok {
		caCertPool, err = x509.SystemCertPool()
		if err != nil {
			return errors.Wrap(err, "invalid Alertmanager config: failed to get system cert pool")
		}
		caCertPool.AppendCertsFromPEM(caCert)
	}

	c.hclient.Transport = &http.Transport{
		TLSClientConfig: &tls.Config{
			InsecureSkipVerify: insecureSkipVerify,
			MinVersion:         tls.VersionTLS12,
			RootCAs:            caCertPool,
		},
	}
	s.client = c

	return nil
}

func (s *AlertmanagerSink) Emit(r v1alpha1.ValidationResult) error {
	alerts := make([]Alert, 0, len(r.Status.Conditions))

	for i, c := range r.Status.Conditions {
		alerts = append(alerts, Alert{
			Labels: map[string]string{
				"alertname":         r.Name,
				"plugin":            r.Spec.Plugin,
				"validation_result": strconv.Itoa(i + 1),
				"expected_results":  strconv.Itoa(r.Spec.ExpectedResults),
			},
			Annotations: map[string]string{
				"state":                string(r.Status.State),
				"validation_rule":      c.ValidationRule,
				"validation_type":      c.ValidationType,
				"message":              c.Message,
				"status":               string(c.Status),
				"detail":               strings.Join(c.Details, "|"),
				"failure":              strings.Join(c.Failures, "|"),
				"last_validation_time": c.LastValidationTime.String(),
			},
		})
	}

	body, err := json.Marshal(alerts)
	if err != nil {
		s.log.Error(err, "failed to marshal alerts", "alerts", alerts)
		return err
	}
	s.log.V(1).Info("Alertmanager message", "payload", body)

	req, err := http.NewRequest(http.MethodPost, s.endpoint, bytes.NewReader(body))
	if err != nil {
		s.log.Error(err, "failed to create HTTP POST request", "endpoint", s.endpoint)
		return err
	}
	req.Header.Add("Content-Type", "application/json")

	if s.username != "" && s.password != "" {
		req.Header.Add(basicAuthHeader(s.username, s.password))
	}

	resp, err := s.client.hclient.Do(req)
	defer func() {
		if resp != nil {
			_ = resp.Body.Close()
		}
	}()
	if err != nil {
		s.log.Error(err, "failed to post alert", "endpoint", s.endpoint)
		return err
	}
	if resp.StatusCode != 200 {
		s.log.V(0).Info("failed to post alert", "endpoint", s.endpoint, "status", resp.Status, "code", resp.StatusCode)
		return SinkEmissionFailed
	}

	s.log.V(0).Info("Successfully posted alert to Alertmanager", "endpoint", s.endpoint, "status", resp.Status, "code", resp.StatusCode)
	return nil
}

func basicAuthHeader(username, password string) (string, string) {
	auth := base64.StdEncoding.EncodeToString(
		bytes.Join([][]byte{[]byte(username), []byte(password)}, []byte(":")),
	)
	return "Authorization", fmt.Sprintf("Basic %s", auth)
}
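
For comparison, the header built by `basicAuthHeader` has the same form as the one the standard library's `(*http.Request).SetBasicAuth` sets. The following standalone sketch checks that equivalence; `basicAuthValue`, `matchesSetBasicAuth`, and the `example.invalid` URL are hypothetical names used only for illustration, and no request is actually sent:

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/http"
)

// basicAuthValue builds the Authorization header value by hand, in the same
// way as basicAuthHeader above. Hypothetical helper for illustration.
func basicAuthValue(username, password string) string {
	return "Basic " + base64.StdEncoding.EncodeToString([]byte(username+":"+password))
}

// matchesSetBasicAuth reports whether the hand-built value equals the header
// that the standard library's SetBasicAuth sets on a request.
func matchesSetBasicAuth(username, password string) bool {
	req, err := http.NewRequest(http.MethodPost, "http://example.invalid/api/v2/alerts", nil)
	if err != nil {
		return false
	}
	req.SetBasicAuth(username, password)
	return req.Header.Get("Authorization") == basicAuthValue(username, password)
}

func main() {
	fmt.Println(matchesSetBasicAuth("user", "pass")) // prints "true"
}
```

Using `SetBasicAuth` would be one way to drop the hand-rolled helper, though either approach yields the same header.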