Fix missing monitoring link on main docs page, add section about scaling persistent volume claims for ingesters (#601)
mdisibio authored Mar 18, 2021
1 parent 59d3f97 commit c4d9771
Showing 2 changed files with 29 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/tempo/website/_index.md
@@ -13,6 +13,7 @@ Grafana Tempo is an open source, easy-to-use and high-volume distributed tracing

- [Getting Started](getting-started/)
- [Configuration](configuration/)
- [Monitoring](monitoring/)
- [Integration Guides/Trace Discovery](guides/)
- [Tempo CLI](cli/)
- [Architecture](architecture/)
28 changes: 28 additions & 0 deletions docs/tempo/website/monitoring/_index.md
@@ -66,3 +66,31 @@ To set up alerting, download the provided json files and configure them for use

Check the [runbook](https://github.com/grafana/tempo/blob/master/operations/tempo-mixin/runbook.md) to understand the
various steps that can be taken to fix firing alerts!

## Important Metrics

### Free Disk Space
Tempo ingesters make heavy use of local disks to store write-ahead logs and blocks before they are flushed to the backend (GCS/S3/etc). It is important to monitor free space on these volumes, as full disks can lead to data loss and other errors. The amount of available disk space determines how much trace volume an ingester can process and how long a backend outage can be tolerated.
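
As a rough illustration of such a check, the sketch below reads the used-space percentage from `df -P` and compares it to a threshold. The function names are hypothetical and not part of Tempo; in practice this signal would more likely come from your existing metrics and alerting stack.

```shell
# Print the used-space percentage (as an integer) of the filesystem
# that holds the path given as $1. Relies on the POSIX `df -P` layout,
# where column 5 of the second line is the capacity, e.g. "42%".
disk_used_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed (exit 0) if usage of the filesystem holding $1 is at or above
# the threshold percentage given as $2; fail (exit 1) otherwise.
disk_nearly_full() {
  [ "$(disk_used_pct "$1")" -ge "$2" ]
}
```

For example, `disk_nearly_full <data-path> 80` could be run inside an ingester container, where `<data-path>` is whatever directory your deployment mounts the ingester volume at.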

It may be necessary to increase the disk space for ingesters as usage grows. When ingesters are deployed as a StatefulSet with Persistent Volume Claims (PVCs), some manual steps are required. The following has worked successfully on GKE with GCS:

1. Edit the persistent volume claim (PVC) for each ingester to the new size. This requires a StorageClass that allows volume expansion (`allowVolumeExpansion: true`).

```
kubectl patch pvc -n <namespace> -p '{"spec": {"resources": {"requests": {"storage": "15Gi"}}}}' <pvc-name>
```

Check that all volumes have been resized by running:

`kubectl get pvc -n <namespace>`

A restart is not necessary, as the pods automatically detect the increased disk space.

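2. Delete the StatefulSet but leave the pods running. The volume claim template of an existing StatefulSet cannot be edited in place, so the StatefulSet has to be deleted and recreated:

`kubectl delete sts --cascade=orphan -n <namespace> ingester`

On kubectl versions older than 1.20, use `--cascade=false` instead.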

3. Edit and recreate the StatefulSet with the new size, so that newly created pods also get the larger volume. There are many ways to deploy Tempo to Kubernetes; these are examples for the popular ones:
* Raw YAML: `kubectl apply -f <something>.yaml`
* Helm: `helm upgrade ... tempo ...`
* Tanka: `tk apply ...`
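
Steps 1 and 2 can be scripted when there are many ingesters. The following is a minimal sketch, not an official tool. It assumes PVC names follow the usual StatefulSet pattern `<claim-template>-<statefulset>-<ordinal>`; the defaults for `NAMESPACE`, `PVC_PREFIX`, `STS_NAME`, and `REPLICAS` are hypothetical and must be adjusted to your deployment. By default it only prints the commands (dry run), and the printed form drops the shell quoting around the JSON patch.

```shell
set -eu

# Hypothetical defaults; override through the environment.
NAMESPACE="${NAMESPACE:-tempo}"
NEW_SIZE="${NEW_SIZE:-15Gi}"
REPLICAS="${REPLICAS:-3}"
STS_NAME="${STS_NAME:-ingester}"
PVC_PREFIX="${PVC_PREFIX:-ingester-data-ingester}"
# Dry run by default: commands are echoed instead of executed.
# Set KUBECTL=kubectl to run them for real.
KUBECTL="${KUBECTL:-echo kubectl}"

# Build the JSON patch that requests the storage size given as $1.
build_patch() {
  printf '{"spec":{"resources":{"requests":{"storage":"%s"}}}}' "$1"
}

# Step 1: patch the PVC of every ingester replica to NEW_SIZE.
resize_pvcs() {
  i=0
  while [ "$i" -lt "$REPLICAS" ]; do
    $KUBECTL patch pvc -n "$NAMESPACE" "${PVC_PREFIX}-${i}" \
      -p "$(build_patch "$NEW_SIZE")"
    i=$((i + 1))
  done
}

# Step 2: delete the StatefulSet object but leave its pods running.
orphan_statefulset() {
  $KUBECTL delete sts --cascade=orphan -n "$NAMESPACE" "$STS_NAME"
}
```

Calling `resize_pvcs` and then `orphan_statefulset` prints the commands for review; rerunning with `KUBECTL=kubectl` applies them. Step 3 is left out because it depends on how Tempo was deployed.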
