Unable to get backup status (success or failure) from Prometheus metrics #589

Closed
AlexisDemeaulte opened this issue Jan 4, 2023 · 2 comments

@AlexisDemeaulte

I would like to set up monitoring of my backups, especially to detect when backups are failing. As far as I can see, the best way to do that is to use the clickhouse_backup_failed_* Prometheus metrics, but their values stay at 0. Maybe I am not testing failures correctly, but the same is true for the clickhouse_backup_successful_* metrics.

This is how I ran my test:

Start the server

sudo ./build/linux/amd64/clickhouse-backup -c /etc/clickhouse-backup-oss.yaml server

Grab the metrics

curl localhost:7171/metrics|grep -e '^clickhouse_' > metrics-01

Create a backup

curl 'http://localhost:7171/backup/create?table=default.products&name=products_20230104' -X POST

Grab the metrics again

curl localhost:7171/metrics|grep -e '^clickhouse_' > metrics-02

Compare metrics

diff metrics-01 metrics-02

The result was:

< clickhouse_backup_last_backup_size_local 0
---
> clickhouse_backup_last_backup_size_local 1627
10,11c10,11
< clickhouse_backup_last_create_duration 0
< clickhouse_backup_last_create_finish 1.672828537e+09
---
> clickhouse_backup_last_create_duration 8.1907188e+07
> clickhouse_backup_last_create_finish 1.672838838e+09
16c16
< clickhouse_backup_last_create_start 0
---
> clickhouse_backup_last_create_start 1.672838838e+09
38c38
< clickhouse_backup_number_backups_local 0

At this point I was expecting clickhouse_backup_successful_creates to be set to 1, but it stays at 0.
Do you have any idea why? Did I miss something or do something wrong?
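
For reference, the same check can be narrowed to a single counter instead of diffing the whole scrape. This is only a sketch: `metric_value` is a helper name made up here, and it assumes the metric is exposed without labels, as the lines in the diff above are.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one un-labelled metric
# from a file containing Prometheus text-format output.
# Usage: metric_value METRIC_NAME FILE
metric_value() {
    # Match only lines whose first field is exactly the metric name.
    # This skips "# HELP" / "# TYPE" lines and similarly named metrics.
    awk -v name="$1" '$1 == name { print $2; exit }' "$2"
}

# Example against the running server (port 7171 as in the steps above):
#   curl -s localhost:7171/metrics > metrics-02
#   metric_value clickhouse_backup_successful_creates metrics-02
```

With the behaviour described in this issue, the call in the example prints 0 even after a backup has been created.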

Clickhouse-server's version is 21.3.13.9
Clickhouse-backup's version is 2.1.3

Thanks for your help

@Slach
Collaborator

Slach commented Jan 4, 2023

Looks like you found a bug.
Thanks a lot for reporting, I will fix it ASAP.

You can try to adopt the following Prometheus rules:
https://github.com/Altinity/clickhouse-operator/blob/master/deploy/prometheus/prometheus-alert-rules-backup.yaml
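
While the success and failure counters stay at 0, one possible stopgap is to watch how old clickhouse_backup_last_create_finish is, since the diff above shows that metric changing after a backup. The sketch below is not taken from the linked rules file. `backup_is_stale` is a hypothetical helper, the 24-hour threshold is arbitrary, and it assumes the timestamp only advances when a backup actually completes.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 when the last finished backup is older
# than MAX_AGE_SECONDS, exit 1 otherwise.
# Usage: backup_is_stale FINISH_EPOCH NOW_EPOCH MAX_AGE_SECONDS
backup_is_stale() {
    # awk parses the exponent notation seen in the metrics output
    # (for example 1.672838838e+09), which plain shell arithmetic cannot.
    awk -v finish="$1" -v now="$2" -v max="$3" \
        'BEGIN { exit !((now - finish) > max) }'
}

# Example (port 7171 as in the report, threshold of 24 hours is an assumption):
#   finish=$(curl -s localhost:7171/metrics |
#            awk '$1 == "clickhouse_backup_last_create_finish" { print $2 }')
#   backup_is_stale "$finish" "$(date +%s)" 86400 && echo "backup is stale"
```

This only detects that no backup has finished recently. It does not say why, so it complements the failure counters rather than replacing them.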

@Slach Slach self-assigned this Jan 4, 2023
@Slach Slach added this to the 2.2.0 milestone Jan 4, 2023
@AlexisDemeaulte
Author

Great, thanks for the feedback and the prometheus rules.

@Slach Slach closed this as completed in c74bbc3 Jan 6, 2023