Documentation/op-guide: Add rules for Prometheus 2.0 #8848

Merged: 1 commit into etcd-io:master from prom-2.0-rules on Nov 13, 2017

Contributor

@brancz brancz commented Nov 10, 2017

This adds the existing Prometheus alerting rules in the new Prometheus 2.0 rule syntax.

@xiang90 @gyuho
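
For context, Prometheus 2.0 rule files are YAML documents with rules nested under `groups`, replacing the 1.x `ALERT ... IF ...` text syntax. A minimal sketch of what one converted rule might look like is below. The group name, the `for` duration, the `severity` label, the annotation text, and the `> 0` comparison are illustrative assumptions, not values taken from this PR.

```yaml
groups:
- name: etcd3_alert.rules   # hypothetical group name
  rules:
  - alert: HighNumberOfFailedGRPCRequests
    # Expression prefix as it appears in the reviewed diff; the "> 0"
    # comparison is an assumed placeholder threshold.
    expr: sum(rate(etcd_grpc_requests_failed_total{job="etcd"}[5m])) BY (grpc_method) > 0
    for: 10m                 # assumed duration
    labels:
      severity: warning      # assumed label
    annotations:
      summary: a high number of gRPC requests are failing   # assumed text
```

A file in this shape can be validated with `promtool check rules <file>` before it is loaded.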

Contributor

@gyuho gyuho left a comment

lgtm. Thanks!
/cc @xiang90

changes within the last hour
summary: a high number of leader changes within the etcd cluster are happening
- alert: HighNumberOfFailedGRPCRequests
expr: sum(rate(etcd_grpc_requests_failed_total{job="etcd"}[5m])) BY (grpc_method)
Contributor

@gyuho gyuho Nov 10, 2017

Actually, should we update this to our new go-grpc-prometheus metrics?
Reference: #8802

/cc @xiang90

Contributor Author

Yes, I was going to do that in a subsequent PR, happy to do it here as well though.

Contributor

> Yes, I was going to do that in a subsequent PR

Ok sounds good.
Can you file another PR?

Thanks!

Contributor Author

Will do.
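
As a rough sketch of what the follow-up discussed in this thread might involve: go-grpc-prometheus exposes server-side counters such as `grpc_server_handled_total`, labeled with `grpc_service`, `grpc_method`, and `grpc_code`. Assuming etcd exposes these under `job="etcd"`, a failure-ratio rule could look like the following. The 1% threshold and the grouping labels are assumptions, and the actual follow-up PR may differ.

```yaml
- alert: HighNumberOfFailedGRPCRequests
  # Ratio of non-OK responses to all handled requests, per service and method.
  # The 0.01 threshold is an assumption for illustration.
  expr: |
    sum(rate(grpc_server_handled_total{job="etcd",grpc_code!="OK"}[5m])) BY (grpc_service, grpc_method)
      / sum(rate(grpc_server_handled_total{job="etcd"}[5m])) BY (grpc_service, grpc_method)
      > 0.01
```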

@gyuho gyuho merged commit adeb1fb into etcd-io:master Nov 13, 2017
@@ -100,7 +100,7 @@ Now Prometheus will scrape etcd metrics every 10 seconds.

### Alerting

-There is a [set of default alerts for etcd v3 clusters](./etcd3_alert.rules).
+There is a set of default alerts for etcd v3 clusters for [Prometheus 1.x](./etcd3_alert.rules) as well as [Prometheus 2.x](./etcd3_alert.rules).
Contributor

One of the links is wrong: both point to the same file now. Should the 2.x one have a `.yml` suffix?

/cc @gyuho @brancz

Contributor Author

You're absolutely right. I'll fix it with the follow-up PR!

Contributor

Right,

@brancz when you fix the alert rules for #8802, can we also change `[Prometheus 2.x](./etcd3_alert.rules)` to `[Prometheus 2.x](./etcd3_alert.rules.yml)`?

Thanks!

@brancz brancz deleted the prom-2.0-rules branch November 13, 2017 17:26