Documentation/op-guide: fix failed RPC rate, leader election metrics
This fixes the failed RPC rate query, where we do not need
subtraction because we already query by the status code.
It also adds grpc_method to make the legend more specific. Most
of the time, a failure recovers within 10 seconds, which is our
Prometheus scrape interval, so the 'rate' query might not cover
that time window and show 0s, but it still shows up in the graph.
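
To illustrate why the subtraction is unnecessary, here is a minimal Python sketch of a simplified per-series rate over one window. The counter values, the `failed_rates` helper, and the method and code names are hypothetical, and the real Prometheus `rate()` also handles counter resets and extrapolation, which this sketch ignores.

```python
def failed_rates(start, end, window_seconds):
    """Simplified rate() per (grpc_method, grpc_code) series, skipping OK.

    start/end map (method, code) -> counter value at the window edges.
    """
    rates = {}
    for (method, code), end_value in end.items():
        if code == "OK":
            continue  # mirrors the grpc_code!="OK" label matcher
        increase = end_value - start.get((method, code), 0.0)
        rates[(method, code)] = increase / window_seconds
    return rates


# Hypothetical grpc_server_handled_total samples, 60 seconds apart.
handled_start = {("Range", "OK"): 100.0, ("Range", "Unavailable"): 2.0, ("Put", "OK"): 50.0}
handled_end = {("Range", "OK"): 160.0, ("Range", "Unavailable"): 8.0, ("Put", "OK"): 80.0}

rates = failed_rates(handled_start, handled_end, 60)

# The old expression subtracted the failed rate from the started rate,
# which lands near the success rate rather than the failure rate.
started_rate = 90.0 / 60  # hypothetical: 90 RPCs started in the window
old_value = started_rate - sum(rates.values())

print("failed rate per series:", rates)
print("old expression value:", old_value)
```

Keeping one series per method, instead of summing, is what lets the legend include `{{grpc_method}}`.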

The current leader election metric just displays the counter
vector, but if we want to 'Rate Leader Election', it should
query with the 'rate' function.
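
The difference between plotting the raw counter and a windowed change (the diff in this commit uses `delta(...[1m])`) can be sketched in Python as follows. The sample data and the `window_delta` helper are hypothetical, and the real Prometheus `delta()` extrapolates to the window boundaries, which this simplified version does not.

```python
def window_delta(samples, now, window_seconds):
    """Last-minus-first value of the samples inside [now - window, now].

    samples is a list of (timestamp_seconds, counter_value), sorted by time.
    Returns None when fewer than two samples fall inside the window.
    """
    in_window = [(t, v) for t, v in samples if now - window_seconds <= t <= now]
    if len(in_window) < 2:
        return None
    return in_window[-1][1] - in_window[0][1]


# Hypothetical scrapes every 10 seconds: one leader change is seen at t=120.
samples = [(t, 3.0 if t < 120 else 4.0) for t in range(0, 210, 10)]

raw_latest = samples[-1][1]                    # raw counter: stays at its total
shortly_after = window_delta(samples, 130, 60)  # window contains the change
much_later = window_delta(samples, 200, 60)     # window no longer contains it

print(raw_latest, shortly_after, much_later)
```

The raw counter only ever shows the running total, while the windowed value rises when a leader change falls inside the window and returns to zero afterwards.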

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
gyuho committed Jun 14, 2017
1 parent 750dc7f commit 5f396b1
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions Documentation/op-guide/grafana.json
@@ -123,9 +123,9 @@
"step": 2
},
{
-"expr": "sum(rate(grpc_server_started_total{grpc_type=\"unary\"} [1m])) - sum(rate(grpc_server_handled_total{grpc_type=\"unary\",grpc_code!=\"OK\"} [1m]))",
+"expr": "rate(grpc_server_handled_total{grpc_type=\"unary\",grpc_code!=\"OK\"}[1m])",
"intervalFactor": 2,
-"legendFormat": "{{instance}} RPC Failed Rate",
+"legendFormat": "{{instance}} {{grpc_method}} RPC Failed Rate",
"metric": "grpc_server_handled_total",
"refId": "B",
"step": 2
@@ -922,7 +922,7 @@
"stack": false,
"steppedLine": false,
"targets": [{
-"expr": "etcd_server_leader_changes_seen_total",
+"expr": "delta(etcd_server_leader_changes_seen_total[1m])",
"intervalFactor": 2,
"legendFormat": "{{instance}} Leader Change Seen",
"metric": "etcd_server_leader_changes_seen_total",
@@ -1009,4 +1009,4 @@
"version": 215,
"links": [],
"gnetId": null
-}
+}