Added link to USE method and listed each term of USE
Vanlightly committed Oct 18, 2021
1 parent 5a0f67d commit 8d9baab
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion site/bps/BP-44-use-metrics.md
@@ -6,7 +6,7 @@ release: "N/A"
---

### Motivation
-Based on our experience (at Splunk) running many BookKeeper clusters in production, from very small to very large deployments (in terms of number of bookies, size of VMs and load) we have identified a number of short-comings with the current BookKeeper metrics that make it harder than it should be to identify bottlenecks in performance. The USE method is an effective strategy for diagnosing where bottlenecks in a system lie but the current metrics do not always expose metrics related to utilization and saturation. Also, even if you have a good mental model for how BookKeeper works internally, there are blindspots in the metrics that make it difficult to know what is happening at times.
+Based on our experience (at Splunk) running many BookKeeper clusters in production, from very small to very large deployments (in terms of number of bookies, size of VMs and load) we have identified a number of short-comings with the current BookKeeper metrics that make it harder than it should be to identify bottlenecks in performance. The [USE method](https://www.brendangregg.com/usemethod.html) (Utilization, Saturation, Errors) is an effective strategy for diagnosing where bottlenecks in a system lie but the current metrics do not always expose metrics related to utilization and saturation. Also, even if you have a good mental model for how BookKeeper works internally, there are blindspots in the metrics that make it difficult to know what is happening at times.

Finally, many of the metrics are aggregated, such as journal and DbLedgerStorage. When these components are configured with multiple directories, it is currently not possible to inspect the metrics of only a single journal or DbLedgerStorage instance. One bad volume can be hard to identify.
