Add support for additional metrics on Linux in zfs input (#3565)
richardelling authored and danielnelson committed Jan 4, 2018
1 parent 1ea8d64 commit f13afea
Showing 4 changed files with 261 additions and 45 deletions.
77 changes: 56 additions & 21 deletions plugins/inputs/zfs/README.md
@@ -13,17 +13,21 @@ from `sysctl` and `zpool` on FreeBSD.
# kstatPath = "/proc/spl/kstat/zfs"

## By default, telegraf gather all zfs stats
-## If not specified, then default is:
+## Override the stats list using the kstatMetrics array:
+## For FreeBSD, the default is:
# kstatMetrics = ["arcstats", "zfetchstats", "vdev_cache_stats"]
+## For Linux, the default is:
+# kstatMetrics = ["abdstats", "arcstats", "dnodestats", "dbufcachestats",
+# "dmu_tx", "fm", "vdev_mirror_stats", "zfetchstats", "zil"]

## By default, don't gather zpool stats
# poolMetrics = false
```

### Measurements & Fields:

-By default this plugin collects metrics about **Arc**, **Zfetch**, and
-**Vdev cache**. All these metrics are either counters or measure sizes
+By default this plugin collects metrics about ZFS internals and pool.
+These metrics are either counters or measure sizes
in bytes. These metrics will be in the `zfs` measurement with the field
names listed below.

@@ -33,7 +37,7 @@ each pool.
- zfs
With fields listed below.

-#### Arc Stats
+#### ARC Stats (FreeBSD and Linux)

- arcstats_allocated (FreeBSD only)
- arcstats_anon_evict_data (Linux only)
@@ -153,7 +157,7 @@ each pool.
- arcstats_size
- arcstats_sync_wait_for_async (FreeBSD only)

-#### Zfetch Stats
+#### Zfetch Stats (FreeBSD and Linux)

- zfetchstats_bogus_streams (Linux only)
- zfetchstats_colinear_hits (Linux only)
@@ -168,29 +172,29 @@ each pool.
- zfetchstats_stride_hits (Linux only)
- zfetchstats_stride_misses (Linux only)

-#### Vdev Cache Stats
+#### Vdev Cache Stats (FreeBSD)

- vdev_cache_stats_delegations
- vdev_cache_stats_hits
- vdev_cache_stats_misses

#### Pool Metrics (optional)

-On Linux:
+On Linux (reference: kstat accumulated time and queue length statistics):

- zfs_pool
-    - nread (integer, )
-    - nwritten (integer, )
-    - reads (integer, )
-    - writes (integer, )
-    - wtime (integer, )
-    - wlentime (integer, )
-    - wupdate (integer, )
-    - rtime (integer, )
-    - rlentime (integer, )
-    - rupdate (integer, )
-    - wcnt (integer, )
-    - rcnt (integer, )
+    - nread (integer, bytes)
+    - nwritten (integer, bytes)
+    - reads (integer, count)
+    - writes (integer, count)
+    - wtime (integer, nanoseconds)
+    - wlentime (integer, queuelength * nanoseconds)
+    - wupdate (integer, timestamp)
+    - rtime (integer, nanoseconds)
+    - rlentime (integer, queuelength * nanoseconds)
+    - rupdate (integer, timestamp)
+    - wcnt (integer, count)
+    - rcnt (integer, count)
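
The units listed for `rtime` and `rlentime` suggest how the pool fields can be turned into rates. The following is a minimal sketch, assuming the usual kstat I/O semantics in which these fields are cumulative totals, so that the difference between two samples divided by the elapsed time gives an average over that interval. The type `poolSample` and the functions `avgRunQueueLen` and `busyFraction` are hypothetical names for illustration and are not part of the plugin.

```go
package main

import "fmt"

// poolSample holds two of the zfs_pool fields from one collection.
// Hypothetical type for illustration; the plugin itself emits plain fields.
type poolSample struct {
	RTime    int64 // rtime: cumulative run (service) time, nanoseconds
	RLenTime int64 // rlentime: cumulative queuelength * nanoseconds
}

// avgRunQueueLen estimates the average number of requests in the run
// queue between two samples taken elapsedNs nanoseconds apart.
// Assumption: rlentime accumulates (queue length * time), so its delta
// divided by elapsed time yields an average queue length.
func avgRunQueueLen(prev, cur poolSample, elapsedNs int64) float64 {
	if elapsedNs <= 0 {
		return 0
	}
	return float64(cur.RLenTime-prev.RLenTime) / float64(elapsedNs)
}

// busyFraction estimates the share of the interval during which at
// least one request was being serviced (delta rtime / elapsed time).
func busyFraction(prev, cur poolSample, elapsedNs int64) float64 {
	if elapsedNs <= 0 {
		return 0
	}
	return float64(cur.RTime-prev.RTime) / float64(elapsedNs)
}

func main() {
	prev := poolSample{RTime: 0, RLenTime: 0}
	cur := poolSample{RTime: 500000000, RLenTime: 2000000000}
	elapsed := int64(1000000000) // one second between samples

	fmt.Println("average run queue length:", avgRunQueueLen(prev, cur, elapsed))
	fmt.Println("busy fraction:", busyFraction(prev, cur, elapsed))
}
```

The same arithmetic would apply to `wtime` and `wlentime` for the wait queue, under the same assumption.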

On FreeBSD:

@@ -224,7 +228,7 @@ $ ./telegraf --config telegraf.conf --input-filter zfs --test

A short description for some of the metrics.

-#### Arc Stats
+#### ARC Stats

`arcstats_hits` Total amount of cache hits in the arc.

@@ -283,12 +287,43 @@ A short description for some of the metrics.

`zfetchstats_hits` Counts the number of cache hits, to items which are in the cache because of the prefetcher.

+`zfetchstats_misses` Counts the number of prefetch cache misses.
+
`zfetchstats_colinear_hits` Counts the number of cache hits, to items which are in the cache because of the prefetcher (prefetched linear reads)

`zfetchstats_stride_hits` Counts the number of cache hits, to items which are in the cache because of the prefetcher (prefetched stride reads)
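
Because `zfetchstats_hits` and `zfetchstats_misses` are counters, a prefetch hit ratio is only meaningful over an interval. The sketch below assumes both counters are cumulative and computes the ratio from two consecutive readings; the function name `hitRatio` is hypothetical and not something the plugin provides.

```go
package main

import "fmt"

// hitRatio returns hits / (hits + misses) for the interval between two
// readings of cumulative counters. The second return value is false
// when the ratio is undefined: no activity in the interval, or a
// counter that went backwards (for example after a reboot).
func hitRatio(prevHits, curHits, prevMisses, curMisses int64) (float64, bool) {
	dh := curHits - prevHits
	dm := curMisses - prevMisses
	if dh < 0 || dm < 0 || dh+dm == 0 {
		return 0, false
	}
	return float64(dh) / float64(dh+dm), true
}

func main() {
	// Illustrative readings of zfetchstats_hits and zfetchstats_misses.
	ratio, ok := hitRatio(100, 400, 50, 150)
	if ok {
		fmt.Println("prefetch hit ratio:", ratio)
	}
}
```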

-#### Vdev Cache Stats
+#### Vdev Cache Stats (FreeBSD only)
+note: the vdev cache is deprecated in some ZFS implementations

`vdev_cache_stats_hits` Hits to the vdev (device level) cache.

`vdev_cache_stats_misses` Misses to the vdev (device level) cache.

+#### ABD Stats (Linux Only)
+ABD is a linear/scatter dual typed buffer for ARC
+
+`abdstats_linear_cnt` number of linear ABDs which are currently allocated
+
+`abdstats_linear_data_size` amount of data stored in all linear ABDs
+
+`abdstats_scatter_cnt` number of scatter ABDs which are currently allocated
+
+`abdstats_scatter_data_size` amount of data stored in all scatter ABDs
+
+#### DMU Stats (Linux Only)
+
+`dmu_tx_dirty_throttle` counts when writes are throttled due to the amount of dirty data growing too large
+
+`dmu_tx_memory_reclaim` counts when memory is low and throttling activity
+
+`dmu_tx_memory_reserve` counts when memory footprint of the txg exceeds the ARC size
+
+#### Fault Management Ereport errors (Linux Only)
+
+`fm_erpt-dropped` counts when an error report cannot be created (eg available memory is too low)
+
+#### ZIL (Linux Only)
+note: ZIL measurements are system-wide, neither per-pool nor per-dataset
+
+`zil_commit_count` counts when ZFS transactions are committed to a ZIL
4 changes: 3 additions & 1 deletion plugins/inputs/zfs/zfs.go
@@ -19,7 +19,9 @@ var sampleConfig = `
## By default, telegraf gather all zfs stats
## If not specified, then default is:
# kstatMetrics = ["arcstats", "zfetchstats", "vdev_cache_stats"]
+## For Linux, the default is:
+# kstatMetrics = ["abdstats", "arcstats", "dnodestats", "dbufcachestats",
+# "dmu_tx", "fm", "vdev_mirror_stats", "zfetchstats", "zil"]
## By default, don't gather zpool stats
# poolMetrics = false
`
11 changes: 9 additions & 2 deletions plugins/inputs/zfs/zfs_linux.go
@@ -80,7 +80,11 @@ func gatherPoolStats(pool poolInfo, acc telegraf.Accumulator) error {
 func (z *Zfs) Gather(acc telegraf.Accumulator) error {
 	kstatMetrics := z.KstatMetrics
 	if len(kstatMetrics) == 0 {
-		kstatMetrics = []string{"arcstats", "zfetchstats", "vdev_cache_stats"}
+		// vdev_cache_stats is deprecated
+		// xuio_stats are ignored because as of Sep-2016, no known
+		// consumers of xuio exist on Linux
+		kstatMetrics = []string{"abdstats", "arcstats", "dnodestats", "dbufcachestats",
+			"dmu_tx", "fm", "vdev_mirror_stats", "zfetchstats", "zil"}
 	}

 	kstatPath := z.KstatPath
@@ -104,7 +108,7 @@ func (z *Zfs) Gather(acc telegraf.Accumulator) error {
 	for _, metric := range kstatMetrics {
 		lines, err := internal.ReadLines(kstatPath + "/" + metric)
 		if err != nil {
-			return err
+			continue
 		}
 		for i, line := range lines {
 			if i == 0 || i == 1 {
@@ -115,6 +119,9 @@ func (z *Zfs) Gather(acc telegraf.Accumulator) error {
 			}
 			rawData := strings.Split(line, " ")
 			key := metric + "_" + rawData[0]
+			if metric == "zil" || metric == "dmu_tx" || metric == "dnodestats" {
+				key = rawData[0]
+			}
 			rawValue := rawData[len(rawData)-1]
 			value, _ := strconv.ParseInt(rawValue, 10, 64)
 			fields[key] = value
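
To make the key-naming change easier to follow, here is a standalone sketch of the parsing step, not the plugin's actual code. It assumes each kstat file has two header lines followed by `name type data` rows, and that rows in `zil`, `dmu_tx`, and `dnodestats` already carry their own prefix (as in `zil_commit_count`), which would explain why the commit skips the `metric_` prefix for them. The names `parseKstat` and `unprefixed` are hypothetical. Unlike the hunk above, this sketch splits with `strings.Fields` and skips values that fail to parse.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// unprefixed lists the kstat files whose row names are used as-is,
// mirroring the condition added in this commit.
var unprefixed = map[string]bool{"zil": true, "dmu_tx": true, "dnodestats": true}

// parseKstat converts the lines of one kstat file into field values.
// The first two lines are treated as headers and skipped.
func parseKstat(metric string, lines []string) map[string]int64 {
	fields := make(map[string]int64)
	for i, line := range lines {
		if i < 2 {
			continue
		}
		cols := strings.Fields(line)
		if len(cols) < 2 {
			continue // blank or malformed row
		}
		key := metric + "_" + cols[0]
		if unprefixed[metric] {
			key = cols[0]
		}
		value, err := strconv.ParseInt(cols[len(cols)-1], 10, 64)
		if err != nil {
			continue // the original code ignores this error instead
		}
		fields[key] = value
	}
	return fields
}

func main() {
	// Illustrative input only; real files live under the kstatPath.
	arc := []string{"header line", "name type data", "hits 4 100", "misses 4 25"}
	zil := []string{"header line", "name type data", "zil_commit_count 4 7"}

	fmt.Println(parseKstat("arcstats", arc))
	fmt.Println(parseKstat("zil", zil))
}
```

With this shape, `arcstats` rows become `arcstats_hits` and `arcstats_misses`, while the `zil` row keeps the name `zil_commit_count`, matching the field names documented in the README hunk.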