Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a collector for ZFS, currently focussed on ARC stats. #213

Closed
wants to merge 24 commits into from
Closed
Show file tree
Hide file tree
Changes from 5 commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
0a8832e
Add a collector for ZFS, currently focussed on ARC stats.
problame Feb 25, 2016
538183f
Incorporate Feedback from brian-brazil.
problame Mar 12, 2016
62e4a7d
Change behavior on errors.
problame Mar 13, 2016
e92e9a2
Enable ZFS exporter by default and update README.
problame Mar 13, 2016
6c13943
Style fixes.
problame Mar 13, 2016
4d15f0b
Comments should end in periods.
problame Mar 13, 2016
fdfe4a6
Build system fixes.
problame Mar 13, 2016
3ae58ba
Remove noop zfsInitialize().
problame Mar 13, 2016
ffc3455
Use struct instantiation instead of constructor for zfsMetrics.
problame Mar 13, 2016
394305a
Fix unreachable code.
problame Mar 13, 2016
e828e04
Remove race-condition in access on zfsMetricProvider in Update()
problame Mar 13, 2016
45e989d
Extract arcstat procfs file parsing into separate method.
problame Mar 13, 2016
3c33e7e
Add unit test for arcstats parsing.
problame Mar 13, 2016
66f4eeb
Fix out-of-bounds on empty lines at the end of the procfs file.
problame Mar 13, 2016
3a6a8a5
Style fixes.
problame Mar 13, 2016
22bbb85
Remove log statement.
problame Mar 13, 2016
735effc
Only compile on supported platforms.
problame Mar 13, 2016
6bdd416
Restructure implementation without zfsMetricProvider.
problame Mar 13, 2016
60aa697
Rename zfs_arc subsystem.
problame Mar 15, 2016
2f99eb6
Expose all available ARC metrics on FreeBSD and Linux.
problame Mar 15, 2016
006cf29
Rename PrepareUpdate() to make intentions behind method clear.
problame Mar 15, 2016
3bb5356
Expose zpool metrics on FreeBSD.
problame Mar 16, 2016
a102306
Update ZFS collector description.
problame Mar 16, 2016
e8d45a2
Stylistic fixes.
problame Mar 28, 2016
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,7 @@ textfile | Exposes statistics read from local disk. The `--collector.textfile.di
time | Exposes the current system time. | _any_
vmstat | Exposes statistics from `/proc/vmstat`. | Linux
version | Exposes node\_exporter version. | _any_

zfs | Exposes [ZFS](http://open-zfs.org/) performance statistics (ARC-only for now) | [FreeBSD](https://www.freebsd.org/doc/handbook/zfs.html),[Linux](http://zfsonlinux.org/)

### Disabled by default

Expand Down
157 changes: 157 additions & 0 deletions collector/zfs.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,157 @@
package collector
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please include the same copyright/license header here as in all the other Node Exporter files. Same for the other Go files here.


// +build !nofilesystem
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nozfs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still needs to be changed.

// +build !nozfs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should also have a restriction to only run on linux and freebsd, as it won't compile on other platforms


import (
"errors"

"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/common/log"
)

type zfsMetricValue int

const zfsErrorValue = zfsMetricValue(-1)

var zfsNotAvailableError = errors.New("ZFS / ZFS statistics are not available")

type zfsSysctl string
type zfsSubsystemName string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is maybe overdoing the special typing a bit, I'd just keep those as normal strings.


const (
arc = zfsSubsystemName("zfs_arc")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the collectors have a shared namespace, so to be safe this should be zfsArc. I'd also just keep this as a plain string.

)

//------------------------------------------------------------------------------
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of these big block comments

// Metrics
//------------------------------------------------------------------------------

type zfsMetric struct {
subsystem zfsSubsystemName // The Prometheus subsystem name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments should end in periods

name string // The Prometheus name of the metric
sysctl zfsSysctl // The sysctl of the ZFS metric
}

func NewZFSMetric(subsystem zfsSubsystemName, sysctl, name string) zfsMetric {
return zfsMetric{
sysctl: zfsSysctl(sysctl),
subsystem: subsystem,
name: name,
}
}
func (m *zfsMetric) BuildFQName() string {
return prometheus.BuildFQName(Namespace, string(m.subsystem), m.name)
}

func (m *zfsMetric) HelpString() string {
return m.name
}

//------------------------------------------------------------------------------
// Collector
//------------------------------------------------------------------------------

func init() {
Factories["zfs"] = NewZFSCollector
}

type zfsCollector struct {
zfsMetrics []zfsMetric
metricProvider zfsMetricProvider
}

func NewZFSCollector() (Collector, error) {
err := zfsInitialize()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a noop

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now it is.
Let me explain why it's still there: the FreeBSD implementation could be made even faster if the sysctls strings were mapped to MIBs (integer array defining the sysctl). This is has to be done only once on initialization.

However, with recent changes (no errors on missing ZFS module) this is no longer possible.

Will remove.

switch {
case err == zfsNotAvailableError:
log.Debug(err)
break
return &zfsCollector{}, err
}

return &zfsCollector{
zfsMetrics: []zfsMetric{
NewZFSMetric(arc, "kstat.zfs.misc.arcstats.p", "mru_size"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a simple struct instantiation, no need for a function

NewZFSMetric(arc, "kstat.zfs.misc.arcstats.size", "size"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to supply these to Linux?

I'd also be wary of renaming only some metrics, as that'll confuse anyone trying to work backwards. Usually we'd do a straight passthrough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I get you right.
Are you worried about exposing c_min as target_min_size? This is not specific to Linux.

Are you worried about Linux using different sysctls than FreeBSD / other ZFSs one day?
FreeBSD and Linux both use OpenZFS so there's a common code base.
The sysctls merely serve as an identifier for the metrics, which made sense at the time the exporter was FreeBSD-only.
AFAIK, the sysctls have been brought over from the Sun-Implementation and didn't change since.
I don't like ZoL's using procfs to expose the values, but the procfs is a straight mapping from sysctls, so that's not an issue.

Hence, I conclude that using sysctls as OpenZFS-specific identifiers is acceptable, even if they are mapped to procfs on Linux.

NewZFSMetric(arc, "kstat.zfs.misc.arcstats.c_min", "target_min_size"),
NewZFSMetric(arc, "kstat.zfs.misc.arcstats.c", "target_size"),
NewZFSMetric(arc, "kstat.zfs.misc.arcstats.c_max", "target_max_size"),
NewZFSMetric(arc, "kstat.zfs.misc.arcstats.hits", "hits"),
NewZFSMetric(arc, "kstat.zfs.misc.arcstats.misses", "misses"),
},
metricProvider: NewZFSMetricProvider(),
}, nil
}

func (c *zfsCollector) Update(ch chan<- prometheus.Metric) (err error) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed you sometimes add extra whitespace at the beginning or end of function blocks. To me, it looks a little bit messy. However, I'm not a member of the Prometheus project, so I will defer to their stylistic preferences.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.

log.Debug("Preparing metrics update")
err = c.metricProvider.PrepareUpdate()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should return the values you need. By keeping them in an object you'll have concurrency issues

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You haven't changed this, you should have this return the map directly rather than going via an object.

You should also be able to merge the two classes you currently have.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not encapsulate the strategy of retrieving metrics in a dedicated class?

Justifications for my design decision:

  1. Fetching metrics is very different on Linux vs. FreeBSD and may be different on any other OS that might be supported in the future.

  2. Future metrics may not be sysctl values but labeled per-dataset values, e.g. compressratio.

  3. What will most likely be required to cover all these metrics

    3.1. PrepareUpdate() to fetch some metrics in advance (currently only Linux sysctl metrics)
    3.2.handleMiss() to fetch metrics on demand (currently only FreeBSD sysctl metrics)

zfsMetricProvider provides a unified interface for these actions, separating platform-specific from platform-independent code not just on the file-level but also on the architecture-level.

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a class with effectively one function that is used only once by another function isn't really gaining anything, it's just adding a layer of abstraction that makes the code harder to understand. We don't do this elsewhere, and I don't see a need to do it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does 6bdd416 look to you? (Removed the metric provider, now the platform specifc code sends metrics into the channel).

Please have a look at collectors/zfs_linux.go:43.

switch {
case err == zfsNotAvailableError:
log.Debug(err)
return nil
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is unreachable

}
defer c.metricProvider.InvalidateCache()

log.Debugf("Fetching %d metrics", len(c.zfsMetrics))
for _, metric := range c.zfsMetrics {

value, err := c.metricProvider.Value(metric.sysctl)
if err != nil {
return err
}

ch <- prometheus.MustNewConstMetric(
prometheus.NewDesc(
metric.BuildFQName(),
metric.HelpString(),
nil,
nil,
),
prometheus.UntypedValue,
float64(value),
)

}

return err
}

//------------------------------------------------------------------------------
// Metrics Provider
// Platform-dependend parts implemented in zfs_${platform} files.
//------------------------------------------------------------------------------

type zfsMetricProvider struct {
values map[zfsSysctl]zfsMetricValue
}

func NewZFSMetricProvider() zfsMetricProvider {
return zfsMetricProvider{
values: make(map[zfsSysctl]zfsMetricValue),
}

}

func (p *zfsMetricProvider) InvalidateCache() {
p.values = make(map[zfsSysctl]zfsMetricValue)
}

func (p *zfsMetricProvider) Value(s zfsSysctl) (value zfsMetricValue, err error) {

var ok bool
value = zfsErrorValue

value, ok = p.values[s]
if !ok {
value, err = p.handleMiss(s)
if err != nil {
return value, err
}
}

return value, err
}
65 changes: 65 additions & 0 deletions collector/zfs_freebsd.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
package collector

// +build freebsd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this, go does this automatically from the filename.


import (
"fmt"
"unsafe"
)

/*
#cgo LDFLAGS:
#include <fcntl.h>
#include <stdlib.h>
#include <sys/param.h>
#include <sys/module.h>
#include <sys/pcpu.h>
#include <sys/resource.h>
#include <sys/sysctl.h>
#include <sys/time.h>

int zfsIntegerSysctl(const char *name) {

int value;
size_t value_size = sizeof(value);
if (sysctlbyname(name, &value, &value_size, NULL, 0) != -1 ||
value_size != sizeof(value)) {
return -1;
}
return value;

}

int zfsModuleLoaded() {
int modid = modfind("zfs");
return modid < 0 ? 0 : -1;
}

*/
import "C"

func zfsInitialize() error {
return nil
}

func (c *zfsMetricProvider) PrepareUpdate() error {
if C.zfsModuleLoaded() == 0 {
return zfsNotAvailableError
}
return nil
}

func (p *zfsMetricProvider) handleMiss(s zfsSysctl) (zfsMetricValue, error) {

sysctlCString := C.CString(string(s))
defer C.free(unsafe.Pointer(sysctlCString))

value := int(C.zfsIntegerSysctl(sysctlCString))

if value == -1 {
return zfsErrorValue, fmt.Errorf("Could not retrieve sysctl '%s'", s)
}

return zfsMetricValue(value), nil

}
76 changes: 76 additions & 0 deletions collector/zfs_linux.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
package collector

// +build linux

import (
"bufio"
"errors"
"fmt"
"os"
"strconv"
"strings"

"github.com/prometheus/common/log"
)

const zfsArcstatsProcpath = "spl/kstat/zfs/arcstats"

func zfsInitialize() error {
return nil
}

func (p *zfsMetricProvider) PrepareUpdate() (err error) {

err = p.prepareUpdateArcstats(zfsArcstatsProcpath)
if err != nil {
return
}
return nil
}

func (p *zfsMetricProvider) handleMiss(s zfsSysctl) (value zfsMetricValue, err error) {
// all values are fetched in PrepareUpdate()
return zfsErrorValue, fmt.Errorf("sysctl '%s' found")
}

func (p *zfsMetricProvider) prepareUpdateArcstats(zfsArcstatsProcpath string) (err error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would benefit from a unittest


file, err := os.Open(procFilePath(zfsArcstatsProcpath))
if err != nil {
log.Debugf("Cannot open ZFS arcstats procfs file for reading. " +
" Is the kernel module loaded?")
return zfsNotAvailableError
}
defer file.Close()

scanner := bufio.NewScanner(file)

parseLine := false
for scanner.Scan() {
parts := strings.Fields(scanner.Text())

if !parseLine && parts[0] == "name" && parts[1] == "type" && parts[2] == "data" {
// Start parsing from here
parseLine = true
continue
}

if !parseLine || len(parts) < 3 {
continue
}

key := fmt.Sprintf("kstat.zfs.misc.arcstats.%s", parts[0])

value, err := strconv.Atoi(parts[2])
if err != nil {
return fmt.Errorf("could not parse expected integer value for '%s'", key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use %q instead of '%s'.

}
log.Debugf("%s = %d", key, value)
p.values[zfsSysctl(key)] = zfsMetricValue(value)
}
if !parseLine {
return errors.New("did not parse a single arcstat metrics")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*metric

}

return scanner.Err()
}
2 changes: 1 addition & 1 deletion node_exporter.go
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ import (
)

const (
defaultCollectors = "conntrack,cpu,diskstats,entropy,filefd,filesystem,loadavg,mdadm,meminfo,netdev,netstat,sockstat,stat,textfile,time,uname,version,vmstat"
defaultCollectors = "conntrack,cpu,diskstats,entropy,filefd,filesystem,loadavg,mdadm,meminfo,netdev,netstat,sockstat,stat,textfile,time,uname,version,vmstat,zfs"
)

var (
Expand Down