Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(telemetry): enable statsd and dogstatsd telemetry sinks (backport #18646) #18673

Merged
merged 1 commit into from
Dec 10, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ Ref: https://keepachangelog.com/en/1.0.0/

### Improvements

* (telemetry) [#18646] (https://github.com/cosmos/cosmos-sdk/pull/18646) Enable statsd and dogstatsd telemetry sinks.
* (server) [#18478](https://github.com/cosmos/cosmos-sdk/pull/18478) Add command flag to disable colored logs.
* (x/gov) [#18025](https://github.com/cosmos/cosmos-sdk/pull/18025) Improve `<appd> q gov proposer` by querying directly a proposal instead of tx events. It is an alias of `q gov proposal` as the proposer is a field of the proposal.
* (version) [#18063](https://github.com/cosmos/cosmos-sdk/pull/18063) Allow to define extra info to be displayed in `<appd> version --long` command.
Expand Down
1 change: 1 addition & 0 deletions go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,7 @@ require (
require (
filippo.io/edwards25519 v1.0.0 // indirect
github.com/99designs/go-keychain v0.0.0-20191008050251-8e49817e8af4 // indirect
github.com/DataDog/datadog-go v3.2.0+incompatible // indirect
github.com/DataDog/zstd v1.5.5 // indirect
github.com/beorn7/perks v1.0.1 // indirect
github.com/btcsuite/btcd/btcec/v2 v2.3.2 // indirect
Expand Down
1 change: 1 addition & 0 deletions go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -62,6 +62,7 @@ github.com/Azure/go-ansiterm v0.0.0-20230124172434-306776ec8161 h1:L/gRVlceqvL25
github.com/Azure/go-ansiterm v0.0.0-20230124172434-306776ec8161/go.mod h1:xomTg63KZ2rFqZQzSB4Vz2SUXa1BpHTVz9L5PTmPC4E=
github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU=
github.com/BurntSushi/xgb v0.0.0-20160522181843-27f122750802/go.mod h1:IVnqGOEym/WlBOVXweHU+Q+/VP0lqqI8lqeDx9IjBqo=
github.com/DataDog/datadog-go v3.2.0+incompatible h1:qSG2N4FghB1He/r2mFrWKCaL7dXCilEuNEeAn20fdD4=
github.com/DataDog/datadog-go v3.2.0+incompatible/go.mod h1:LButxg5PwREeZtORoXG3tL4fMGNddJ+vMq1mwgfaqoQ=
github.com/DataDog/zstd v1.5.5 h1:oWf5W7GtOLgp6bciQYDmhHHjdhYkALu6S/5Ni9ZgSvQ=
github.com/DataDog/zstd v1.5.5/go.mod h1:g4AWEaM3yOg3HYfnJ3YIawPnVdXJh9QME85blwSAmyw=
Expand Down
11 changes: 11 additions & 0 deletions server/config/toml.go
Original file line number Diff line number Diff line change
Expand Up @@ -121,6 +121,17 @@ global-labels = [{{ range $k, $v := .Telemetry.GlobalLabels }}
["{{index $v 0 }}", "{{ index $v 1}}"],{{ end }}
]

# MetricsSink defines the type of metrics sink to use.
metrics-sink = "{{ .Telemetry.MetricsSink }}"

# StatsdAddr defines the address of a statsd server to send metrics to.
# Only utilized if MetricsSink is set to "statsd" or "dogstatsd".
statsd-addr = "{{ .Telemetry.StatsdAddr }}"

# DatadogHostname defines the hostname to use when emitting metrics to
# Datadog. Only utilized if MetricsSink is set to "dogstatsd".
datadog-hostname = "{{ .Telemetry.DatadogHostname }}"

###############################################################################
### API Configuration ###
###############################################################################
Expand Down
1 change: 1 addition & 0 deletions simapp/go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,7 @@ require (
filippo.io/edwards25519 v1.0.0 // indirect
github.com/99designs/go-keychain v0.0.0-20191008050251-8e49817e8af4 // indirect
github.com/99designs/keyring v1.2.1 // indirect
github.com/DataDog/datadog-go v3.2.0+incompatible // indirect
github.com/DataDog/zstd v1.5.5 // indirect
github.com/aws/aws-sdk-go v1.44.224 // indirect
github.com/beorn7/perks v1.0.1 // indirect
Expand Down
1 change: 1 addition & 0 deletions simapp/go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -228,6 +228,7 @@ github.com/Azure/go-ansiterm v0.0.0-20230124172434-306776ec8161 h1:L/gRVlceqvL25
github.com/Azure/go-ansiterm v0.0.0-20230124172434-306776ec8161/go.mod h1:xomTg63KZ2rFqZQzSB4Vz2SUXa1BpHTVz9L5PTmPC4E=
github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU=
github.com/BurntSushi/xgb v0.0.0-20160522181843-27f122750802/go.mod h1:IVnqGOEym/WlBOVXweHU+Q+/VP0lqqI8lqeDx9IjBqo=
github.com/DataDog/datadog-go v3.2.0+incompatible h1:qSG2N4FghB1He/r2mFrWKCaL7dXCilEuNEeAn20fdD4=
github.com/DataDog/datadog-go v3.2.0+incompatible/go.mod h1:LButxg5PwREeZtORoXG3tL4fMGNddJ+vMq1mwgfaqoQ=
github.com/DataDog/zstd v1.5.5 h1:oWf5W7GtOLgp6bciQYDmhHHjdhYkALu6S/5Ni9ZgSvQ=
github.com/DataDog/zstd v1.5.5/go.mod h1:g4AWEaM3yOg3HYfnJ3YIawPnVdXJh9QME85blwSAmyw=
Expand Down
74 changes: 60 additions & 14 deletions telemetry/metrics.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,9 +4,11 @@ import (
"bytes"
"encoding/json"
"fmt"
"net/http"
"time"

"github.com/hashicorp/go-metrics"
"github.com/hashicorp/go-metrics/datadog"
metricsprom "github.com/hashicorp/go-metrics/prometheus"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/common/expfmt"
Expand All @@ -21,8 +23,17 @@ const (
FormatDefault = ""
FormatPrometheus = "prometheus"
FormatText = "text"

MetricSinkInMem = "mem"
MetricSinkStatsd = "statsd"
MetricSinkDogsStatsd = "dogstatsd"
)

// DisplayableSink is an interface that defines a method for displaying metrics.
type DisplayableSink interface {
DisplayMetrics(resp http.ResponseWriter, req *http.Request) (any, error)
}

// Config defines the configuration options for application telemetry.
type Config struct {
// Prefixed with keys to separate services
Expand Down Expand Up @@ -52,6 +63,17 @@ type Config struct {
// Example:
// [["chain_id", "cosmoshub-1"]]
GlobalLabels [][]string `mapstructure:"global-labels"`

// MetricsSink defines the type of metrics backend to use.
MetricsSink string `mapstructure:"type" default:"mem"`

// StatsdAddr defines the address of a statsd server to send metrics to.
// Only utilized if MetricsSink is set to "statsd" or "dogstatsd".
StatsdAddr string `mapstructure:"statsd-addr"`

// DatadogHostname defines the hostname to use when emitting metrics to
// Datadog. Only utilized if MetricsSink is set to "dogstatsd".
DatadogHostname string `mapstructure:"datadog-hostname"`
}

// Metrics defines a wrapper around application telemetry functionality. It allows
Expand All @@ -60,7 +82,7 @@ type Config struct {
// by the operator. In addition to the sinks, when a process gets a SIGUSR1, a
// dump of formatted recent metrics will be sent to STDERR.
type Metrics struct {
memSink *metrics.InmemSink
sink metrics.MetricSink
prometheusEnabled bool
}

Expand All @@ -76,29 +98,44 @@ func New(cfg Config) (_ *Metrics, rerr error) {
return nil, nil
}

if numGlobalLables := len(cfg.GlobalLabels); numGlobalLables > 0 {
parsedGlobalLabels := make([]metrics.Label, numGlobalLables)
if numGlobalLabels := len(cfg.GlobalLabels); numGlobalLabels > 0 {
parsedGlobalLabels := make([]metrics.Label, numGlobalLabels)
for i, gl := range cfg.GlobalLabels {
parsedGlobalLabels[i] = NewLabel(gl[0], gl[1])
}

globalLabels = parsedGlobalLabels
}

metricsConf := metrics.DefaultConfig(cfg.ServiceName)
metricsConf.EnableHostname = cfg.EnableHostname
metricsConf.EnableHostnameLabel = cfg.EnableHostnameLabel

memSink := metrics.NewInmemSink(10*time.Second, time.Minute)
inMemSig := metrics.DefaultInmemSignal(memSink)
defer func() {
if rerr != nil {
inMemSig.Stop()
}
}()
var (
sink metrics.MetricSink
err error
)
switch cfg.MetricsSink {
case MetricSinkStatsd:
sink, err = metrics.NewStatsdSink(cfg.StatsdAddr)
case MetricSinkDogsStatsd:
sink, err = datadog.NewDogStatsdSink(cfg.StatsdAddr, cfg.DatadogHostname)
default:
memSink := metrics.NewInmemSink(10*time.Second, time.Minute)
sink = memSink
inMemSig := metrics.DefaultInmemSignal(memSink)
defer func() {
if rerr != nil {
inMemSig.Stop()
}
}()
}

if err != nil {
return nil, err
}

m := &Metrics{memSink: memSink}
fanout := metrics.FanoutSink{memSink}
m := &Metrics{sink: sink}
fanout := metrics.FanoutSink{sink}

if cfg.PrometheusRetentionTime > 0 {
m.prometheusEnabled = true
Expand Down Expand Up @@ -140,6 +177,8 @@ func (m *Metrics) Gather(format string) (GatherResponse, error) {
}
}

// gatherPrometheus collects Prometheus metrics and returns a GatherResponse.
// If Prometheus metrics are not enabled, it returns an error.
func (m *Metrics) gatherPrometheus() (GatherResponse, error) {
if !m.prometheusEnabled {
return GatherResponse{}, fmt.Errorf("prometheus metrics are not enabled")
Expand All @@ -154,6 +193,7 @@ func (m *Metrics) gatherPrometheus() (GatherResponse, error) {
defer buf.Reset()

e := expfmt.NewEncoder(buf, expfmt.FmtText)

for _, mf := range metricsFamilies {
if err := e.Encode(mf); err != nil {
return GatherResponse{}, fmt.Errorf("failed to encode prometheus metrics: %w", err)
Expand All @@ -163,8 +203,14 @@ func (m *Metrics) gatherPrometheus() (GatherResponse, error) {
return GatherResponse{ContentType: string(expfmt.FmtText), Metrics: buf.Bytes()}, nil
}

// gatherGeneric collects generic metrics and returns a GatherResponse.
func (m *Metrics) gatherGeneric() (GatherResponse, error) {
summary, err := m.memSink.DisplayMetrics(nil, nil)
gm, ok := m.sink.(DisplayableSink)
if !ok {
return GatherResponse{}, fmt.Errorf("non in-memory metrics sink does not support generic format")
}

summary, err := gm.DisplayMetrics(nil, nil)
if err != nil {
return GatherResponse{}, fmt.Errorf("failed to gather in-memory metrics: %w", err)
}
Expand Down
2 changes: 2 additions & 0 deletions telemetry/metrics_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ func TestMetrics_Disabled(t *testing.T) {

func TestMetrics_InMem(t *testing.T) {
m, err := New(Config{
MetricsSink: MetricSinkInMem,
Enabled: true,
EnableHostname: false,
ServiceName: "test",
Expand All @@ -42,6 +43,7 @@ func TestMetrics_InMem(t *testing.T) {

func TestMetrics_Prom(t *testing.T) {
m, err := New(Config{
MetricsSink: MetricSinkInMem,
Enabled: true,
EnableHostname: false,
ServiceName: "test",
Expand Down
1 change: 1 addition & 0 deletions tests/go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,7 @@ require (
filippo.io/edwards25519 v1.0.0 // indirect
github.com/99designs/go-keychain v0.0.0-20191008050251-8e49817e8af4 // indirect
github.com/99designs/keyring v1.2.1 // indirect
github.com/DataDog/datadog-go v3.2.0+incompatible // indirect
github.com/DataDog/zstd v1.5.5 // indirect
github.com/aws/aws-sdk-go v1.44.224 // indirect
github.com/beorn7/perks v1.0.1 // indirect
Expand Down
1 change: 1 addition & 0 deletions tests/go.sum
Original file line number Diff line number Diff line change
Expand Up @@ -226,6 +226,7 @@ github.com/Azure/go-ansiterm v0.0.0-20230124172434-306776ec8161 h1:L/gRVlceqvL25
github.com/Azure/go-ansiterm v0.0.0-20230124172434-306776ec8161/go.mod h1:xomTg63KZ2rFqZQzSB4Vz2SUXa1BpHTVz9L5PTmPC4E=
github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU=
github.com/BurntSushi/xgb v0.0.0-20160522181843-27f122750802/go.mod h1:IVnqGOEym/WlBOVXweHU+Q+/VP0lqqI8lqeDx9IjBqo=
github.com/DataDog/datadog-go v3.2.0+incompatible h1:qSG2N4FghB1He/r2mFrWKCaL7dXCilEuNEeAn20fdD4=
github.com/DataDog/datadog-go v3.2.0+incompatible/go.mod h1:LButxg5PwREeZtORoXG3tL4fMGNddJ+vMq1mwgfaqoQ=
github.com/DataDog/zstd v1.5.5 h1:oWf5W7GtOLgp6bciQYDmhHHjdhYkALu6S/5Ni9ZgSvQ=
github.com/DataDog/zstd v1.5.5/go.mod h1:g4AWEaM3yOg3HYfnJ3YIawPnVdXJh9QME85blwSAmyw=
Expand Down
11 changes: 11 additions & 0 deletions tools/confix/data/v0.50-app.toml
Original file line number Diff line number Diff line change
Expand Up @@ -108,6 +108,17 @@ prometheus-retention-time = 0
# [["chain_id", "cosmoshub-1"]]
global-labels = []

# MetricsSink defines the type of metrics sink to use.
metrics-sink = "mem"

# StatsdAddr defines the address of a statsd server to send metrics to.
# Only utilized if MetricsSink is set to "statsd" or "dogstatsd".
statsd-addr = ""

# DatadogHostname defines the hostname to use when emitting metrics to
# Datadog. Only utilized if MetricsSink is set to "dogstatsd".
datadog-hostname = ""

###############################################################################
### API Configuration ###
###############################################################################
Expand Down
Loading