Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ping metrics #23

Merged
merged 13 commits into from
Apr 17, 2022
Merged

Ping metrics #23

merged 13 commits into from
Apr 17, 2022

Conversation

dmke
Copy link
Contributor

@dmke dmke commented Apr 1, 2022

This enables fetching device statistics from UNMS, extracting the ping RTT values from it, and export them back to Prometheus.

@dmke

This comment was marked as outdated.

@mraerino
Copy link
Member

the diff looks like it contains things from #22 - maybe rebase just in this case?

@dmke

This comment was marked as outdated.

Copy link
Member

@mraerino mraerino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the implementation looks good to me.

could you make this feature available behind a flag or query parameter though?
i'm not sure all users of this service want to cause these additional requests to the API.

@dmke
Copy link
Contributor Author

dmke commented Apr 12, 2022

Sure, you got some name in mind? Previous iterations used --with-statistics and later --enable-ping-metrics.

The former name stemmed from the API endpoint (/devices/{id}/statistics), but that didn't feel right, because we're only grabbing the ping metrics. The latter does reflect more the actual purpose, but will become outdated should any of the other metrics become interesting (among others, the API provides data for client connection quality, uplink/downlink/wireless utilization, WiFi signal strength, temperature).

Maybe something akin to --extra-metrics=<comma separated list> (i.e. --extra-metrics=ping), to be safe for future extension?

This is in preparation to fetch ping statistics from UNMS. The endpoint
returns A LOT more data, but for now this contains only the minimal set.
This enables fetching device statistics from UNMS, extracting
the ping RTT values from it, and export them back to prometheus.

Since fetching statistics requires talking to another, device-specific
endpoint, the context timeout is increased from 5 to 30s. A future
change should pass the request context from the promhttp endpoint
down to the fetchDeviceData() method, to make the timout depend on
Prometheus' scrape request life cycle.
@dmke

This comment was marked as outdated.

@mraerino
Copy link
Member

mraerino commented Apr 14, 2022

But I'm not a fan of underscores in flags. One option could be to just use a single-word flag name (--extras). Any other suggestions?

you can use StringSliceVar and pass a pointer to the config field as the first argument. that should work in any case

@dmke

This comment was marked as outdated.

@mraerino
Copy link
Member

mraerino commented Apr 14, 2022

wait, i have a feeling you're confusing some things here:

  • reading flags and config file is done via viper. you manually create the flags and use the mapstructure tag to read from the config and associate flags
  • env var handling is done via envconfig and you use the envconfig tag to specify the name

so setting envconfig:"extra_metrics" should solve the case where the env var does not work

@dmke
Copy link
Contributor Author

dmke commented Apr 14, 2022

[...] you're confusing [...] mapstructure [and] envconfig

You're right, I tripped over that.

setting envconfig:"extra_metrics" should solve the case

Already done here. That's not it.

I'd probably need to set mapstructure:"extra-metrics", but that would then introduce an inconsistency with mapstructure:"log_level"...

I've settled on a single config word (extras), which may not be 100% perfect, but avoids these edge cases.

@mraerino
Copy link
Member

@dmke i opened a PR against your PR to get the dash option working: digineo#1

the tests made it a lot easier to implement, thanks for that!

@mraerino
Copy link
Member

if you want to resolve the conflicts, this might be good to go!

@dmke
Copy link
Contributor Author

dmke commented Apr 17, 2022

Done.

I had to re-introduce the error-return for internal/handler.New(), I knew it was there for something important :)

Copy link
Member

@mraerino mraerino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one thing

@@ -241,6 +247,21 @@ func (e *Exporter) collectImpl(ctx context.Context, out chan<- prom.Metric) erro
out <- e.newMetric("wan_rx_rate", prom.GaugeValue, wanIF.Statistics.Rxrate, deviceLabels...)
out <- e.newMetric("wan_tx_rate", prom.GaugeValue, wanIF.Statistics.Txrate, deviceLabels...)
}

// Ping metrics, if enabled
if e.extras.Ping {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you need to check this here? should be fine to use the nil-check in PingMetrics right? that way the fetchDeviceData is the only place it's handled, which i'd like.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather leave it as is, because future extensions to the models.DeviceStatistics struct (hint: coming very soon) will require this conditional where it is now.

Copy link
Member

@mraerino mraerino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting feature. thanks for the contribution!

@mraerino mraerino merged commit 1ac4c3f into ffddorf:main Apr 17, 2022
@dmke dmke deleted the feat/ping-metrics branch April 17, 2022 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants