
Remove dynamic metric names #226

Closed
dapplion opened this issue Nov 11, 2022 · 11 comments
Labels
P1 High: Likely tackled by core team if no one steps up

Comments

@dapplion
Contributor

@achingbrain Using a variable metric name is a really bad idea; it will break dashboards. Metric names should be constant and contain only _ as a special character.

status: context.metrics.registerMetric(`libp2p_tcp_${addr}_server_status`, {

I don't really see the point of splitting metrics for different ports. Once someone requests that feature we can add a label on the metric per address. But please do not do this now; adding extra labels adds cost to metrics at no one's request.

Originally posted by @dapplion in #223 (comment)
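For illustration, a minimal prom-client sketch of the two approaches under discussion; the metric and label names here are illustrative, not the transport's actual code:

import { Gauge } from 'prom-client'

// Today: the sanitized listen address is baked into the metric name, so every
// address/port combination produces a differently named time series.
const addr = '0_0_0_0_4002' // hypothetical sanitized listen address
const perAddress = new Gauge({
  name: `libp2p_tcp_${addr}_server_status`,
  help: 'TCP server status'
})
perAddress.set(1)

// Requested: one constant metric name; if per-address breakdowns are ever
// needed, expose the address as a label instead of changing the name.
const constantName = new Gauge({
  name: 'libp2p_tcp_server_status',
  help: 'TCP server status',
  labelNames: ['address'] // hypothetical label name
})
constantName.set({ address: '0.0.0.0:4002' }, 1)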

@p-shahi moved this to Weekly Candidates in js-libp2p on Nov 15, 2022
@mpetrunic added the "P0 Critical: Tackled by core team ASAP" and "P1 High: Likely tackled by core team if no one steps up" labels and removed the "P0 Critical: Tackled by core team ASAP" label on Nov 15, 2022
@achingbrain
Member

It's possible to configure multiple TCP listeners for a libp2p node; the address is in the metric name to disambiguate between them.

If you don't want this, you can supply an implementation of the Metrics interface that strips them out. Individual metrics are set up once at transport creation time so the overhead is minimal.
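As a rough sketch of that suggestion, assuming only the registerMetric(name, opts) call shown in the snippet above (the real Metrics interface has more methods a wrapper would also need to forward):

// Hypothetical wrapper: delegates to an existing metrics implementation but
// rewrites the dynamic address segment back to a constant name first.
interface MinimalMetrics {
  registerMetric: (name: string, opts?: unknown) => unknown
}

function stripAddressFromNames (inner: MinimalMetrics): MinimalMetrics {
  return {
    registerMetric (name, opts) {
      // e.g. 'libp2p_tcp_0_0_0_0_4002_server_status' -> 'libp2p_tcp_server_status'
      // (illustrative regex; real listen addresses may be formatted differently)
      const constantName = name.replace(/^libp2p_tcp_(?:\d+_)+/, 'libp2p_tcp_')
      return inner.registerMetric(constantName, opts)
    }
  }
}

// Note: with two listeners the stripped names collide and the underlying
// registry rejects the duplicate, which is exactly the problem discussed below.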

@wemeetagain
Member

Can we add the address as another label for these metrics?

@achingbrain
Member

achingbrain commented Nov 17, 2022

I don't think so as the metric names have to be unique.

If I start a node with two TCP listen addresses and use the addresses as labels, I get errors like "Error: A metric with the name libp2p_tcp_connections_count has already been registered."
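A minimal prom-client reproduction of that error, assuming the default shared registry (the metric name is taken from the message above):

import { Gauge } from 'prom-client'

// Two listeners each creating "their own" gauge under the same constant name:
// the second registration against the shared default registry throws.
new Gauge({ name: 'libp2p_tcp_connections_count', help: 'TCP connection count' })
new Gauge({ name: 'libp2p_tcp_connections_count', help: 'TCP connection count' })
// -> Error: A metric with the name libp2p_tcp_connections_count has already been registered.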

@dapplion
Contributor Author

@achingbrain Then let consumers add a metrics suffix and prefix that they choose. Again, current master is just unusable and terrible for actual deployments.

@achingbrain
Member

> Then let consumers add a metrics suffix and prefix that they choose

This makes it impossible to comply with the Prometheus naming conventions:

> A metric name...
> ...should have a (single-word) application prefix relevant to the domain the metric belongs to.
> ...should have a suffix describing the unit, in plural form.

So prefixes and suffixes are out; it has to be something in the middle of the metric name.

> Again, current master is just unusable and terrible for actual deployments

What is the actual problem this causes? The only change here is that it used to be "libp2p_tcp_server_status_info" and now it's "libp2p_tcp_0_0_0_0_4002_server_status_info" (or similar).

The metric name is stable and predictable unless you specify 0 as a TCP port.

Could you please go into a bit more detail about how this is unusable?

@dapplion
Contributor Author

> Could you please go into a bit more detail about how this is unusable?

The dashboards we commit and distribute would become deployment-specific down to the port level. That's awful. Change the port of one of your deployed nodes and the dashboards are bricked.

If you want to comply with the prefix/suffix rule, then allow consumers to inject some word in the middle:

function getMetricName (opt: { metricKeyword?: string }): string {
  if (opt.metricKeyword) {
    return `libp2p_tcp_${opt.metricKeyword}_server_status_info`
  } else {
    return 'libp2p_tcp_server_status_info'
  }
}

There is no reason to force us into dealing with libp2p_tcp_0_0_0_0_4002_server_status_info without consent. Let consumers make that decision.

@achingbrain
Member

> Can we add the address as another label for these metrics?

> I don't think so as the metric names have to be unique.

Thinking about this a bit more: if we refactor @libp2p/prometheus-metrics to globally cache created metrics (prom-client metrics are global anyway, so this isn't as crazy as it sounds) and re-use them if the same metric is created again, then we can track transport metrics as metric groups and use the socket address as the key for each metric, which seems to work well.


It gets a little complicated: for calculated metrics we'll need to turn multiple values into a single value, which may be surprising to users looking at the final metrics, but hopefully it will be obvious from the metrics that something isn't right.
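A rough sketch of that idea with prom-client; the gauge name, the address label and the printed output are illustrative assumptions, not necessarily what the eventual fix ships:

import { Gauge, register } from 'prom-client'

// One globally cached gauge per metric name; each listener only contributes a
// label value keyed on its socket address.
const serverStatus = new Gauge({
  name: 'libp2p_tcp_server_status_info',
  help: 'Whether the TCP server is listening',
  labelNames: ['address']
})

serverStatus.set({ address: '0.0.0.0:4002' }, 1)
serverStatus.set({ address: '0.0.0.0:4003' }, 1)

// The scrape output keeps one stable metric name across deployments:
//   libp2p_tcp_server_status_info{address="0.0.0.0:4002"} 1
//   libp2p_tcp_server_status_info{address="0.0.0.0:4003"} 1
register.metrics().then(console.log) // async in prom-client >= 13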

@dapplion
Contributor Author

So we'll get rid of libp2p_tcp_0_0_0_0_4002_server_status_info?

@achingbrain
Member

Yes, the address is in a label rather than the metric name

@dapplion
Contributor Author

> Yes, the address is in a label rather than the metric name

Can you do a PR?

@achingbrain
Member

It's here: #230

Repository owner moved this from Weekly Candidates/Discuss to Done in js-libp2p Nov 22, 2022