Adding support for --listen-metrics-url parameter #157

suhrud-kumar · 2020-04-26T14:02:34Z

etcd has support for providing separate address for metrics endpoint.

This can be useful if the standard endpoint is configured with mutual (client) TLS authentication, but a load balancer or monitoring service still needs access to the metrics or health check. (etcd api will be served over https but /metrics endpoint will served over http )

Since etcdadm creates etcd in TLS be default, this will be useful for folks who don't to add etcd certs in their monitoring systems (for security reasons).

Currently we implemented a small hack to get around this, but it would be great if this is supported by etcdadm.

Let me know what you guys think, I can try to implement this with some guidance.

References:
https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/configuration.md#--listen-metrics-urls
https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/monitoring.md#metrics-endpoint

dlipovetsky · 2020-04-27T15:58:22Z

I agree, exposing the metrics endpoint would be useful. It's a great idea, thanks for suggesting it @suhrud-kumar!

Since etcdadm creates etcd in TLS be default, this will be useful for folks who don't to add etcd certs in their monitoring systems (for security reasons).

I'm skeptical of this motivation. Can you explain why someone might not want to "add etcd certs in their monitoring system?" (Thinking out loud: The etcd metrics endpoint could be used by a "pull" monitoring system (like Prometheus), and that system would only need a client certificate. The metrics client certificate, as far as I can tell, would have to be the same as the normal client certificate. So, unless etcd authorization is enabled, the monitoring system would have access to etcd data. I see why that's not desirable.)

I found some related discussion here: etcd-io/etcd#8060 (comment)

If you're able to work on it, please use the "/lifecycle active" command. If you have any implementation questions, please ask in #etcdadm channel in Kubernetes slack.

/priority important-longterm

suhrud-kumar · 2020-04-27T19:31:46Z

I agree, exposing the metrics endpoint would be useful. It's a great idea, thanks for suggesting it @suhrud-kumar!

Just to clarify, metrics endpoint will be exposed by default but in https (if installed using etcdadm), by adding this parameter an additional metrics endpoint will be exposed in http

Since etcdadm creates etcd in TLS be default, this will be useful for folks who don't to add etcd certs in their monitoring systems (for security reasons).

I'm skeptical of this motivation. Can you explain why someone might not want to "add etcd certs in their monitoring system?" (Thinking out loud: The etcd metrics endpoint could be used by a "pull" monitoring system (like Prometheus), and that system would only need a client certificate. The metrics client certificate, as far as I can tell, would have to be the same as the normal client certificate. So, unless etcd authorization is enabled, the monitoring system would have access to etcd data. I see why that's not desirable.)

Yes that's correct, since these certs can be used to access etcd data (thereby k8s secrets), we don't want to use them in other places unless absolutely required. And etcd provides a way to not use them for metrics endpoint (using above parameter).

Before starting working on this, I have a query on implementation. I have thought of adding one of three below arguments to implement this parameter in etcdadm.

--listen-metrics-url
Use etcd parameter as-is. But user would have to specify complete address of this endpoint including scheme (http or https), ip address (mostly node ip), and port
(for example --listen-metrics-url http://:2381)

This requires user to know ip address of node for etcd init or join (can be done using automation scripts) but most configurable.

--enable-http-metrics-on-port
User only needs to specify port, we can create endpoint in http and use node ip by default
--enable-http-metrics
If user sets this to true, create additional metrics endpoint in http, on port 2381 (since etcd already uses 2379 & 2380) and use node ip by default

@dlipovetsky
Let me know your thoughts on this.

dlipovetsky · 2020-05-04T22:33:34Z

--listen-metrics-url
Use etcd parameter as-is. But user would have to specify complete address of this endpoint including scheme (http or https), ip address (mostly node ip), and port
(for example --listen-metrics-url http://:2381)

As you point out, this is the most configurable option. And etcdadm can provide a sane default (as it would have to do, if we chose one of the other options).

The URL breaks down into scheme://host:port.

Scheme: Has to be http. I don't see any way to provide a cert the metrics URL, so I don't think etcd supports https here. Corect me if I'm wrong, please.
Host: I think the default URL host can be equal to the default advertised client URL host.
Port: The default port, I think, could be the one "officially" allocated to the etcd exporter in the Prometheus wiki, 9379. (As an aside, 2381 looks like it was once officially allocated, but is not anymore).

suhrud-kumar · 2020-05-07T05:31:20Z

The URL breaks down into scheme://host:port.

Scheme: Has to be http. I don't see any way to provide a cert the metrics URL, so I don't think etcd supports https here. Corect me if I'm wrong, please.

Well technically it can also be https, and users will have to access endpoint using etcd client certs. But a https /metrics endpoint will be available by default anyway, so this may not be much useful. We can keep default scheme as http.

/lifecycle active

fejta-bot · 2020-08-05T06:00:49Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot · 2020-09-04T06:42:34Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

fejta-bot · 2020-10-04T07:25:40Z

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

k8s-ci-robot · 2020-10-04T07:25:47Z

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Apr 27, 2020

k8s-ci-robot added the lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. label May 7, 2020

This was referenced May 8, 2020

added support for --listen-metrics-url parameter #159

Closed

added support for --listen-metrics-url parameter #160

Closed

k8s-ci-robot added lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. and removed lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. labels Aug 5, 2020

k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 4, 2020

k8s-ci-robot closed this as completed Oct 4, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding support for --listen-metrics-url parameter #157

Adding support for --listen-metrics-url parameter #157

suhrud-kumar commented Apr 26, 2020 •

edited

Loading

dlipovetsky commented Apr 27, 2020 •

edited

Loading

suhrud-kumar commented Apr 27, 2020

dlipovetsky commented May 4, 2020

suhrud-kumar commented May 7, 2020 •

edited

Loading

fejta-bot commented Aug 5, 2020

fejta-bot commented Sep 4, 2020

fejta-bot commented Oct 4, 2020

k8s-ci-robot commented Oct 4, 2020

Adding support for --listen-metrics-url parameter #157

Adding support for --listen-metrics-url parameter #157

Comments

suhrud-kumar commented Apr 26, 2020 • edited Loading

dlipovetsky commented Apr 27, 2020 • edited Loading

suhrud-kumar commented Apr 27, 2020

dlipovetsky commented May 4, 2020

suhrud-kumar commented May 7, 2020 • edited Loading

fejta-bot commented Aug 5, 2020

fejta-bot commented Sep 4, 2020

fejta-bot commented Oct 4, 2020

k8s-ci-robot commented Oct 4, 2020

suhrud-kumar commented Apr 26, 2020 •

edited

Loading

dlipovetsky commented Apr 27, 2020 •

edited

Loading

suhrud-kumar commented May 7, 2020 •

edited

Loading