What would you like to be added:
Introduce options to securely serve metrics from the K8s system and EKS-A management components.
Why is this needed:
As an EKS Anywhere cluster administrator, I would like to scrape metrics from the K8s system and EKS-A management components in a simple but secure way. These metrics are useful for building dashboards and alerts, and for monitoring the health of a cluster.
Currently in EKS-A, metrics of some system components are already exposed by default (e.g. coredns, kube-apiserver). Other system and management components, such as kube-controller-manager, are configured with the default --bind-address=127.0.0.1 or equivalent, so those servers only listen on localhost. The goal is to expose those metrics in a secure fashion so that external monitoring services such as Prometheus can consume them properly.
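For context, on kubeadm-based clusters the localhost-only default can be lifted by overriding the bind address in the ClusterConfiguration. The following is a sketch, not EKS-A's actual configuration; the secure ports (10257 for controller-manager, 10259 for scheduler) still require authentication and authorization:

```yaml
# Sketch: kubeadm ClusterConfiguration patch making kube-controller-manager
# and kube-scheduler listen on all interfaces instead of 127.0.0.1.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controllerManager:
  extraArgs:
    bind-address: "0.0.0.0"
scheduler:
  extraArgs:
    bind-address: "0.0.0.0"
```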
Details
There are three types of system/management components we would like to serve metrics from:
K8s system components, such as kube-controller-manager, kube-scheduler, kube-proxy.
EKS-A management components, such as eksa-cluster-controller, eks-anywhere-packages.
CAPI components, such as capi-controller, capi-kubeadm-control-plane, capv-controller (provider specific), etcdadm-controller, etcdadm-bootstrap-provider
In the list above, scraping metrics on the secure port of the K8s system components is already supported by default in Kubernetes, using the native K8s authentication and authorization workflow: kubernetes/kubernetes#72491. So the controller-manager / scheduler secure metrics endpoints should already be enabled by default via the --authentication-kubeconfig and --authorization-kubeconfig flags. How they can serve metrics with RBAC needs more investigation (i.e. whether all of the above core components can expose the /metrics endpoint via authentication (user/group/ServiceAccount) and authorization (an RBAC rule with verb: get on nonResourceURLs: /metrics)).
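The RBAC side of that investigation would look roughly like the following sketch; the ClusterRole/ClusterRoleBinding names and the monitoring ServiceAccount are illustrative assumptions, not existing EKS-A objects:

```yaml
# Sketch: RBAC that lets a monitoring ServiceAccount read /metrics.
# The names (metrics-reader, prometheus, monitoring) are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metrics-reader
rules:
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: metrics-reader
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: monitoring
```

A scraper would then present that ServiceAccount's bearer token when hitting a component's secure port (e.g. 10257 for kube-controller-manager).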
As for the CAPI components, all of them are built on controller-runtime, which implemented a feature in its v0.16.0 release to provide a secure metrics endpoint that uses HTTPS and provides authentication and authorization: kubernetes-sigs/controller-runtime#2407. The CAPI community adopted this feature in its core controllers in the v1.6.0 release: kubernetes-sigs/cluster-api#9264. Not all CAPI infrastructure providers have implemented the same feature yet, but we expect this to be the API pattern to follow. The external etcd components are maintained by the EKS Anywhere team; we can follow the same pattern CAPI core did for secure diagnostics and implement it in etcdadm-controller-manager and etcdadm-bootstrap-provider.
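Assuming controller-runtime v0.16.0 or later, the pattern those controllers would adopt is roughly the following sketch; the manager wiring is illustrative, not the exact etcdadm code:

```go
package main

import (
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/metrics/filters"
	metricsserver "sigs.k8s.io/controller-runtime/pkg/metrics/server"
)

func main() {
	// Serve /metrics over HTTPS and delegate authn/authz to the apiserver
	// (TokenReview + SubjectAccessReview), as controller-runtime v0.16+ supports.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Metrics: metricsserver.Options{
			BindAddress:   ":8443",
			SecureServing: true,
			// Reject clients that are not authenticated and authorized
			// to "get" the /metrics non-resource URL.
			FilterProvider: filters.WithAuthenticationAndAuthorization,
		},
	})
	if err != nil {
		panic(err)
	}
	_ = mgr // controller registration and mgr.Start elided
}
```

This is the same knob CAPI core turned on in v1.6.0; the etcdadm controllers could expose it behind a flag, as CAPI does.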
The EKS-A management components are also built on controller-runtime, so we can follow the same pattern the CAPI community did for secure diagnostics; this requires further changes in the EKS-A cluster-controller-manager and eks-anywhere-packages.
After figuring out how each type of component can serve its metrics endpoint securely, we can decide how to make this configurable through EKS-A with simplicity and security, whether through the EKS-A cluster spec or a documented recommendation using RBAC and a ClusterRole.
Planning
We want to prioritize exposing the K8s system components first, based on requests. As explained above, the metrics authentication and authorization flows differ between the native K8s components and the rest, which are built on top of controller-runtime. Thus we would like to implement the feature in phases:
1. A design doc proposing a solution for all the system and management components. It needs to be generic enough to onboard, or be compatible with, the K8s / EKS-A / CAPI / etcd component metrics.
2. Implementation of exposing the K8s system components based on the design.
3. Introducing secure diagnostics in the EKS-A management components, using controller-runtime authentication and authorization for the metrics endpoint.
4. Introducing secure diagnostics in the external etcd components, using controller-runtime authentication and authorization for the metrics endpoint.
5. Pushing or contributing to CAPI to enable the secure diagnostics feature for all EKS-A supported CAPI providers.
6. Implementation of exposing EKS-A and CAPI component metrics through the cluster spec.