From 67a1d8ef3a757fdcf8709b85698fb1693418dbb1 Mon Sep 17 00:00:00 2001 From: Hannes Baum Date: Wed, 11 Oct 2023 11:58:21 +0200 Subject: [PATCH 1/8] K8s cluster robustness features (#414) This commit adds the standard for K8s robustness features, including Kube-API rate limiting, ETCD compaction as well as CA expiration avoidance. Signed-off-by: Hannes Baum --- Standards/scs-0215-v1-robustness-features.md | 251 +++++++++++++++++++ 1 file changed, 251 insertions(+) create mode 100644 Standards/scs-0215-v1-robustness-features.md diff --git a/Standards/scs-0215-v1-robustness-features.md b/Standards/scs-0215-v1-robustness-features.md new file mode 100644 index 000000000..9f76fdfac --- /dev/null +++ b/Standards/scs-0215-v1-robustness-features.md @@ -0,0 +1,251 @@ +--- +title: Robustness features for K8s clusters +type: Standard +status: Draft +track: KaaS +--- + +## Introduction + +Kubernetes clusters in a productive environment are under the assumption to always perform perfectly without any major +interruptions. But due to external or unforeseen influences, clusters can be disrupted in their normal workflow, which +could lead to slow responsiveness or even malfunctions. +In order to possibly mitigate some problems for the Kubernetes clusters, robustness features should be introduced into +the SCS standards. These would harden the cluster infrastructure against several problems, making failures less likely. + +### Glossary + +The following special terms are used throughout this standard document: + +| Term | Meaning | +|-----------------------------|----------------------------------------------------------------| +| Certificate Authority | Trusted organization that issues digital certificates entities | +| Certificate Signing Request | Request in order to apply for a digital identity certificate | + +## Motivation + +A typical productive Kubernetes cluster could be hardened in many different ways, whereas probably many of these actions +would overlap and target similar weaknesses of a cluster. +For this version of the standard, the following points should be addressed: + +* Kube-API rate limiting +* etcd compaction/defragmentation +* etcd backup +* Certificate Authority (CA) expiration avoidance + +These robustness features should mainly increase the up-time of the Kubernetes cluster by avoiding downtimes either +because of internal problems or external threads like "Denial of Service" attacks. +Additionally, the etcd database should be strengthened with these features in order to provide a secure and robust +backend for the Kubernetes cluster. + +## Design Considerations + +In order to provide a conclusive standard, some design considerations need to be set beforehand: + +### Kube-API rate limiting + +Rate limiting is the practice of preventing too many requests to the same server in some time frame. This can help prevent +service interruptions due to congestion and therefore slow responsiveness or even service shutdown. +Kubernetes suggests multiple ways to integrate such a Ratelimit for its API server, a few of which will be mentioned here. +In order to provide a useful Ratelimit for the Kubernetes cluster, combination of these methods should be considered. + +#### API server flags + +The Kubernetes API server has some flags available to limit the amount of incoming requests that will be accepted by +the server, which should prevent crashing of the API server. This nevertheless shouldn't be the only measure to +introduce a rate limit, since important requests could get blocked during high traffic periods (at least according to +the official documentation). +The following controls are available to tune the server: + +* max-requests-inflight +* max-mutating-requests-inflight +* min-request-timeout + +More details can be found in the following documents: +[Flow Control](https://kubernetes.io/docs/concepts/cluster-administration/flow-control/) + +#### Ratelimit Admission Controller + +From version 1.13 onwards, Kubernetes includes a EventRateLimit Admission Controller, which aims to mitigate Ratelimit +problems associated with the API server by providing limits for requests every second either to specific resources or +even the whole API server. If requests are denied due to this Admission Controller, they're either cached or denied +completely and need to be retried; this also depends on the EventRateLimit configuration. +More details can be found in the following documents: +[Rancher rate limiting](https://rke.docs.rancher.com/config-options/rate-limiting) +[EventRateLimit](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#eventratelimit) +It is important to note, that this only helps the API server against event overloads and not necessarily the network +in front of it, which could still be overwhelmed. + +#### Flow control + +Flow control for the Kubernetes API server can be provided by the API priority and fairness feature, which classifies +and isolates requests in a fine-grained way in order to prevent an overload of the API server. +The package introduces queues in order to not deny requests and dequeue them through Fair Queueing techniques. +Overall, the Flow control package introduces many different features like request queues, rule-based flow control, +different priority levels and rate limit maximums. +The concept documentation offers a more in-depth explanation of the feature: +[Flow Control](https://kubernetes.io/docs/concepts/cluster-administration/flow-control/) + +### etcd maintenance + +etcd is a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be +accessed by a distributed system or cluster of machines. For these reasons, etcd was chosen as the default database +for Kubernetes. +In order to remain reliable, an etcd cluster needs periodic maintenance. This is necessary to maintain the etcd keyspace; +failure to do so could lead to a cluster-wide alarm, which would put the cluster into a limited-operation mode. +To mitigate this scenario, the etcd keyspace can be compacted. Additionally, an etcd cluster can be defragmented, which +gives back disk space to the underlying file system and can help bring the cluster back into an operable state, if it +ran out of space earlier. + +etcd keyspace maintenance can be achieved by providing the necessary flags/parameters to etcd, either via the KubeadmControlPlane or in the +configuration file of the etcd cluster, if it is managed independent of the Kubernetes cluster. +Possible flags, that can be set for this feature, are: + +* auto-compaction-mode +* auto-compaction-retention + +More information about compaction can be found in the respective etcd documentation +[etcd maintenance](https://etcd.io/docs/v3.3/op-guide/maintenance/) + +### etcd backup + +An etcd cluster should be regularly backed up in order to be able to restore the cluster to a known good state at an +earlier space in time if a failure or incorrect state happens. +The cluster should be backed up multiple times in order to have different possible states to go back to. This is especially +useful, if data in the newer backups was already corrupted in some way or important data was deleted in them. +For this reason, a backup strategy needs to be developed with a decreasing number of backups in an increasing period of time, +meaning that the previous year should only have 1 backup, but the current week should have multiple. +Information about the backup process can be found in the etcd documentation: +[Upgrade etcd](https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/) + +### Certificate rotation + +In order to secure the communication of a Kubernetes cluster, (TLS) certificates signed by a controlled +Certificate Authority (CA) can be used. +Normally, these certificates expire after a set period of time. In order to avoid expiration and failure of a cluster, +these certificates need to be rotated regularly and at best automatically. +It is important to either set `--rotate-server-certificates` as a command line parameter or set `rotateCertificates: true` +in the kubelet config or the `kubeletExtraArgs` of the cluster-template.yaml file. This activates the rotation of the +kubelet server certificate. Another recommendation is to set `serverTLSBootstrap: true`, which also enables the request +and rotation of the certificate for the kubelet according to the documentation. + +A clusters certificates can either be rotated by updating the cluster, which according to the Kubernetes documentation +automatically renews the certificates. +Another method would be the manual renewal, which can be done through multiple methods, depending on the K8s cluster +used. An example for a default K8s cluster would be to execute the command + +```bash +kubeadm certs renew all +``` + +Since clusters conformant with the SCS standards would probably be updated within the time period described in the +standard [SCS-0210-v2](https://github.com/SovereignCloudStack/standards/tree/main/Standards/scs-0210-v2-k8s-version-policy.md), +this rotation can probably be assumed to happen. Nevertheless, the alternative can still be mentioned in the standard. +Additionally, the Certificate Signing Request (CSR) need to be approved manually due to security reasons with the commands + +```bash +kubectl get csr +kubectl certificate approve +``` + +Another option to approve the CSRs would be to use a third-party controller that automates the process. One example for +this would be the [Kubelet CSR approver](https://github.com/postfinance/kubelet-csr-approver), which can be deployed on +a K8s cluster and requires `serverTLSBootstrap` to be set to true. Other controllers with a similar functionality might +have other specific requirements, which won't be explored in this document. + +Another problem is that the CA might expire. Unfortunately, not all K8s tools have functionality to renew these +certificates with the help of commands. Instead, there is documentation for manually rotating the CA, which can be found +under [Manual rotation of ca certificate](https://kubernetes.io/docs/tasks/tls/manual-rotation-of-ca-certificates/). + +Further information can be found in the Kubernetes documentation: +[Kubeadm certs](https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/) +[Kubelete TLS bootstrapping](https://kubernetes.io/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/) + +## Decision + +Robustness features combine multiple aspects of increasing the security, hardness and +longevity of a Kubernetes cluster. The decisions will be separated into their respective +areas. + +### Kube-API rate limiting + +The number of requests send to the kube-api or Kubernetes API server MUST be limited +in order to protect the server against outages, deceleration or malfunctions due to an +overload of requests. +In order to do so, at least the following parameters MUST be set on a Kubernetes cluster: + +* max-requests-inflight +* max-mutating-requests-inflight +* min-request-timeout + +Values for these flags/parameters SHOULD be adapted to the needs of the environment and +the expected load. + +A cluster MUST also activate and configure a Ratelimit admission controller. +This requires an `EventRateLimit` resource to be deployed on the Kubernetes cluster. +The following settings are RECOMMENDED for a cluster-wide deployment, but more +fine-grained rate limiting can also be applied, if this is necessary. + +```yaml +kind: Configuration +apiVersion: eventratelimit.admission.k8s.io/v1alpha1 +limits: +- burst: 20000 + qps: 5000 + type: Server +``` + +It is also RECOMMENDED to activate the Kubernetes API priority and fairness feature, +which also uses the aforementioned cluster parameters to better queue, schedule and +prioritize incoming requests. + +### etcd compaction + +etcd MUST be cleaned up regularly, so that it functions correctly and doesn't take +up too much space, which happens because of its increase of the keyspace. + +To compact the etcd keyspace, the following flags/parameters MUST be set for etcd: + +* auto-compaction-mode = periodic +* auto-compaction-retention = 8h + +### etcd backup + +An etcd cluster MUST be backed up regularly. It is RECOMMENDED to adapt +a strategy of decreasing backups over longer time periods, e.g. keeping snapshots every +10 minutes for the last 120 minutes, then one hourly for 1 day, then one daily for 2 weeks, +then one weekly for 3 months, then one monthly for 2 years, and after that a yearly backup. +At the very least, a backup MUST be done once a week. +These numbers can be adapted to the security setup and concerns like storage or network +usage. It is also RECOMMENDED to encrypt the backups in order to secure them further. +How this is done is up to the operator. + +### Certificate rotation + +It should be avoided, that certificates expire either on the whole cluster or for single components. +To avoid this scenario, certificates SHOULD be rotated regularly; in the +case of SCS, we REQUIRE at least a yearly certificate rotation. +To achieve a complete certificate rotation, the parameters `serverTLSBootstrap` and `rotateCertificates` +MUST be set in the kubelet configuration. + +The certificates can be rotated by either updating the Kubernetes cluster, which automatically +renews certificates, or by manually renewing them. How this is done is dependent on the used K8s cluster. + +After this, new CSRs MUST be approved manually or with a third-party controller, e.g. the [kubelet-csr-approver](https://github.com/postfinance/kubelet-csr-approver). + +It is also RECOMMENDED to renew the CA regularly to avoid an expiration of the CA. +This standard doesn't set a timeline for this, since it is dependent on the CA. + +## Related Documents + +[Flow Control](https://kubernetes.io/docs/concepts/cluster-administration/flow-control/) +[Rate limiting](https://rke.docs.rancher.com/config-options/rate-limiting) +[EventRateLimit](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#eventratelimit) +[etcd maintenance](https://etcd.io/docs/v3.3/op-guide/maintenance/) +[Upgrade etcd](https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/) +[Kubeadm certs](https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/) +[Kubelet TLS bootstrapping](https://kubernetes.io/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/) + +## Conformance Tests + +Conformance Tests, OPTIONAL From 84fcef3ff4670d82ac28ab5cbf85caf515678a01 Mon Sep 17 00:00:00 2001 From: Hannes Baum Date: Thu, 11 Jan 2024 11:23:49 +0100 Subject: [PATCH 2/8] fixup! K8s cluster robustness features (#414) Signed-off-by: Hannes Baum --- Standards/scs-0215-v1-robustness-features.md | 22 ++++++++++++-------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/Standards/scs-0215-v1-robustness-features.md b/Standards/scs-0215-v1-robustness-features.md index 9f76fdfac..088416ff4 100644 --- a/Standards/scs-0215-v1-robustness-features.md +++ b/Standards/scs-0215-v1-robustness-features.md @@ -127,12 +127,13 @@ these certificates need to be rotated regularly and at best automatically. It is important to either set `--rotate-server-certificates` as a command line parameter or set `rotateCertificates: true` in the kubelet config or the `kubeletExtraArgs` of the cluster-template.yaml file. This activates the rotation of the kubelet server certificate. Another recommendation is to set `serverTLSBootstrap: true`, which also enables the request -and rotation of the certificate for the kubelet according to the documentation. +and rotation of the certificate for the kubelet by requesting them from `certificates.k8s.io` +according to the documentation under [Kubeadm certs]. -A clusters certificates can either be rotated by updating the cluster, which according to the Kubernetes documentation +A clusters certificates can be rotated by updating the cluster, which according to the Kubernetes documentation automatically renews the certificates. Another method would be the manual renewal, which can be done through multiple methods, depending on the K8s cluster -used. An example for a default K8s cluster would be to execute the command +used. An example for a K8s cluster set up with `kubeadm` would be to execute the command ```bash kubeadm certs renew all @@ -140,16 +141,18 @@ kubeadm certs renew all Since clusters conformant with the SCS standards would probably be updated within the time period described in the standard [SCS-0210-v2](https://github.com/SovereignCloudStack/standards/tree/main/Standards/scs-0210-v2-k8s-version-policy.md), -this rotation can probably be assumed to happen. Nevertheless, the alternative can still be mentioned in the standard. -Additionally, the Certificate Signing Request (CSR) need to be approved manually due to security reasons with the commands +this rotation can probably be assumed to happen, because of the cluster update functionality. + +It is also important to mention, that due to security reasons, the CSR needs to be approved manually with the commands ```bash kubectl get csr kubectl certificate approve ``` +in order to complete a certificate rotation. Another option to approve the CSRs would be to use a third-party controller that automates the process. One example for -this would be the [Kubelet CSR approver](https://github.com/postfinance/kubelet-csr-approver), which can be deployed on +this would be the [kubelet csr approver](https://github.com/postfinance/kubelet-csr-approver), which can be deployed on a K8s cluster and requires `serverTLSBootstrap` to be set to true. Other controllers with a similar functionality might have other specific requirements, which won't be explored in this document. @@ -157,7 +160,7 @@ Another problem is that the CA might expire. Unfortunately, not all K8s tools ha certificates with the help of commands. Instead, there is documentation for manually rotating the CA, which can be found under [Manual rotation of ca certificate](https://kubernetes.io/docs/tasks/tls/manual-rotation-of-ca-certificates/). -Further information can be found in the Kubernetes documentation: +Further information and examples can be found in the Kubernetes documentation: [Kubeadm certs](https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/) [Kubelete TLS bootstrapping](https://kubernetes.io/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/) @@ -223,7 +226,7 @@ How this is done is up to the operator. ### Certificate rotation It should be avoided, that certificates expire either on the whole cluster or for single components. -To avoid this scenario, certificates SHOULD be rotated regularly; in the +To avoid this scenario, certificates MUST be rotated regularly; in the case of SCS, we REQUIRE at least a yearly certificate rotation. To achieve a complete certificate rotation, the parameters `serverTLSBootstrap` and `rotateCertificates` MUST be set in the kubelet configuration. @@ -234,7 +237,8 @@ renews certificates, or by manually renewing them. How this is done is dependent After this, new CSRs MUST be approved manually or with a third-party controller, e.g. the [kubelet-csr-approver](https://github.com/postfinance/kubelet-csr-approver). It is also RECOMMENDED to renew the CA regularly to avoid an expiration of the CA. -This standard doesn't set a timeline for this, since it is dependent on the CA. +This standard doesn't set an exact timeline for a renewal, since it is dependent on lifetime and +therefore expiration date of the CA in question. ## Related Documents From f846ba30c0c6b9021ac5ae8ef54fda0c276d43e4 Mon Sep 17 00:00:00 2001 From: Hannes Baum Date: Thu, 11 Jan 2024 11:25:55 +0100 Subject: [PATCH 3/8] fixup! K8s cluster robustness features (#414) Signed-off-by: Hannes Baum --- Standards/scs-0215-v1-robustness-features.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/Standards/scs-0215-v1-robustness-features.md b/Standards/scs-0215-v1-robustness-features.md index 088416ff4..9df56a244 100644 --- a/Standards/scs-0215-v1-robustness-features.md +++ b/Standards/scs-0215-v1-robustness-features.md @@ -127,7 +127,7 @@ these certificates need to be rotated regularly and at best automatically. It is important to either set `--rotate-server-certificates` as a command line parameter or set `rotateCertificates: true` in the kubelet config or the `kubeletExtraArgs` of the cluster-template.yaml file. This activates the rotation of the kubelet server certificate. Another recommendation is to set `serverTLSBootstrap: true`, which also enables the request -and rotation of the certificate for the kubelet by requesting them from `certificates.k8s.io` +and rotation of the certificate for the kubelet by requesting them from `certificates.k8s.io` according to the documentation under [Kubeadm certs]. A clusters certificates can be rotated by updating the cluster, which according to the Kubernetes documentation @@ -149,6 +149,7 @@ It is also important to mention, that due to security reasons, the CSR needs to kubectl get csr kubectl certificate approve ``` + in order to complete a certificate rotation. Another option to approve the CSRs would be to use a third-party controller that automates the process. One example for @@ -234,10 +235,11 @@ MUST be set in the kubelet configuration. The certificates can be rotated by either updating the Kubernetes cluster, which automatically renews certificates, or by manually renewing them. How this is done is dependent on the used K8s cluster. -After this, new CSRs MUST be approved manually or with a third-party controller, e.g. the [kubelet-csr-approver](https://github.com/postfinance/kubelet-csr-approver). +After this, new CSRs MUST be approved manually or with a third-party controller, +e.g. the [kubelet-csr-approver](https://github.com/postfinance/kubelet-csr-approver). It is also RECOMMENDED to renew the CA regularly to avoid an expiration of the CA. -This standard doesn't set an exact timeline for a renewal, since it is dependent on lifetime and +This standard doesn't set an exact timeline for a renewal, since it is dependent on lifetime and therefore expiration date of the CA in question. ## Related Documents From e2469bb8709e5bdc2aa732210280bba21371fe84 Mon Sep 17 00:00:00 2001 From: Hannes Baum Date: Fri, 12 Jan 2024 14:36:07 +0100 Subject: [PATCH 4/8] fixup! K8s cluster robustness features (#414) Signed-off-by: Hannes Baum --- Standards/scs-0215-v1-robustness-features.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/Standards/scs-0215-v1-robustness-features.md b/Standards/scs-0215-v1-robustness-features.md index 9df56a244..9b5ac39c1 100644 --- a/Standards/scs-0215-v1-robustness-features.md +++ b/Standards/scs-0215-v1-robustness-features.md @@ -235,9 +235,6 @@ MUST be set in the kubelet configuration. The certificates can be rotated by either updating the Kubernetes cluster, which automatically renews certificates, or by manually renewing them. How this is done is dependent on the used K8s cluster. -After this, new CSRs MUST be approved manually or with a third-party controller, -e.g. the [kubelet-csr-approver](https://github.com/postfinance/kubelet-csr-approver). - It is also RECOMMENDED to renew the CA regularly to avoid an expiration of the CA. This standard doesn't set an exact timeline for a renewal, since it is dependent on lifetime and therefore expiration date of the CA in question. From 3eb4898e7298551db09e6475c72d13ab5489711a Mon Sep 17 00:00:00 2001 From: Hannes Baum Date: Fri, 12 Jan 2024 14:44:06 +0100 Subject: [PATCH 5/8] fixup! K8s cluster robustness features (#414) Signed-off-by: Hannes Baum --- Standards/scs-0215-v1-robustness-features.md | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/Standards/scs-0215-v1-robustness-features.md b/Standards/scs-0215-v1-robustness-features.md index 9b5ac39c1..43cc2e489 100644 --- a/Standards/scs-0215-v1-robustness-features.md +++ b/Standards/scs-0215-v1-robustness-features.md @@ -143,14 +143,20 @@ Since clusters conformant with the SCS standards would probably be updated withi standard [SCS-0210-v2](https://github.com/SovereignCloudStack/standards/tree/main/Standards/scs-0210-v2-k8s-version-policy.md), this rotation can probably be assumed to happen, because of the cluster update functionality. -It is also important to mention, that due to security reasons, the CSR needs to be approved manually with the commands +It is also important to mention, that CSRs may need to be approved manually with the commands ```bash kubectl get csr kubectl certificate approve ``` -in order to complete a certificate rotation. +in order to complete a certificate rotation. This is most likely dependent on the K8s cluster solution in use. +`kubectl get csr` allows to check, if this is the case; a `Pending` CSR would need to be approved. + +```bash +NAME AGE SIGNERNAME REQUESTOR CONDITION +csr-9wvgt 112s kubernetes.io/kubelet-serving system:node:worker-1 Pending +``` Another option to approve the CSRs would be to use a third-party controller that automates the process. One example for this would be the [kubelet csr approver](https://github.com/postfinance/kubelet-csr-approver), which can be deployed on From ef6ad56bef04a7a2ec8967f59f7102fcbf190523 Mon Sep 17 00:00:00 2001 From: Jan Schoone Date: Mon, 15 Jan 2024 15:18:44 +0100 Subject: [PATCH 6/8] chore(wording): recommend to use the long form for this type of document Signed-off-by: Jan Schoone --- Standards/scs-0215-v1-robustness-features.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/Standards/scs-0215-v1-robustness-features.md b/Standards/scs-0215-v1-robustness-features.md index 43cc2e489..0f7298ed1 100644 --- a/Standards/scs-0215-v1-robustness-features.md +++ b/Standards/scs-0215-v1-robustness-features.md @@ -1,5 +1,5 @@ --- -title: Robustness features for K8s clusters +title: Robustness features for Kubernetes clusters type: Standard status: Draft track: KaaS @@ -132,8 +132,8 @@ according to the documentation under [Kubeadm certs]. A clusters certificates can be rotated by updating the cluster, which according to the Kubernetes documentation automatically renews the certificates. -Another method would be the manual renewal, which can be done through multiple methods, depending on the K8s cluster -used. An example for a K8s cluster set up with `kubeadm` would be to execute the command +Another method would be the manual renewal, which can be done through multiple methods, depending on the Kubernetes cluster +used. An example for a Kubernetes cluster set up with `kubeadm` would be to execute the command ```bash kubeadm certs renew all @@ -150,7 +150,7 @@ kubectl get csr kubectl certificate approve ``` -in order to complete a certificate rotation. This is most likely dependent on the K8s cluster solution in use. +in order to complete a certificate rotation. This is most likely dependent on the Kubernetes cluster solution in use. `kubectl get csr` allows to check, if this is the case; a `Pending` CSR would need to be approved. ```bash @@ -160,10 +160,10 @@ csr-9wvgt 112s kubernetes.io/kubelet-serving system:node:worker-1 Another option to approve the CSRs would be to use a third-party controller that automates the process. One example for this would be the [kubelet csr approver](https://github.com/postfinance/kubelet-csr-approver), which can be deployed on -a K8s cluster and requires `serverTLSBootstrap` to be set to true. Other controllers with a similar functionality might +a Kubernetes cluster and requires `serverTLSBootstrap` to be set to true. Other controllers with a similar functionality might have other specific requirements, which won't be explored in this document. -Another problem is that the CA might expire. Unfortunately, not all K8s tools have functionality to renew these +Another problem is that the CA might expire. Unfortunately, not all Kubernetes tools have functionality to renew these certificates with the help of commands. Instead, there is documentation for manually rotating the CA, which can be found under [Manual rotation of ca certificate](https://kubernetes.io/docs/tasks/tls/manual-rotation-of-ca-certificates/). @@ -239,7 +239,7 @@ To achieve a complete certificate rotation, the parameters `serverTLSBootstrap` MUST be set in the kubelet configuration. The certificates can be rotated by either updating the Kubernetes cluster, which automatically -renews certificates, or by manually renewing them. How this is done is dependent on the used K8s cluster. +renews certificates, or by manually renewing them. How this is done is dependent on the used Kubernetes cluster. It is also RECOMMENDED to renew the CA regularly to avoid an expiration of the CA. This standard doesn't set an exact timeline for a renewal, since it is dependent on lifetime and From b9aa12dc3dcf50a6380950051bb324c081f9eab8 Mon Sep 17 00:00:00 2001 From: Hannes Baum Date: Thu, 18 Jan 2024 16:08:51 +0100 Subject: [PATCH 7/8] fixup! K8s cluster robustness features (#414) Signed-off-by: Hannes Baum --- Standards/scs-0215-v1-robustness-features.md | 74 +++++++++++--------- 1 file changed, 41 insertions(+), 33 deletions(-) diff --git a/Standards/scs-0215-v1-robustness-features.md b/Standards/scs-0215-v1-robustness-features.md index 0f7298ed1..c97a69a20 100644 --- a/Standards/scs-0215-v1-robustness-features.md +++ b/Standards/scs-0215-v1-robustness-features.md @@ -17,10 +17,10 @@ the SCS standards. These would harden the cluster infrastructure against several The following special terms are used throughout this standard document: -| Term | Meaning | -|-----------------------------|----------------------------------------------------------------| -| Certificate Authority | Trusted organization that issues digital certificates entities | -| Certificate Signing Request | Request in order to apply for a digital identity certificate | +| Term | Abbreviation | Meaning | +|-----------------------------|--------------|----------------------------------------------------------------| +| Certificate Authority | CA | Trusted organization that issues digital certificates entities | +| Certificate Signing Request | CSR | Request in order to apply for a digital identity certificate | ## Motivation @@ -124,49 +124,59 @@ In order to secure the communication of a Kubernetes cluster, (TLS) certificates Certificate Authority (CA) can be used. Normally, these certificates expire after a set period of time. In order to avoid expiration and failure of a cluster, these certificates need to be rotated regularly and at best automatically. -It is important to either set `--rotate-server-certificates` as a command line parameter or set `rotateCertificates: true` -in the kubelet config or the `kubeletExtraArgs` of the cluster-template.yaml file. This activates the rotation of the -kubelet server certificate. Another recommendation is to set `serverTLSBootstrap: true`, which also enables the request -and rotation of the certificate for the kubelet by requesting them from `certificates.k8s.io` -according to the documentation under [Kubeadm certs]. +Certificates can either be rotated manually (a reference for manually working with certificates can be found +[here](https://github.com/kelseyhightower/kubernetes-the-hard-way/blob/79a3f79b27bd28f82f071bb877a266c2e62ee506/docs/04-certificate-authority.md)) +or automatically, which requires other things to care about in a deployment. -A clusters certificates can be rotated by updating the cluster, which according to the Kubernetes documentation -automatically renews the certificates. -Another method would be the manual renewal, which can be done through multiple methods, depending on the Kubernetes cluster -used. An example for a Kubernetes cluster set up with `kubeadm` would be to execute the command +Some tools or clusters provide possibilities to rotate certificates manually. +For example, `kubeadm` and `k3s` provides the following commands ```bash +# kubeadm kubeadm certs renew all + +# k3s +k3s certificate rotate ``` -Since clusters conformant with the SCS standards would probably be updated within the time period described in the -standard [SCS-0210-v2](https://github.com/SovereignCloudStack/standards/tree/main/Standards/scs-0210-v2-k8s-version-policy.md), -this rotation can probably be assumed to happen, because of the cluster update functionality. +A CA might also expire. Unfortunately, not all Kubernetes tools have functionality to renew these certificates. +Instead, documentation is provided to manually rotating a CA ([Manual rotation of ca certificate]). + +#### Automatic certificate rotation + +kubelet can be configured to obtain properly signed certificates from the `certificates.k8s.io` API of Kubernetes. +To do this, set `serverTLSBootstrap: true` in the configuration file of kubelet, which enables both the certificate request +during bootstrapping and the rotation mechanism. Setting `rotateCertificates: true` only enables the certificate rotation [Kubeadm certs]. +`--rotate-certificates` or `--rotate-server-certificates` shouldn't be used as command line arguments to set these flags, +since both parameters are deprecated according to [Certificate rotation]. + +It is also important to note that some Kubernetes clusters or admin tools provide additional ways to rotate certificates. +For example, `kubeadm` automatically rotates certificates, if the cluster is updated with the tool (see [Automatic Certificate renewal]). +This would also mean, that at least `kubeadm`-based clusters can be assumed to rotate their certificates regularly, +since they would probably be updated within the time period described in the +standard [SCS-0210-v2](https://github.com/SovereignCloudStack/standards/tree/main/Standards/scs-0210-v2-k8s-version-policy.md). + +If an automatic certificate rotation happens, these certificates need to be approved either manually or by a third party +controller like the [kubelet csr approver](https://github.com/postfinance/kubelet-csr-approver), which can be deployed on +a Kubernetes cluster to automate this process. -It is also important to mention, that CSRs may need to be approved manually with the commands +A manual approval of these CSRs could be done with the commands ```bash kubectl get csr kubectl certificate approve ``` -in order to complete a certificate rotation. This is most likely dependent on the Kubernetes cluster solution in use. -`kubectl get csr` allows to check, if this is the case; a `Pending` CSR would need to be approved. +in order to complete a certificate rotation. +But it should be noted, that this is also most likely dependent on the Kubernetes cluster solution in use. + +`kubectl get csr` allows to check, if a CSR needs to be approved; a `Pending` CSR would need to be approved. ```bash NAME AGE SIGNERNAME REQUESTOR CONDITION csr-9wvgt 112s kubernetes.io/kubelet-serving system:node:worker-1 Pending ``` -Another option to approve the CSRs would be to use a third-party controller that automates the process. One example for -this would be the [kubelet csr approver](https://github.com/postfinance/kubelet-csr-approver), which can be deployed on -a Kubernetes cluster and requires `serverTLSBootstrap` to be set to true. Other controllers with a similar functionality might -have other specific requirements, which won't be explored in this document. - -Another problem is that the CA might expire. Unfortunately, not all Kubernetes tools have functionality to renew these -certificates with the help of commands. Instead, there is documentation for manually rotating the CA, which can be found -under [Manual rotation of ca certificate](https://kubernetes.io/docs/tasks/tls/manual-rotation-of-ca-certificates/). - Further information and examples can be found in the Kubernetes documentation: [Kubeadm certs](https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/) [Kubelete TLS bootstrapping](https://kubernetes.io/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/) @@ -235,11 +245,6 @@ How this is done is up to the operator. It should be avoided, that certificates expire either on the whole cluster or for single components. To avoid this scenario, certificates MUST be rotated regularly; in the case of SCS, we REQUIRE at least a yearly certificate rotation. -To achieve a complete certificate rotation, the parameters `serverTLSBootstrap` and `rotateCertificates` -MUST be set in the kubelet configuration. - -The certificates can be rotated by either updating the Kubernetes cluster, which automatically -renews certificates, or by manually renewing them. How this is done is dependent on the used Kubernetes cluster. It is also RECOMMENDED to renew the CA regularly to avoid an expiration of the CA. This standard doesn't set an exact timeline for a renewal, since it is dependent on lifetime and @@ -254,6 +259,9 @@ therefore expiration date of the CA in question. [Upgrade etcd](https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/) [Kubeadm certs](https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/) [Kubelet TLS bootstrapping](https://kubernetes.io/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/) +[Certificate rotation](https://kubernetes.io/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/#certificate-rotation) +[Manual rotation of ca certificate](https://kubernetes.io/docs/tasks/tls/manual-rotation-of-ca-certificates/) +[Automatic Certificate renewal](https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/#automatic-certificate-renewal) ## Conformance Tests From b9b4c9a1db316d48b9d3ef37a00934ee7dc21c28 Mon Sep 17 00:00:00 2001 From: Hannes Baum Date: Thu, 18 Jan 2024 16:13:48 +0100 Subject: [PATCH 8/8] fixup! K8s cluster robustness features (#414) Signed-off-by: Hannes Baum --- Standards/scs-0215-v1-robustness-features.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Standards/scs-0215-v1-robustness-features.md b/Standards/scs-0215-v1-robustness-features.md index c97a69a20..e0ad3dc88 100644 --- a/Standards/scs-0215-v1-robustness-features.md +++ b/Standards/scs-0215-v1-robustness-features.md @@ -156,8 +156,8 @@ This would also mean, that at least `kubeadm`-based clusters can be assumed to r since they would probably be updated within the time period described in the standard [SCS-0210-v2](https://github.com/SovereignCloudStack/standards/tree/main/Standards/scs-0210-v2-k8s-version-policy.md). -If an automatic certificate rotation happens, these certificates need to be approved either manually or by a third party -controller like the [kubelet csr approver](https://github.com/postfinance/kubelet-csr-approver), which can be deployed on +If an automatic certificate rotation happens, these certificates need to be approved either manually or by a third party +controller like the [kubelet csr approver](https://github.com/postfinance/kubelet-csr-approver), which can be deployed on a Kubernetes cluster to automate this process. A manual approval of these CSRs could be done with the commands @@ -167,7 +167,7 @@ kubectl get csr kubectl certificate approve ``` -in order to complete a certificate rotation. +in order to complete a certificate rotation. But it should be noted, that this is also most likely dependent on the Kubernetes cluster solution in use. `kubectl get csr` allows to check, if a CSR needs to be approved; a `Pending` CSR would need to be approved.