From faeaf0eef2e7452e705313e5ff4e62771904b39b Mon Sep 17 00:00:00 2001 From: Richard Wall Date: Fri, 20 Oct 2023 17:14:45 +0100 Subject: [PATCH 1/7] Explain why and how to isolate the cert-manager workloads Signed-off-by: Richard Wall --- content/docs/installation/best-practice.md | 103 +++++++++++++++++++++ 1 file changed, 103 insertions(+) diff --git a/content/docs/installation/best-practice.md b/content/docs/installation/best-practice.md index c59a16d8424..7ecc4daf2ad 100644 --- a/content/docs/installation/best-practice.md +++ b/content/docs/installation/best-practice.md @@ -22,6 +22,109 @@ are designed for backwards compatibility rather than for best practice or maximu You may find that the default resources do not comply with the security policy on your Kubernetes cluster and in that case you can modify the installation configuration using Helm chart values to override the defaults. +## Isolate cert-manager on dedicated node pools + +cert-manager is a cluster scoped operator and you should treat it as part of your platform control plane. + +The cert-manager controller caches all the Secret resources of the cluster in memory, +so if an untrusted / malicious workload were to be scheduled to the same Node as the controller, +and somehow gain privileged access to the underlying node, +it may be able to read the secrets from memory. +You can mitigate this risk by running cert-manager on nodes that are reserved for trusted platform operators. + +This can be achieved using node taints and node affinity to schedule cert-manager Pods to Nodes +which are dedicated to running your platform components. +A node taint tells Kubernetes to avoid scheduling Pods without a corresponding toleration on those nodes. +The node affinity on Pods tells Kubernetes to schedule those Pods on the dedicated nodes. + +The Helm chart for cert-manager has parameters to configure the `tolerations` and `nodeAffinity` for each component. +The exact values of these parameters will depend on you particular cluster. +For example, if you have a pool of nodes +labelled with `kubectl label node ... node-restriction.kubernetes.io/reserved-for=platform` and +tainted with `kubectl taint node ... 
node-restriction.kubernetes.io/reserved-for=platform:NoExecute`, +you can add the following affinity and tolerations values to allow cert-manager Pods to run on those nodes: + +```yaml +affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: node-restriction.kubernetes.io/reserved-for + operator: In + values: + - platform +tolerations: +- key: node-restriction.kubernetes.io/reserved-for + operator: Equal + value: platform + +webhook: + affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: node-restriction.kubernetes.io/reserved-for + operator: In + values: + - platform + tolerations: + - key: node-restriction.kubernetes.io/reserved-for + operator: Equal + value: platform + +cainjector: + affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: node-restriction.kubernetes.io/reserved-for + operator: In + values: + - platform + tolerations: + - key: node-restriction.kubernetes.io/reserved-for + operator: Equal + value: platform + +startupapicheck: + affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: node-restriction.kubernetes.io/reserved-for + operator: In + values: + - platform + tolerations: + - key: node-restriction.kubernetes.io/reserved-for + operator: Equal + value: platform +``` + +> 📖 Read more about [Taints and Tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) +> in the Kubernetes documentation. +> +> 📖 Read the [Guide to isolating tenant workloads to specific nodes](https://aws.github.io/aws-eks-best-practices/security/docs/multitenancy/#isolating-tenant-workloads-to-specific-nodes) +> in the EKS Best Practice Guides, +> for an in-depth explanation of these techniques. +> +> 📖 Learn how to [Isolate your workloads in dedicated node pools](https://cloud.google.com/kubernetes-engine/docs/how-to/isolate-workloads-dedicated-nodes) on Google Kubernetes Engine. +> +> 📖 Read more about the [`node-restriction.kubernetes.io/` prefix and the `NodeRestriction` admission plugin](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction). +> +> ℹī¸ On a multi-tenant cluster, +> consider enabling the [`PodTolerationRestriction` plugin](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#podtolerationrestriction) +> to limit which tolerations tenants may add to their Pods. +> You may also use that plugin to add default tolerations to the `cert-manager` namespace, +> which obviates the need to explicitly set the tolerations in the Helm chart. +> +> ℹī¸ Alternatively, you could use Kyverno to limit which tolerations Pods are allowed to use. +> Read [Restrict control plane scheduling](https://kyverno.io/policies/other/res/restrict-controlplane-scheduling/restrict-controlplane-scheduling/) as a starting point. + ## High Availability cert-manager has three long-running components: controller, cainjector, and webhook. 
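For illustration, Helm values like the ones added above are typically saved to a local file and passed to `helm upgrade --install`. The sketch below assumes the standard installation route from the cert-manager docs: the `jetstack` chart repository, a release named `cert-manager` in the `cert-manager` namespace, and a hypothetical `values.yaml` holding the overrides. Adjust these to your own setup:

```bash
# Sketch only: install (or upgrade) cert-manager with the scheduling
# overrides shown above stored in a local values.yaml.
helm repo add jetstack https://charts.jetstack.io
helm repo update

helm upgrade --install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true \
  --values values.yaml
```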
From ac18af6b568affc4e6f35b6cc5f41d5deb13356c Mon Sep 17 00:00:00 2001 From: Richard Wall Date: Mon, 23 Oct 2023 12:42:22 +0100 Subject: [PATCH 2/7] Use nodeSelector instead of affinity, for simplicity Signed-off-by: Richard Wall --- content/docs/installation/best-practice.md | 80 +++++++++------------- 1 file changed, 32 insertions(+), 48 deletions(-) diff --git a/content/docs/installation/best-practice.md b/content/docs/installation/best-practice.md index 7ecc4daf2ad..86926259aae 100644 --- a/content/docs/installation/best-practice.md +++ b/content/docs/installation/best-practice.md @@ -25,86 +25,70 @@ and in that case you can modify the installation configuration using Helm chart ## Isolate cert-manager on dedicated node pools cert-manager is a cluster scoped operator and you should treat it as part of your platform control plane. +The cert-manager controller creates and modifies Kubernetes Secret resources +and the controller and cainjector both cache TLS Secret resources in memory. +These are two reasons why you should consider isolating the cert-manager components from +other less privileged workloads. +For example, if an untrusted or malicious workload runs on the same Node as the cert-manager controller, +and somehow gains root access to the underlying node, +it may be able to read the private keys in Secrets that the controller has cached in memory. -The cert-manager controller caches all the Secret resources of the cluster in memory, -so if an untrusted / malicious workload were to be scheduled to the same Node as the controller, -and somehow gain privileged access to the underlying node, -it may be able to read the secrets from memory. You can mitigate this risk by running cert-manager on nodes that are reserved for trusted platform operators. +This can be achieved using a combination of Node taints, Pod tolerations and Pod node selector settings. +* A Node `taint` tells the Kubernetes scheduler to *exclude* Pods from a Node, by default. +* A Pod `toleration` tells the Kubernetes scheduler to *allow* Pods on the tainted Node. +* A Pod `nodeSelector` tells the Kubernetes scheduler to *place* Pods on a Node with matching labels. -This can be achieved using node taints and node affinity to schedule cert-manager Pods to Nodes -which are dedicated to running your platform components. -A node taint tells Kubernetes to avoid scheduling Pods without a corresponding toleration on those nodes. -The node affinity on Pods tells Kubernetes to schedule those Pods on the dedicated nodes. - -The Helm chart for cert-manager has parameters to configure the `tolerations` and `nodeAffinity` for each component. -The exact values of these parameters will depend on you particular cluster. +The Helm chart for cert-manager has parameters to configure the Pod `tolerations` and `nodeSelector` for each component. +The exact values of these parameters will depend on your particular cluster. For example, if you have a pool of nodes labelled with `kubectl label node ... node-restriction.kubernetes.io/reserved-for=platform` and tainted with `kubectl taint node ... 
node-restriction.kubernetes.io/reserved-for=platform:NoExecute`, -you can add the following affinity and tolerations values to allow cert-manager Pods to run on those nodes: +you can use the following values to run cert-manager Pods on those nodes: ```yaml -affinity: - nodeAffinity: - requiredDuringSchedulingIgnoredDuringExecution: - nodeSelectorTerms: - - matchExpressions: - - key: node-restriction.kubernetes.io/reserved-for - operator: In - values: - - platform +nodeSelector: + kubernetes.io/os: linux + node-restriction.kubernetes.io/reserved-for: platform tolerations: - key: node-restriction.kubernetes.io/reserved-for operator: Equal value: platform webhook: - affinity: - nodeAffinity: - requiredDuringSchedulingIgnoredDuringExecution: - nodeSelectorTerms: - - matchExpressions: - - key: node-restriction.kubernetes.io/reserved-for - operator: In - values: - - platform + nodeSelector: + kubernetes.io/os: linux + node-restriction.kubernetes.io/reserved-for: platform tolerations: - key: node-restriction.kubernetes.io/reserved-for operator: Equal value: platform cainjector: - affinity: - nodeAffinity: - requiredDuringSchedulingIgnoredDuringExecution: - nodeSelectorTerms: - - matchExpressions: - - key: node-restriction.kubernetes.io/reserved-for - operator: In - values: - - platform + nodeSelector: + kubernetes.io/os: linux + node-restriction.kubernetes.io/reserved-for: platform tolerations: - key: node-restriction.kubernetes.io/reserved-for operator: Equal value: platform startupapicheck: - affinity: - nodeAffinity: - requiredDuringSchedulingIgnoredDuringExecution: - nodeSelectorTerms: - - matchExpressions: - - key: node-restriction.kubernetes.io/reserved-for - operator: In - values: - - platform + nodeSelector: + kubernetes.io/os: linux + node-restriction.kubernetes.io/reserved-for: platform tolerations: - key: node-restriction.kubernetes.io/reserved-for operator: Equal value: platform ``` +> ℹī¸ This example uses `nodeSelector` to *place* the Pods but you could also use `affinity.nodeAffinity`. +> `nodeSelector` is chosen here because it has a simpler syntax. +> +> ℹī¸ The default `nodeSelector` value `kubernetes.io/os: linux` [avoids placing cert-manager Pods on Windows nodes in a mixed OS cluster](https://github.com/cert-manager/cert-manager/pull/3605), +> so that must be explicitly included here too. +> > 📖 Read more about [Taints and Tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) > in the Kubernetes documentation. > From 6fd4d3f98c9334fffdde484911543f4f764b6eaa Mon Sep 17 00:00:00 2001 From: Richard Wall Date: Mon, 23 Oct 2023 12:52:08 +0100 Subject: [PATCH 3/7] Link to all the general documentation sites in addition to specific pages Signed-off-by: Richard Wall --- content/docs/installation/best-practice.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/content/docs/installation/best-practice.md b/content/docs/installation/best-practice.md index 86926259aae..3834bf9c535 100644 --- a/content/docs/installation/best-practice.md +++ b/content/docs/installation/best-practice.md @@ -24,7 +24,7 @@ and in that case you can modify the installation configuration using Helm chart ## Isolate cert-manager on dedicated node pools -cert-manager is a cluster scoped operator and you should treat it as part of your platform control plane. +cert-manager is a cluster scoped operator and you should treat it as part of your platform's control plane. 
The cert-manager controller creates and modifies Kubernetes Secret resources and the controller and cainjector both cache TLS Secret resources in memory. These are two reasons why you should consider isolating the cert-manager components from @@ -90,13 +90,13 @@ startupapicheck: > so that must be explicitly included here too. > > 📖 Read more about [Taints and Tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) -> in the Kubernetes documentation. +> in the [Kubernetes documentation](https://kubernetes.io/docs/). > > 📖 Read the [Guide to isolating tenant workloads to specific nodes](https://aws.github.io/aws-eks-best-practices/security/docs/multitenancy/#isolating-tenant-workloads-to-specific-nodes) -> in the EKS Best Practice Guides, +> in the [EKS Best Practice Guides](https://aws.github.io/aws-eks-best-practices/), > for an in-depth explanation of these techniques. > -> 📖 Learn how to [Isolate your workloads in dedicated node pools](https://cloud.google.com/kubernetes-engine/docs/how-to/isolate-workloads-dedicated-nodes) on Google Kubernetes Engine. +> 📖 Learn how to [Isolate your workloads in dedicated node pools](https://cloud.google.com/kubernetes-engine/docs/how-to/isolate-workloads-dedicated-nodes) on [Google Kubernetes Engine](https://cloud.google.com/kubernetes-engine/docs/). > > 📖 Read more about the [`node-restriction.kubernetes.io/` prefix and the `NodeRestriction` admission plugin](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction). > @@ -106,7 +106,7 @@ startupapicheck: > You may also use that plugin to add default tolerations to the `cert-manager` namespace, > which obviates the need to explicitly set the tolerations in the Helm chart. > -> ℹī¸ Alternatively, you could use Kyverno to limit which tolerations Pods are allowed to use. +> ℹī¸ Alternatively, you could use [Kyverno](https://kyverno.io/docs/) to limit which tolerations Pods are allowed to use. > Read [Restrict control plane scheduling](https://kyverno.io/policies/other/res/restrict-controlplane-scheduling/restrict-controlplane-scheduling/) as a starting point. ## High Availability From 4195d60ec0538ccf95de0ba429a2ed9cb6690726 Mon Sep 17 00:00:00 2001 From: Richard Wall Date: Mon, 23 Oct 2023 14:20:33 +0100 Subject: [PATCH 4/7] Update content/docs/installation/best-practice.md Co-authored-by: Josh Soref <2119212+jsoref@users.noreply.github.com> Signed-off-by: Richard Wall --- content/docs/installation/best-practice.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/docs/installation/best-practice.md b/content/docs/installation/best-practice.md index 3834bf9c535..87cac871ce7 100644 --- a/content/docs/installation/best-practice.md +++ b/content/docs/installation/best-practice.md @@ -39,7 +39,7 @@ This can be achieved using a combination of Node taints, Pod tolerations and Pod * A Pod `toleration` tells the Kubernetes scheduler to *allow* Pods on the tainted Node. * A Pod `nodeSelector` tells the Kubernetes scheduler to *place* Pods on a Node with matching labels. -The Helm chart for cert-manager has parameters to configure the Pod `tolerations` and `nodeSelector` for each component. +The Helm chart for cert-manager has parameters to configure the Pod `tolerations` and `nodeSelector` for each component. The exact values of these parameters will depend on your particular cluster. For example, if you have a pool of nodes labelled with `kubectl label node ... 
node-restriction.kubernetes.io/reserved-for=platform` and From 0a539e57c55c631c2ab29af292e1090ab5f37017 Mon Sep 17 00:00:00 2001 From: Richard Wall Date: Tue, 24 Oct 2023 08:18:08 +0100 Subject: [PATCH 5/7] Make it clear that the use of taints and toleration is only an example Signed-off-by: Richard Wall --- content/docs/installation/best-practice.md | 30 ++++++++++++++++------ 1 file changed, 22 insertions(+), 8 deletions(-) diff --git a/content/docs/installation/best-practice.md b/content/docs/installation/best-practice.md index 87cac871ce7..2d9d6393534 100644 --- a/content/docs/installation/best-practice.md +++ b/content/docs/installation/best-practice.md @@ -34,17 +34,30 @@ and somehow gains root access to the underlying node, it may be able to read the private keys in Secrets that the controller has cached in memory. You can mitigate this risk by running cert-manager on nodes that are reserved for trusted platform operators. -This can be achieved using a combination of Node taints, Pod tolerations and Pod node selector settings. -* A Node `taint` tells the Kubernetes scheduler to *exclude* Pods from a Node, by default. -* A Pod `toleration` tells the Kubernetes scheduler to *allow* Pods on the tainted Node. -* A Pod `nodeSelector` tells the Kubernetes scheduler to *place* Pods on a Node with matching labels. The Helm chart for cert-manager has parameters to configure the Pod `tolerations` and `nodeSelector` for each component. The exact values of these parameters will depend on your particular cluster. -For example, if you have a pool of nodes -labelled with `kubectl label node ... node-restriction.kubernetes.io/reserved-for=platform` and -tainted with `kubectl taint node ... node-restriction.kubernetes.io/reserved-for=platform:NoExecute`, -you can use the following values to run cert-manager Pods on those nodes: + +### Example + +This example demonstrates how to use: +`taints` to *repel* non-platform Pods from Nodes which you have reserved for your platform's control-plane, +`tolerations` to *allow* cert-manager Pods to run on those Nodes, and +`nodeSelector` to *place* the cert-manager Pods on those Nodes. + +Label the Nodes: + +```bash +kubectl label node ... node-restriction.kubernetes.io/reserved-for=platform +``` + +Taint the Nodes: + +```bash +kubectl taint node ... node-restriction.kubernetes.io/reserved-for=platform:NoExecute +``` + +Then install cert-manager using the following Helm chart values: ```yaml nodeSelector: @@ -85,6 +98,7 @@ startupapicheck: > ℹī¸ This example uses `nodeSelector` to *place* the Pods but you could also use `affinity.nodeAffinity`. > `nodeSelector` is chosen here because it has a simpler syntax. +> Read [Assigning Pods to Nodes](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/) to learn more. > > ℹī¸ The default `nodeSelector` value `kubernetes.io/os: linux` [avoids placing cert-manager Pods on Windows nodes in a mixed OS cluster](https://github.com/cert-manager/cert-manager/pull/3605), > so that must be explicitly included here too. 
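Once the chart has been installed with values like those in the example, the placement can be spot-checked with plain `kubectl`. This is only a sketch; the label, taint key and `cert-manager` namespace below are the ones assumed by the example above:

```bash
# List the reserved nodes and confirm they carry the expected taint key.
kubectl get nodes -l node-restriction.kubernetes.io/reserved-for=platform \
  -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'

# Check the NODE column to confirm the cert-manager Pods were scheduled
# onto the reserved nodes.
kubectl get pods --namespace cert-manager -o wide
```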
From 5afffded856ebd7effb81fe3fb707e05d2d01afa Mon Sep 17 00:00:00 2001 From: Richard Wall Date: Wed, 25 Oct 2023 08:31:23 +0100 Subject: [PATCH 6/7] Add link to RedHat OpenShift pod placement documentation Signed-off-by: Richard Wall --- content/docs/installation/best-practice.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/content/docs/installation/best-practice.md b/content/docs/installation/best-practice.md index 2d9d6393534..d79c513b486 100644 --- a/content/docs/installation/best-practice.md +++ b/content/docs/installation/best-practice.md @@ -112,6 +112,8 @@ startupapicheck: > > 📖 Learn how to [Isolate your workloads in dedicated node pools](https://cloud.google.com/kubernetes-engine/docs/how-to/isolate-workloads-dedicated-nodes) on [Google Kubernetes Engine](https://cloud.google.com/kubernetes-engine/docs/). > +> 📖 Learn about [Placing pods on specific nodes using node selectors, with RedHat OpenShift](https://docs.openshift.com/container-platform/4.13/nodes/scheduling/nodes-scheduler-node-selectors.html). +> > 📖 Read more about the [`node-restriction.kubernetes.io/` prefix and the `NodeRestriction` admission plugin](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction). > > ℹī¸ On a multi-tenant cluster, From 7b22cba1d9d3f94aaf0f3af0befdf1c24e563e74 Mon Sep 17 00:00:00 2001 From: Richard Wall Date: Wed, 25 Oct 2023 09:08:29 +0100 Subject: [PATCH 7/7] Move links to pod placement docs nearer to where concepts are introduced Thanks @schelv for the suggestion. Signed-off-by: Richard Wall --- content/docs/installation/best-practice.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/content/docs/installation/best-practice.md b/content/docs/installation/best-practice.md index d79c513b486..75558131ac3 100644 --- a/content/docs/installation/best-practice.md +++ b/content/docs/installation/best-practice.md @@ -38,6 +38,12 @@ You can mitigate this risk by running cert-manager on nodes that are reserved fo The Helm chart for cert-manager has parameters to configure the Pod `tolerations` and `nodeSelector` for each component. The exact values of these parameters will depend on your particular cluster. +> 📖 Read [Assigning Pods to Nodes](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/) +> in the [Kubernetes documentation](https://kubernetes.io/docs/). +> +> 📖 Read about [Taints and Tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) +> in the [Kubernetes documentation](https://kubernetes.io/docs/). + ### Example This example demonstrates how to use: @@ -98,14 +104,10 @@ startupapicheck: > ℹī¸ This example uses `nodeSelector` to *place* the Pods but you could also use `affinity.nodeAffinity`. > `nodeSelector` is chosen here because it has a simpler syntax. -> Read [Assigning Pods to Nodes](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/) to learn more. > > ℹī¸ The default `nodeSelector` value `kubernetes.io/os: linux` [avoids placing cert-manager Pods on Windows nodes in a mixed OS cluster](https://github.com/cert-manager/cert-manager/pull/3605), > so that must be explicitly included here too. > -> 📖 Read more about [Taints and Tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) -> in the [Kubernetes documentation](https://kubernetes.io/docs/). 
->
> 📖 Read the [Guide to isolating tenant workloads to specific nodes](https://aws.github.io/aws-eks-best-practices/security/docs/multitenancy/#isolating-tenant-workloads-to-specific-nodes)
> in the [EKS Best Practice Guides](https://aws.github.io/aws-eks-best-practices/),
> for an in-depth explanation of these techniques.
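To make the final note concrete, the linked Kyverno policy can be adapted to the taint key used in this example. The following is a rough sketch rather than a tested policy: the policy name, the list of trusted namespaces and the exact deny condition are assumptions that should be checked against the Kyverno documentation before use:

```yaml
# Sketch of a Kyverno ClusterPolicy that blocks Pods outside trusted
# namespaces from tolerating the reserved-for=platform taint.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-platform-tolerations   # illustrative name
spec:
  validationFailureAction: Enforce
  background: false
  rules:
    - name: deny-reserved-for-platform-toleration
      match:
        any:
          - resources:
              kinds:
                - Pod
      exclude:
        any:
          - resources:
              namespaces:     # trusted platform namespaces; adjust as needed
                - kube-system
                - cert-manager
      validate:
        message: >-
          Only platform components may tolerate the
          node-restriction.kubernetes.io/reserved-for taint.
        deny:
          conditions:
            any:
              - key: node-restriction.kubernetes.io/reserved-for
                operator: AnyIn
                value: "{{ request.object.spec.tolerations[].key || `[]` }}"
```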