
Failing secret webhook causes unexpected behavior #4266

Closed
Oats87 opened this issue May 19, 2023 · 17 comments
Labels
kind/rancher-integration Needed to support Rancher integration

Comments

@Oats87
Contributor

Oats87 commented May 19, 2023

Environmental Info:
RKE2 Version: v1.26.4+rke2r1

Node(s) CPU architecture, OS, and Version:
Not relevant.

Cluster Configuration:
1 server node

Describe the bug:
The existence of a set of failing webhook configurations in a cluster prevents the RKE2 server from coming up successfully, but rke2-server still happily starts up and pretends that nothing is wrong.

Steps To Reproduce:

  1. Install RKE2 server
  2. Let it come up and become healthy
  3. Remove the kube-controller-manager.yaml static pod manifest from /var/lib/rancher/rke2/agent/pod-manifests
  4. Restart rke2-server
  5. See that kube-controller-manager never comes back online

Expected behavior:
rke2-server errors out on startup, reports the failure, or does something more useful than just silently failing.

Actual behavior:
rke2-server silently fails to create the kube-controller-manager manifest, and the coordinator is stuck without proper feedback that rke2-server is not actually healthy, short of looking at symptoms.

Additional context / logs:
I hit this when manually validating:

The rke2-server log message was:

May 19 15:03:43 kskcm-pool1-2876d649-4xn4n rke2[17126]: time="2023-05-19T15:03:43Z" level=warning msg="Failed to create Kubernetes secret: Internal error occurred: failed calling webhook \"rancher.cattle.io.secrets\": failed to call webhook: Post \"https://rancher-webhook.cattle-system.svc:443/v1/webhook/mutation/secrets?timeout=10s\": context deadline exceeded"

There is a corresponding rancher/rancher issue filed here: rancher/rancher#41613, but while that will fix this instance of the "storm", it is not a holistic solution, as there may be other webhook configurations that manipulate secrets (OPA?).

@Oats87 Oats87 changed the title Failing secret webhook causes inconsistent RKE2 startup behavior Failing secret webhook causes unexpected RKE2 startup behavior May 19, 2023
@brandond
Member

@Oats87 can you post the complete log? I am not familiar with that specific error and I'm not finding it anywhere in either the K3s or RKE2 codebase. I can't think of what secret we would need to create in order for the kube-controller-manager manifest to be dropped. The creation of static pod manifests is a completely standalone operation without any dependencies on the apiserver being available.

@brandond
Member

I suspect that this is coming from the dynamiclistener secret store; it is failing to update the cert secret, and for some reason that’s preventing the supervisor listener from coming up all the way, so it is blocking on the ready check before starting the rest of the control-plane components.

@brandond
Member

It appears that this is being fixed on the rancher side by rancher/webhook#240 - it seems that the downstream cluster webhook was not intended to include rules requiring it to be called on creation of secrets.

@brandond
Member

brandond commented May 22, 2023

I am testing with a bogus webhook that blocks secret creates:

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: rancher.cattle.io
webhooks:
- admissionReviewVersions:
  - v1
  - v1beta1
  clientConfig:
    url: https://httpbin.org/status/502
  failurePolicy: Fail
  matchPolicy: Equivalent
  name: rancher.cattle.io.secrets
  namespaceSelector: {}
  objectSelector: {}
  reinvocationPolicy: Never
  rules:
  - apiGroups:
    - ""
    apiVersions:
    - v1
    operations:
    - CREATE
    resources:
    - secrets
    scope: Namespaced
  sideEffects: NoneOnDryRun
  timeoutSeconds: 5

When node registration is blocked by the failing webhook, the error on the server doesn't make it clear what's going on:

May 22 22:15:40 systemd-node-1 rke2[223]: time="2023-05-22T22:15:40Z" level=error msg="Internal error occurred: failed calling webhook \"rancher.cattle.io.secrets\": failed to call webhook: an error on the server (\"\") has prevented the request from succeeding"

The error log on the agent does indicate that the password was rejected:

May 22 22:17:15 systemd-node-2 rke2[222]: time="2023-05-22T22:17:15Z" level=info msg="Waiting to retrieve agent configuration; server is not ready: Node password rejected, duplicate hostname or contents of '/etc/rancher/node/password' may not match server node-passwd entry, try enabling a unique node name with the --with-node-id flag"

I am not able to get RKE2 stuck in a situation where it won't recreate the control-plane static pods. I do see an error from dynamiclistener when it tries to create the secret, but this is just a warning, and all subsequent writes use Update which works since it is not blocked by the webhook: https://github.com/rancher/dynamiclistener/blob/2b62d5cc694d566dd8f3f67eb7b0f6bb46266a65/storage/kubernetes/controller.go#L108

May 22 22:22:08 systemd-node-1 rke2[13757]: time="2023-05-22T22:22:08Z" level=info msg="Starting /v1, Kind=ServiceAccount controller"
May 22 22:22:09 systemd-node-1 rke2[13757]: time="2023-05-22T22:22:09Z" level=warning msg="Failed to create Kubernetes secret: Internal error occurred: failed calling webhook \"rancher.cattle.io.secrets\": failed to call webhook: an error on the server (\"<html>\\r\\n<head><title>502 Bad Gateway</title></head>\\r\\n<body>\\r\\n<center><h1>502 Bad Gateway</h1></center>\\r\\n</body>\\r\\n</html>\") has prevented the request from succeeding"
May 22 22:22:09 systemd-node-1 rke2[13757]: time="2023-05-22T22:22:09Z" level=info msg="Starting /v1, Kind=Secret controller"
May 22 22:22:09 systemd-node-1 rke2[13757]: time="2023-05-22T22:22:09Z" level=info msg="Updating TLS secret for kube-system/rke2-serving (count: 11): map[listener.cattle.io/cn-10.43.0.1:10.43.0.1 listener.cattle.io/cn-127.0.0.1:127.0.0.1 listener.cattle.io/cn-172.17.0.3:172.17.0.3 listener.cattle.io/cn-__1-f16284:::1 listener.cattle.io/cn-fd7c_53a5_aef5__242_ac11_3-d08c40:fd7c:53a5:aef5::242:ac11:3 listener.cattle.io/cn-kubernetes:kubernetes listener.cattle.io/cn-kubernetes.default:kubernetes.default listener.cattle.io/cn-kubernetes.default.svc:kubernetes.default.svc listener.cattle.io/cn-kubernetes.default.svc.cluster.local:kubernetes.default.svc.cluster.local listener.cattle.io/cn-localhost:localhost listener.cattle.io/cn-systemd-node-1:systemd-node-1 listener.cattle.io/fingerprint:SHA1=45D709FF6B1AFADAA2D4CBBF4D5F9B913C442CC7]"
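
For reference, a minimal sketch of that create-then-update behavior (illustrative only, not the actual dynamiclistener implementation; the function and client names are assumptions): a failed Create is only logged as a warning, and the write falls back to Update, which the webhook's CREATE-only rules do not intercept and which succeeds here because the kube-system/rke2-serving secret already exists.

// Illustrative sketch only - not the actual dynamiclistener code.
package storage

import (
	"context"

	"github.com/sirupsen/logrus"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	typedcorev1 "k8s.io/client-go/kubernetes/typed/core/v1"
)

func saveSecret(ctx context.Context, secrets typedcorev1.SecretInterface, secret *corev1.Secret) (*corev1.Secret, error) {
	created, err := secrets.Create(ctx, secret, metav1.CreateOptions{})
	if err == nil {
		return created, nil
	}
	// Create was blocked (e.g. by the failing webhook) or the secret already
	// exists: warn and fall back to Update instead of failing the caller.
	logrus.Warnf("Failed to create Kubernetes secret: %v", err)
	return secrets.Update(ctx, secret, metav1.UpdateOptions{})
}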

The static pods all start up fine:

systemd-node-1:/ # kubectl get pod -n kube-system -l tier=control-plane -o wide
NAME                                      READY   STATUS    RESTARTS        AGE     IP           NODE             NOMINATED NODE   READINESS GATES
cloud-controller-manager-systemd-node-1   1/1     Running   1 (5m41s ago)   5m17s   172.17.0.3   systemd-node-1   <none>           <none>
etcd-systemd-node-1                       1/1     Running   1 (5m41s ago)   13m     172.17.0.3   systemd-node-1   <none>           <none>
kube-apiserver-systemd-node-1             1/1     Running   1 (5m41s ago)   13m     172.17.0.3   systemd-node-1   <none>           <none>
kube-controller-manager-systemd-node-1    1/1     Running   1 (5m41s ago)   5m17s   172.17.0.3   systemd-node-1   <none>           <none>
kube-proxy-systemd-node-1                 1/1     Running   1 (5m41s ago)   5m18s   172.17.0.3   systemd-node-1   <none>           <none>
kube-proxy-systemd-node-2                 1/1     Running   0               9m52s   172.17.0.4   systemd-node-2   <none>           <none>
kube-scheduler-systemd-node-1             1/1     Running   1 (5m41s ago)   5m17s   172.17.0.3   systemd-node-1   <none>           <none>

@brandond brandond added this to the v1.27.3+rke2r1 milestone May 22, 2023
@brandond brandond self-assigned this May 22, 2023
@brandond brandond added the kind/rancher-integration Needed to support Rancher integration label May 22, 2023
@brandond
Member

brandond commented May 22, 2023

For the node password creation case, I think we could address this by soft failing with a warning if there is an error when creating the node password secret:
https://github.com/k3s-io/k3s/blob/91c5e0d75a009115cb33f281dc4fc1a15b80da69/pkg/nodepassword/nodepassword.go#L44-L45

I would defer to @macedogm on the security implications of this. Node password secrets are a protection that we have added to ensure that one node cannot impersonate another by joining the cluster with an existing name. The attacker would need to know the node's password, in addition to its hostname.

If we just retried the secret creation in the background, instead of failing immediately, this could address the concern around webhooks blocking node joins, while still eventually ensuring that the secret is created once the new node is up and the webhook outage is resolved.
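
A minimal sketch of what that background retry could look like (hypothetical function and helper names, not the actual k3s nodepassword code; the 30-second interval is an arbitrary choice):

// Hypothetical sketch of "soft fail + retry in the background" for the node
// password secret. Names and retry interval are illustrative.
package nodepassword

import (
	"context"
	"time"

	"github.com/sirupsen/logrus"
	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	typedcorev1 "k8s.io/client-go/kubernetes/typed/core/v1"
)

func ensureSecretAsync(ctx context.Context, secrets typedcorev1.SecretInterface, secret *corev1.Secret) {
	if _, err := secrets.Create(ctx, secret, metav1.CreateOptions{}); err == nil || apierrors.IsAlreadyExists(err) {
		return
	} else {
		// Do not fail the node join; warn and keep retrying until the webhook
		// outage is resolved.
		logrus.Warnf("Failed to create node password secret, will retry in background: %v", err)
	}
	go func() {
		ticker := time.NewTicker(30 * time.Second)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				if _, err := secrets.Create(ctx, secret, metav1.CreateOptions{}); err == nil || apierrors.IsAlreadyExists(err) {
					logrus.Infof("Node password secret %s/%s created", secret.Namespace, secret.Name)
					return
				}
			}
		}
	}()
}

Retrying with Create (rather than Update) keeps the create-once semantics of the node password secret, and a retry cap could be layered on top if needed.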

@macedogm
Member

@brandond I wasn't able to understand exactly what behavior occurs when the password creation fails. Will RKE2 proceed to create the node with an empty password, fail to create the node entirely, or go into an unknown state?

Nevertheless, I believe that the right approach should be:

  1. Retry creating the node's password (max-retry logic is needed to avoid leaving the process in a retry loop).
  2. If the retry limit is reached, the entire process should fail with a clear message.

@brandond
Member

@macedogm the workflow is:

  1. Agent sends node password to server when initially joining
  2. Server stores node password in a Kubernetes secret
  3. Any subsequent attempts to join the cluster using the same node name must use the same password (until the Kubernetes node and node password secret are deleted)

It is essentially a shared secret that agents must continue to reuse when connecting to the cluster. Failing to save this does not materially weaken any of the core Kubernetes security model, but it does potentially allow multiple nodes to join with the same node name, or for an attacker that has compromised one node to configure it to rejoin the cluster as another node in order to attract privileged workloads (although there are many other more serious attacks that someone could mount if they had control over an agent node).
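
Conceptually (a rough sketch with hypothetical names and an illustrative secret name/hash scheme, not the actual k3s/rke2 code), the check boils down to "create on first join, compare on every later join":

// Conceptual sketch only - the "secrets" client is assumed to be scoped to
// the kube-system namespace; the secret name and hash scheme are illustrative.
package nodepassword

import (
	"context"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	typedcorev1 "k8s.io/client-go/kubernetes/typed/core/v1"
)

func hashPassword(password string) string {
	sum := sha256.Sum256([]byte(password))
	return hex.EncodeToString(sum[:])
}

func verifyNodePassword(ctx context.Context, secrets typedcorev1.SecretInterface, nodeName, password string) error {
	name := nodeName + ".node-password" // illustrative secret name
	existing, err := secrets.Get(ctx, name, metav1.GetOptions{})
	if apierrors.IsNotFound(err) {
		// First join for this node name: persist the password hash. This is
		// the Create that a failing secret webhook can block.
		_, err := secrets.Create(ctx, &corev1.Secret{
			ObjectMeta: metav1.ObjectMeta{Name: name},
			StringData: map[string]string{"hash": hashPassword(password)},
		}, metav1.CreateOptions{})
		return err
	} else if err != nil {
		return err
	}
	// Subsequent joins: the presented password must match the stored hash.
	if subtle.ConstantTimeCompare(existing.Data["hash"], []byte(hashPassword(password))) != 1 {
		return fmt.Errorf("node password rejected for %s", nodeName)
	}
	return nil
}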

@macedogm
Member

macedogm commented May 24, 2023

@brandond thanks for the explanation.

although there are many other more serious attacks that someone could mount if they had control over an agent node

Agree.

Failing to save this does not materially weaken any of the core Kubernetes security model, but it does potentially allow multiple nodes to join with the same node name

Based on the previous comment, there are much worse attacks that can happen, but given that we provide at least this basic protection of checking the <node's name + node's password> combination, I recommend that we go with the approach that you proposed earlier:

If we just retried the secret creation in the background, instead of failing immediately, this could address the concern around webhooks blocking node joins, while still eventually ensuring that the secret is created once the new node is up and the webhook outage is resolved.

Apologies for making this thread longer, but I would just like to be sure that creating the secret in the background, plus eventually ensuring that the secret is created, will not open the cluster and node to a potential timing attack. While the password secret is still being created and the webhook is down, the plan is to have the node wait to join until the secret is fully created, right?

@brandond
Member

While the password secret is still being created and the webhook is down, the plan is to have the node wait to join until the secret is fully created, right?

There is a chicken-and-egg issue here, in that we need to allow hosts to join while the webhook is down, because there may not be any host for the webhook pod to run on, until a new node is joined to the cluster. The proposal is to allow the node to join even when the secret is still pending creation, with an awareness of the fact that this does create a window where the node password protection is not enforced - the period of time between the node joining, and the webhook coming up and allowing the secret to be created.

@macedogm
Member

It's a tricky situation indeed. I guess this mainly affects RKE2 in connection with Rancher, unless other webhooks that provide the same kind of validation are configured. Do you see this happening in other scenarios outside of Rancher?

I fail to see a way to 100% mitigate this without causing a possible deadlock. Suppose a timing attack happens and a malicious user adds a second node that has the same name as a previously added node. I believe such an attack would actually be blocked by K8s due to the name uniqueness property, right?

If this is true, would it still make sense to implement some kind of "post-webhook is running again" validation to warn about such situations?

I'm just being overzealous, because as we discussed before, if this type of attack happens, the malicious user already has certain privileges in the environment that would allow them to execute other kinds of attacks.

Not sure if a note in the docs would be needed, given the K8s property mentioned above.

@brandond
Member

If this is true, would it still make sense to implement some kind of "post-webhook is running again" validation to warn about such situations?

Yeah, that's probably reasonable. There are currently other code paths that just warn if the node password can't be validated, such as when the etcd or apiserver nodes are starting up and we cannot access secrets yet. If there is a failure, we just log it. We could probably enhance all of these to create a Kubernetes event or something?
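
For example, a minimal sketch using client-go's event recorder (the component name, reason string, and function names are assumptions, not existing rke2 code):

// Sketch of emitting a warning Event alongside the existing log warning.
package nodepassword

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/kubernetes/scheme"
	typedcorev1 "k8s.io/client-go/kubernetes/typed/core/v1"
	"k8s.io/client-go/tools/record"
)

func newRecorder(client kubernetes.Interface) record.EventRecorder {
	broadcaster := record.NewBroadcaster()
	// Send events to the apiserver so users can watch and alert on them.
	broadcaster.StartRecordingToSink(&typedcorev1.EventSinkImpl{Interface: client.CoreV1().Events("")})
	return broadcaster.NewRecorder(scheme.Scheme, corev1.EventSource{Component: "rke2-supervisor"})
}

func warnNodePassword(recorder record.EventRecorder, node *corev1.Node, err error) {
	// Surfaces the failure as a warning Event on the Node, in addition to the log line.
	recorder.Eventf(node, corev1.EventTypeWarning, "NodePasswordValidation",
		"unable to verify node password: %v", err)
}

Events created this way would show up in kubectl get events and kubectl describe node, which is easier to monitor and alert on than scraping the rke2 journal.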

@macedogm
Member

macedogm commented May 26, 2023

We could probably enhance all of these to create a Kubernetes event or something?

That would be amazing if we could do it. I guess a warning event in such situations would be the way to go, providing a better way for users who want to monitor and alert on these scenarios.

How should we proceed from here, please?

@brandond brandond changed the title Failing secret webhook causes unexpected RKE2 startup behavior Failing secret webhook causes unexpected behavior May 31, 2023
@brandond
Member

brandond commented Jun 1, 2023

A PR has been opened on the K3s side.

@bguzman-3pillar bguzman-3pillar self-assigned this Jun 1, 2023
@brandond
Member

brandond commented Jun 8, 2023

/backport v1.26.6+rke2r1

@brandond
Member

brandond commented Jun 8, 2023

/backport v1.25.11+rke2r1

@brandond
Member

brandond commented Jun 8, 2023

/backport v1.24.15+rke2r1

@bguzman-3pillar
Contributor

Validated on 02f58981a09e61f578c306c4e417aa336bf170eb

rke2 -v
rke2 version v1.27.3-rc2+rke2r1 (02f58981a09e61f578c306c4e417aa336bf170eb)
go version go1.20.5 X:boringcrypto

Environment Details

Infrastructure

  • Cloud
  • Hosted

Node(s) CPU architecture, OS, and Version:

Ubuntu

Cluster Configuration:

1 server, 1 agent

Additional files

$ cat webh.yaml 
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: rancher.cattle.io
webhooks:
- admissionReviewVersions:
  - v1
  - v1beta1
  clientConfig:
    url: https://httpbin.org/status/502
  failurePolicy: Fail
  matchPolicy: Equivalent
  name: rancher.cattle.io.secrets
  namespaceSelector: {}
  objectSelector: {}
  reinvocationPolicy: Never
  rules:
  - apiGroups:
    - ""
    apiVersions:
    - v1
    operations:
    - CREATE
    resources:
    - secrets
    scope: Namespaced
  sideEffects: NoneOnDryRun
  timeoutSeconds: 5

Testing Steps

  1. Install rke2
  2. Create a bad webhook configuration (this one just calls out to httpbin.org for a 502 error response)
  3. Attempt to join a new agent

Validation Results:

  • rke2 version used for validation:
$ kubectl get node,pod -A 
NAME                    STATUS   ROLES                       AGE     VERSION
node/ip-172-31-17-193   Ready    <none>                      114s    v1.27.3+rke2r1
node/ip-172-31-21-158   Ready    control-plane,etcd,master   7m49s   v1.27.3+rke2r1

NAMESPACE     NAME                                                        READY   STATUS      RESTARTS   AGE
kube-system   pod/cloud-controller-manager-ip-172-31-21-158               1/1     Running     0          7m28s
kube-system   pod/etcd-ip-172-31-21-158                                   1/1     Running     0          7m28s
kube-system   pod/helm-install-rke2-canal-8mc5b                           0/1     Completed   0          7m35s
kube-system   pod/helm-install-rke2-coredns-98wsv                         0/1     Completed   0          7m35s
kube-system   pod/helm-install-rke2-ingress-nginx-lzhrf                   0/1     Completed   0          7m35s
kube-system   pod/helm-install-rke2-metrics-server-x5qbl                  0/1     Completed   0          7m35s
kube-system   pod/helm-install-rke2-snapshot-controller-crd-ssdwb         0/1     Completed   0          7m35s
kube-system   pod/helm-install-rke2-snapshot-controller-sbnms             0/1     Completed   1          7m35s
kube-system   pod/helm-install-rke2-snapshot-validation-webhook-7krnp     0/1     Completed   0          7m35s
kube-system   pod/kube-apiserver-ip-172-31-21-158                         1/1     Running     0          7m5s
kube-system   pod/kube-controller-manager-ip-172-31-21-158                1/1     Running     0          7m1s
kube-system   pod/kube-proxy-ip-172-31-17-193                             1/1     Running     0          114s
kube-system   pod/kube-proxy-ip-172-31-21-158                             1/1     Running     0          6m55s
kube-system   pod/kube-scheduler-ip-172-31-21-158                         1/1     Running     0          6m57s
kube-system   pod/rke2-canal-cjbl4                                        2/2     Running     0          114s
kube-system   pod/rke2-canal-l8pcq                                        2/2     Running     0          7m21s
kube-system   pod/rke2-coredns-rke2-coredns-5f5d6b54c7-5svsp              1/1     Running     0          7m23s
kube-system   pod/rke2-coredns-rke2-coredns-5f5d6b54c7-prjfk              1/1     Running     0          113s
kube-system   pod/rke2-coredns-rke2-coredns-autoscaler-6bf8f59fd5-84wjg   1/1     Running     0          7m23s
kube-system   pod/rke2-ingress-nginx-controller-7b6s6                     1/1     Running     0          95s
kube-system   pod/rke2-ingress-nginx-controller-vbf4v                     1/1     Running     0          6m27s
kube-system   pod/rke2-metrics-server-6d79d977db-m7dsz                    1/1     Running     0          6m46s
kube-system   pod/rke2-snapshot-controller-7d6476d7cb-dnldv               1/1     Running     0          6m35s
kube-system   pod/rke2-snapshot-validation-webhook-6c5f6cf5d8-9f7z7       1/1     Running     0          6m37s
