EKS tries to create system processes (kube-system namespace) incorrectly on Admiralty virtual nodes #158

matt-slalom opened this issue Jan 10, 2023 · 4 comments

matt-slalom commented Jan 10, 2023

Scenario

  • AWS EKS cluster (K8s 1.24) using spot instance nodes
  • Admiralty 0.15.1

Problem Description

Certain pods get stuck in "Terminating" status, apparently because K8s is trying to schedule processes on the Admiralty virtual nodes. Not sure if this is connected to the use of spot instances (where nodes disappear). If it is connected, I'd expect this would also affect autoscaling groups (unconfirmed).

Issue

Admiralty appears to be confusing EKS. EKS is trying to schedule pods like kube-proxy, ebs-csi-node, and aws-node on Admiralty virtual nodes and failing.

Is there a taint we should be putting on the EKS-supplied nodes?

Observations

Representative pod states after nodes disappear (output truncated):

kubectl get pods -n kube-system                                     
NAME                                 READY   STATUS        RESTARTS   AGE
aws-node-76rwv                       0/1     Terminating   0          4d3h
aws-node-lbsjj                       1/1     Running       0          2d23h
aws-node-mlvh4                       1/1     Running       0          2d22h
aws-node-t8nb2                       0/1     Terminating   0          4d3h
coredns-799c5565b4-6446n             1/1     Running       0          2d22h
coredns-799c5565b4-gltcn             1/1     Running       0          2d23h
ebs-csi-controller-b5d8854df-885vr   6/6     Running       0          2d22h
ebs-csi-controller-b5d8854df-zhsrx   6/6     Running       0          2d23h
ebs-csi-node-g7kv9                   3/3     Running       0          2d22h
ebs-csi-node-jhj6l                   0/3     Terminating   0          4d3h
ebs-csi-node-rpsjr                   3/3     Running       0          2d23h
ebs-csi-node-rqs8r                   0/3     Terminating   0          4d3h
kube-proxy-d6hdx                     0/1     Terminating   0          4d3h
kube-proxy-hr6k7                     0/1     Terminating   0          4d3h
kube-proxy-l26nj                     1/1     Running       0          2d22h
kube-proxy-rbqh8                     1/1     Running       0          2d23h

Investigate one of the stuck pods (note: 10250 is the port used by Admiralty). The x.x.x.114 address no longer exists in the cluster, so I'm guessing it previously belonged to a pod on a node that no longer exists (spot instance).

kubectl logs aws-node-76rwv -n kube-system     
Defaulted container "aws-node" out of: aws-node, aws-vpc-cni-init (init)
Error from server: Get "https://172.16.2.114:10250/containerLogs/kube-system/aws-node-76rwv/aws-node": dial tcp 172.16.2.114:10250: connect: connection refused

Force kill the pod to clean up, and K8s spawns a new one, but it stays stuck in Pending:

kubectl delete pod --force --grace-period=0 -n kube-system aws-node-76rwv                                                                                       
Warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "aws-node-76rwv" force deleted

# wait a while

kubectl get pods -n kube-system aws-node-jcrqf
NAME             READY   STATUS    RESTARTS   AGE
aws-node-jcrqf   0/1     Pending   0          39m

K8s is trying to start the pod on a virtual node for some reason:

kubectl -n kube-system describe pod aws-node-jcrqf |tail -4
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  40m   default-scheduler  Successfully assigned kube-system/aws-node-jcrqf to admiralty-default-app-gke-cluster-1998d80ea2

Other pods are running on EKS nodes. The "pending" and "terminating" pods are scheduled on Admiralty virtual nodes.

kubectl -n kube-system get pods -o wide
NAME                                 READY   STATUS        RESTARTS   AGE     IP             NODE                                                    NOMINATED NODE   READINESS GATES
aws-node-jcrqf                       0/1     Pending       0          41m     <none>         admiralty-default-app-gke-cluster-1998d80ea2            <none>           <none>
aws-node-lbsjj                       1/1     Running       0          3d      172.16.2.182   ip-172-16-2-182.us-west-2.compute.internal              <none>           <none>
aws-node-mlvh4                       1/1     Running       0          2d23h   172.16.2.226   ip-172-16-2-226.us-west-2.compute.internal              <none>           <none>
aws-node-t8nb2                       0/1     Terminating   0          4d4h    <none>         admiralty-default-multi-cloud-test-cluster-c962bad2df   <none>           <none>

Double check that the kube-system namespace does not have the Admiralty label:

kubectl get ns kube-system --show-labels
NAME          STATUS   AGE   LABELS
kube-system   Active   11d   kubernetes.io/metadata.name=kube-system
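
Also double-check which nodes are the virtual ones. I'm assuming here that Admiralty labels its virtual nodes with virtual-kubelet.io/provider=admiralty; if that selector returns nothing, --show-labels on one of the admiralty-* nodes will show what is actually set.

kubectl get nodes -l virtual-kubelet.io/provider=admiralty
# or inspect one of them directly:
kubectl get node admiralty-default-app-gke-cluster-1998d80ea2 --show-labels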
matt-slalom commented

It looks like the virtual nodes have taints, but the aws-node daemonset is ignoring them.

kubectl describe node admiralty-default-multi-cloud-test-cluster-c962bad2df | grep Taints
virtual-kubelet.io/provider=admiralty:NoSchedule
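
That would make sense if the daemonset carries a blanket toleration. Checking what the EKS-managed manifest actually contains (in Kubernetes, a toleration with operator: Exists and no key matches every taint, so the NoSchedule taint on the virtual nodes wouldn't keep these pods off):

kubectl -n kube-system get daemonset aws-node -o jsonpath='{.spec.template.spec.tolerations}'
# a bare entry with no key, e.g. {"operator":"Exists"}, tolerates all taints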

matt-slalom commented Jan 10, 2023

I added a label to the EKS nodes and patched the aws-node daemonset to use a node selector, but with only partial success. EKS no longer tries to deploy to the remote cluster's virtual node, but unfortunately the local cluster's Admiralty virtual node also picked up the label, so aws-node is still trying to deploy to that Admiralty node.
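
A variation I may try instead of a positive node selector: a required node affinity that excludes anything carrying the virtual-kubelet provider label, so the daemonset only considers real EKS nodes. This is only a rough sketch; it assumes the virtual nodes carry a virtual-kubelet.io/provider label (matching the taint key above), and EKS may revert changes to aws-node on the next VPC CNI add-on update.

# aws-node-affinity-patch.yaml (hypothetical file name)
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: virtual-kubelet.io/provider
                operator: DoesNotExist

kubectl -n kube-system patch daemonset aws-node --patch-file aws-node-affinity-patch.yaml

If the label key is different in a given Admiralty version, kubectl get node <virtual-node-name> --show-labels shows what there is to match on.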

adrienjt (Contributor) commented

You can exclude labels from being picked up, cf. #115.

matt-slalom commented Jan 16, 2023

Thanks for the recommendation @adrienjt. I should point out that my attempts to tweak aws-node are really kind of hacky since aws-node is supplied by EKS and isn't something I control (though I can obviously make changes to it).

Maybe I'm misunderstanding, but I think this conflict between Admiralty and EKS is something Admiralty needs to address, even if it's in documentation. Altering the aws-node daemonset might ultimately be part of the solution, but it seems to me it should be part of the Admiralty install.

Or am I missing something?
