What k8s version are you using (kubectl version)?:
$ kubectl version
Server Version: v1.28.9-eks-036c24b
What environment is this in?:
AWS EKS
What did you expect to happen?:
Expected the cluster-autoscaler to recognize it could scale down a node and move a coredns pod to another node while still honoring maxSkew=1.
The current configuration has a coredns Deployment with 5 replicas and a topologySpreadConstraint with topologyKey = "kubernetes.io/hostname", labelSelector = {"matchLabels":{"k8s-app":"kube-dns"}}, whenUnsatisfiable = "DoNotSchedule", and maxSkew = 1.
If we currently have 5 nodes and 5 coredns pods like:
1 1 1 1 1
Then, after some time, the cluster autoscaler determines we no longer need 5 nodes for the workload and can scale down to 4 to save money. What I want to happen is something like:
1 1 1 1 1 -----> 2 1 1 1
This should still be a valid configuration for maxSkew=1.
What happened instead?:
During the cluster-autoscaler scale-down simulation (in the exact scenario described above), the cluster-autoscaler logs show the failure below. It is unable to scale down the node even though maxSkew=1 would still be honored after the node is deleted. My guess is that the cluster-autoscaler includes the node it wants to remove when calculating skew, so skew = 2 - 0 (the global minimum, since the node to be deleted will have no coredns pods) = 2 > 1 = maxSkew. It therefore claims it cannot put the pod on any node, effectively treating the topologySpreadConstraint like a podAntiAffinity rule.
19:56:09.838749 1 cluster.go:155] ip-10-177-149-54.ec2.internal for removal
19:56:09.839185 1 klogx.go:87] failed to find place for kube-system/coredns-568: cannot put pod coredns-568 on any node
19:56:09.839209 1 cluster.go:175] node ip-10-177-149-54.ec2.internal is not suitable for removal: can reschedule only 0 out of 1 pods
When I increase maxSkew to 2, the cluster-autoscaler is able to scale down the unneeded nodes, and it honors maxSkew=2. The issue only seems to occur when maxSkew = 1. In other circumstances, such as with no topologySpreadConstraints at all, the cluster-autoscaler is also able to move coredns pods to other nodes and scale down.
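The skew arithmetic suspected above can be sketched in a few lines of Python (illustrative only; this is not the actual cluster-autoscaler code):

```python
# For a topologySpreadConstraint with topologyKey = kubernetes.io/hostname,
# skew = (max pods in any domain) - (global minimum across domains).
def skew(pods_per_node):
    return max(pods_per_node) - min(pods_per_node)

# Desired end state: drain one of the 5 nodes and let its coredns pod
# land on a surviving node. Skew is 1, which satisfies maxSkew=1.
print(skew([2, 1, 1, 1]))      # 1

# If the simulation still counts the drained node as a domain with 0 pods,
# the global minimum drops to 0 and the computed skew becomes 2 > maxSkew=1,
# so the rescheduling is rejected.
print(skew([2, 1, 1, 1, 0]))   # 2
```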
How to reproduce it (as minimally and precisely as possible):
1. Scale up nodes above the normal amount.
2. Create a Deployment with replicas >= nodeCount and topologySpreadConstraints similar to the config below, making sure maxSkew=1.
3. Remove whatever was used to scale up the nodes and watch the cluster-autoscaler try, but fail, to scale down the unneeded nodes.
Anything else we need to know?:
Deployment Config (labelSelector specific for coredns)
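The contents of that collapsed config block were not captured in this text. A minimal sketch consistent with the values described above (5 replicas, hostname topology key, kube-dns label selector, DoNotSchedule, maxSkew=1) would look like:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
spec:
  replicas: 5
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              k8s-app: kube-dns
      containers:
        - name: coredns
          # image is a placeholder; the actual coredns image and the rest
          # of the container spec are omitted from this sketch
          image: registry.k8s.io/coredns/coredns
```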
greatshane changed the title from "Cluster Autoscaler not Respecting TopologySpread maxSkew=1 on Scale Down" to "Cluster Autoscaler Does Not Respect TopologySpread maxSkew=1 on Scale Down" on Jul 24, 2024.
Which component are you using?:
cluster-autoscaler
What version of the component are you using?:
Component version:
amazonaws.com/cluster-autoscaler:v1.28.0