
Cluster Autoscaler Does Not Respect TopologySpread maxSkew=1 on Scale Down #6984

Open
greatshane opened this issue Jun 27, 2024 · 1 comment
Labels
area/cluster-autoscaler, kind/bug

Comments

greatshane commented Jun 27, 2024

Which component are you using?:

cluster-autoscaler

What version of the component are you using?:

Component version:
amazonaws.com/cluster-autoscaler:v1.28.0

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Server Version: v1.28.9-eks-036c24b

What environment is this in?:

AWS EKS

What did you expect to happen?:

I expected the cluster-autoscaler to recognize that it could scale down a node and move a coredns pod to another node while still honoring maxSkew=1.

The current configuration is a coredns Deployment with 5 replicas and a TopologySpreadConstraint where topologyKey = "kubernetes.io/hostname", labelSelector = {"matchLabels":{"k8s-app":"kube-dns"}}, whenUnsatisfiable = "DoNotSchedule", and maxSkew = 1.

If we currently have 5 nodes and 5 coredns pods like:

1 1 1 1 1

Then, after some time, the cluster-autoscaler determines we no longer need 5 nodes for the workload and can scale down to 4 to save money. What I want to happen is something like:

1 1 1 1 1 -----> 2 1 1 1

This should still be a valid configuration for maxSkew=1.
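
(For reference, skew is calculated per node as the number of matching pods on that node minus the global minimum across nodes, so the worst case in this layout is 2 - 1 = 1, which still satisfies maxSkew = 1.)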

What happened instead?:

During the cluster-autoscaler scale-down simulation (in the exact scenario described above), the logs show the failure below. The autoscaler is unable to scale down the node even though maxSkew=1 would still be honored after the node is deleted. My guess is that the cluster-autoscaler includes the node it wants to remove when calculating skew, so skew = 2 - 0 (the global minimum, since the node to be deleted will have no coredns pods) = 2 > 1 = maxSkew. It therefore claims it cannot put the pod on any node, effectively treating the topologySpreadConstraint like a podAntiAffinity rule.

19:56:09.838749       1 cluster.go:155] ip-10-177-149-54.ec2.internal for removal
19:56:09.839185       1 klogx.go:87] failed to find place for kube-system/coredns-568: cannot put pod coredns-568 on any node
19:56:09.839209       1 cluster.go:175] node ip-10-177-149-54.ec2.internal is not suitable for removal: can reschedule only 0 out of 1 pods

When I increase maxSkew to 2, the cluster-autoscaler is able to scale down the unneeded nodes and honors maxSkew=2. The issue only seems to occur when maxSkew = 1. In other circumstances, such as having no TopologySpreadConstraints at all, the cluster-autoscaler is also able to move coredns pods to other nodes and scale down.
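
For completeness, the workaround constraint is identical except for the relaxed limit; it sidesteps the problem only by weakening the spread guarantee:

  topologySpreadConstraints:
    - maxSkew: 2
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          k8s-app: kube-dns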

How to reproduce it (as minimally and precisely as possible):

  1. Scale up the nodes to above the normal amount, for example with a throwaway workload like the one sketched after this list.
  2. Have a Deployment with replicas >= nodeCount and topologySpreadConstraints similar to the config below, making sure maxSkew=1.
  3. Remove whatever was used to scale up the nodes (if applicable) and watch the cluster-autoscaler try, but fail, to scale down the unneeded nodes.
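
A minimal sketch of a throwaway workload for step 1 (the name, image, replica count, and resource requests here are illustrative placeholders, not what I actually ran); deleting it again corresponds to step 3:

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: scale-up-filler            # hypothetical name; any disposable workload works
  spec:
    replicas: 10                     # sized so the pods cannot all fit on the existing nodes
    selector:
      matchLabels:
        app: scale-up-filler
    template:
      metadata:
        labels:
          app: scale-up-filler
      spec:
        containers:
          - name: pause
            image: registry.k8s.io/pause:3.9
            resources:
              requests:
                cpu: "1"             # large enough to force the cluster-autoscaler to add nodes
                memory: 512Mi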

Anything else we need to know?:

Deployment config (labelSelector specific to coredns):


  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          k8s-app: kube-dns
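
For step 2, a minimal sketch of where this constraint sits inside the Deployment spec (everything apart from the constraint itself, such as the name and image tag, is a placeholder; the real EKS coredns Deployment carries additional settings):

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: coredns
    namespace: kube-system
  spec:
    replicas: 5
    selector:
      matchLabels:
        k8s-app: kube-dns
    template:
      metadata:
        labels:
          k8s-app: kube-dns
      spec:
        topologySpreadConstraints:
          - maxSkew: 1
            topologyKey: kubernetes.io/hostname
            whenUnsatisfiable: DoNotSchedule
            labelSelector:
              matchLabels:
                k8s-app: kube-dns
        containers:
          - name: coredns
            image: registry.k8s.io/coredns/coredns:v1.10.1   # placeholder tag; EKS ships its own coredns image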
@greatshane added the kind/bug label on Jun 27, 2024
adrianmoisey (Member) commented:

/area cluster-autoscaler

@greatshane changed the title from "Cluster Autoscaler not Respecting TopologySpread maxSkew=1 on Scale Down" to "Cluster Autoscaler Does Not Respect TopologySpread maxSkew=1 on Scale Down" on Jul 24, 2024