AWS EKS autodiscover autoscaling Pod scheduling failed #1649

Failed scheduling pod. Any advice and insight is appreciated.
Which Cluster Autoscaler version?
Which pod? My cluster is multi-zone.
This line suggests you may have a multi-zone ASG. If a pod is requesting a volume in zone A, but CA uses a template node from zone B in the same group, this predicate fails. Basically, multi-zone instance groups are incompatible with the basic assumption that all nodes in a node group will be identical. It's also not simply a matter of working around this assumption: if the Cluster Autoscaler can't control in which zone the new instance will be created, it can't effectively accommodate such a pod. To avoid this, please set up multiple single-zone node groups.
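For illustration only, here is a minimal sketch (hypothetical names, IDs, and zone) of the kind of zone-bound PersistentVolume that trips this predicate: a pod bound to this claim can only run in us-east-1a, while the template node for a multi-zone group may come from another zone. Newer clusters use the topology.kubernetes.io/zone label instead of the beta one.

```yaml
# Sketch of a zone-bound, EBS-backed PV. All names and the volume ID are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-pv
  labels:
    # Zone label checked by the volume-zone scheduling predicate.
    failure-domain.beta.kubernetes.io/zone: us-east-1a
spec:
  capacity:
    storage: 10Gi
  accessModes: ["ReadWriteOnce"]
  awsElasticBlockStore:
    volumeID: vol-0123456789abcdef0   # placeholder EBS volume ID
    fsType: ext4
```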
" if the Cluster Autoscaler can't control in which zone the new instance will be created" - what prevents the CA from controlling it then? My cluster has 3 availability zones. Do you have anything that works with multi-zone? After all, this is required for high availability. |
Do you have (1) a single ASG that spans multiple zones, or (2) multiple ASGs, each in one of the zones? Cluster Autoscaler supports a high-availability setup via (2). If you have (1), only the ASG controls where it starts a new instance. As far as I understand, there's no API that would allow the user to choose a zone for a new instance in this case.
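As a hedged sketch of setup (2), not taken from this thread: the Cluster Autoscaler is pointed at one single-AZ ASG per `--nodes` flag. The ASG names, min:max bounds, and image tag below are placeholders.

```yaml
# Fragment of a cluster-autoscaler Deployment spec (sketch): one single-AZ ASG
# registered per --nodes flag, with balancing of similar groups enabled.
containers:
  - name: cluster-autoscaler
    image: k8s.gcr.io/cluster-autoscaler:v1.3.9   # pick the release matching your cluster version
    command:
      - ./cluster-autoscaler
      - --cloud-provider=aws
      - --nodes=1:10:eks-workers-us-east-1a   # placeholder ASG names, one per AZ
      - --nodes=1:10:eks-workers-us-east-1b
      - --nodes=1:10:eks-workers-us-east-1c
      - --balance-similar-node-groups=true
```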
I have (1). Does it mean that I cannot use the CA in this case? Thanks.
It won't work as expected, especially if you're using features that require zone-aware scheduling. I would suggest migrating to (2) if possible.
"there's no API that would allow the user to choose a zone for a new instance in this case." - with which cloud provider does the Cluster Autoscaler currently work seamlessly for multi-zone?
That depends what you call "seamless"; for example, on GKE the multi-zonal node pool is implemented as multiple single-zone instance groups, which are treated as independent node groups. This is essentially the same setup as (2). You're not the first person questioning this setup; it comes up every couple of months, at least since kubernetes-retired/contrib#1552 (comment). Without the ability to provision an instance in a specific zone, there's really not much the autoscaler can do. Even if it kept adding instances until one happened to be in the right zone, the same problem occurs with scale down: delete too many empty nodes from one zone, and the ASG will "rebalance" itself, potentially breaking the setup.
Is there something we can do to get the cloud providers to provide such an API?
Is it possible to achieve (2) on AWS EKS?
There are definitely users with this setup on AWS. I'm not sure how EKS provisions multi-zone node groups, so I can't answer here. cc @Jeffwan
@khteh I'm not sure how you provisioned your EKS cluster. If you use eksctl, please follow the instructions I've attached.
@Jeffwan, I used CloudFormation with the AWS Console.
Hi @khteh, then you need to make changes in CloudFormation. Use a single AZ per ASG before you create the node group.
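For reference, a hedged sketch of what that change could look like in a node-group template: the resource name, sizes, and subnet ID are placeholders, and other required properties (launch configuration, tags, and so on) are omitted.

```yaml
# Sketch: pin the worker ASG to a single AZ by listing exactly one subnet.
NodeGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    MinSize: "1"
    MaxSize: "10"
    DesiredCapacity: "2"
    VPCZoneIdentifier:
      - subnet-0abc123de456f7890   # a subnet that lives in a single AZ
    # LaunchConfigurationName, Tags, etc. omitted for brevity
```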
I have deleted my cluster and created it using eksctl. How do I create a new ASG in a different availability zone?
And how do I find out which AZ each ASG runs in?
@khteh Can you follow the instructions I attached? They have all the information you need. To get the AZ info for each ASG, I would recommend going to the AWS console.
Let me correct my question: "How do I create a new ASG in a specific availability zone?" eksctl does not have a command-line option for the AZ.
I think this is what you want: just add the availability zone to each node group in the eksctl config file.
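As a hedged illustration of that kind of config, assuming eksctl's per-nodegroup availabilityZones field (the cluster name, region, instance type, and sizes below are placeholders):

```yaml
# Sketch of an eksctl ClusterConfig with one single-AZ node group per zone.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster        # placeholder cluster name
  region: us-east-1
nodeGroups:
  - name: ng-us-east-1a
    instanceType: m5.large
    desiredCapacity: 2
    availabilityZones: ["us-east-1a"]   # pin this group to one AZ
  - name: ng-us-east-1b
    instanceType: m5.large
    desiredCapacity: 2
    availabilityZones: ["us-east-1b"]
  - name: ng-us-east-1c
    instanceType: m5.large
    desiredCapacity: 2
    availabilityZones: ["us-east-1c"]
```

With a file like this saved as cluster.yaml, something along the lines of `eksctl create nodegroup --config-file=cluster.yaml` should create the per-zone groups, which the Cluster Autoscaler can then treat as independent node groups.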
It works now. Thanks! |