
AWS EKS autodiscover autoscaling Pod scheduling failed #1649

Closed
khteh opened this issue Feb 4, 2019 · 21 comments
Labels
area/cluster-autoscaler area/provider/aws Issues or PRs related to aws provider

Comments


khteh commented Feb 4, 2019

Failed scheduling pod. Any advice and insight is appreciated.

I0204 07:31:14.597199       1 scheduler_binder.go:338] All volumes for Pod "default/iconverse-elasticsearch-0" match with Node "template-node-for-iconverse-worker-nodes-NodeGroup-1B0XF3VJS2O0J-1244933151533827896"
I0204 07:31:14.597296       1 scale_up.go:152] Scale-up predicate failed: NoVolumeZoneConflict predicate mismatch, cannot put default/iconverse-elasticsearch-0 on template-node-for-iconverse-worker-nodes-NodeGroup-1B0XF3VJS2O0J-1244933151533827896, reason: node(s) had no available volume zone
I0204 07:31:14.597335       1 scheduler_binder.go:338] All volumes for Pod "default/iconverse-mysql-1" match with Node "template-node-for-iconverse-worker-nodes-NodeGroup-1B0XF3VJS2O0J-1244933151533827896"
I0204 07:31:14.597434       1 scale_up.go:152] Scale-up predicate failed: NoVolumeZoneConflict predicate mismatch, cannot put default/iconverse-mysql-1 on template-node-for-iconverse-worker-nodes-NodeGroup-1B0XF3VJS2O0J-1244933151533827896, reason: node(s) had no available volume zone
I0204 07:31:14.597456       1 scale_up.go:181] No pod can fit to iconverse-worker-nodes-NodeGroup-1B0XF3VJS2O0J
I0204 07:31:14.597468       1 scale_up.go:186] No expansion options
I0204 07:31:14.597516       1 static_autoscaler.go:322] Calculating unneeded nodes
I0204 07:31:14.597863       1 factory.go:33] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"iconverse-elasticsearch-0", UID:"11cdf1a9-2849-11e9-b3fc-06f1be837e28", APIVersion:"v1", ResourceVersion:"6211123", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added)
I0204 07:31:14.597881       1 factory.go:33] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"iconverse-mysql-1", UID:"11e05b55-2849-11e9-b3fc-06f1be837e28", APIVersion:"v1", ResourceVersion:"6211125", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added)
I0204 07:31:14.603999       1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0204 07:31:14.685488       1 scale_down.go:175] Scale-down calculation: ignoring 2 nodes, that were unremovable in the last 5m0s
I0204 07:31:14.685582       1 static_autoscaler.go:352] Scale down status: unneededOnly=false lastScaleUpTime=2019-02-04 04:01:00.233513434 +0000 UTC lastScaleDownDeleteTime=2019-02-04 06:49:34.102920447 +0000 UTC lastScaleDownFailTime=2019-02-04 03:24:50.790531631 +0000 UTC schedulablePodsPresent=false isDeleteInProgress=false
I0204 07:31:14.685618       1 static_autoscaler.go:355] Starting scale down
I0204 07:31:14.825312       1 scale_down.go:446] No candidates for scale down
I0204 07:31:16.615359       1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0204 07:31:18.626796       1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0204 07:31:20.646423       1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0204 07:31:22.658910       1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0204 07:31:24.670532       1 leaderelection.go:199] successfully renewed lease kube-system/cluster-autoscaler
I0204 07:31:24.840315       1 static_autoscaler.go:114] Starting main loop
I0204 07:31:25.076214       1 utils.go:456] No pod using affinity / antiaffinity found in cluster, disabling affinity predicate for this loop
I0204 07:31:25.076235       1 static_autoscaler.go:263] Filtering out schedulables
I0204 07:31:25.076343       1 scheduler_binder.go:338] All volumes for Pod "default/iconverse-elasticsearch-0" match with Node "ip-10-0-2-236.ap-southeast-1.compute.internal"
I0204 07:31:25.076398       1 scheduler_binder.go:338] All volumes for Pod "default/iconverse-mysql-1" match with Node "ip-10-0-2-236.ap-southeast-1.compute.internal"
I0204 07:31:25.076433       1 static_autoscaler.go:273] No schedulable pods
I0204 07:31:25.076446       1 scale_up.go:59] Pod default/iconverse-elasticsearch-0 is unschedulable
I0204 07:31:25.076451       1 scale_up.go:59] Pod default/iconverse-mysql-1 is unschedulable
I0204 07:31:25.193514       1 scale_up.go:92] Upcoming 0 nodes
I0204 07:31:25.337999       1 scheduler_binder.go:338] All volumes for Pod "default/iconverse-elasticsearch-0" match with Node "template-node-for-iconverse-worker-nodes-NodeGroup-1B0XF3VJS2O0J-4078860463823378974"
I0204 07:31:25.338056       1 scale_up.go:152] Scale-up predicate failed: NoVolumeZoneConflict predicate mismatch, cannot put default/iconverse-elasticsearch-0 on template-node-for-iconverse-worker-nodes-NodeGroup-1B0XF3VJS2O0J-4078860463823378974, reason: node(s) had no available volume zone
I0204 07:31:25.338073       1 scheduler_binder.go:338] All volumes for Pod "default/iconverse-mysql-1" match with Node "template-node-for-iconverse-worker-nodes-NodeGroup-1B0XF3VJS2O0J-4078860463823378974"
I0204 07:31:25.338105       1 scale_up.go:152] Scale-up predicate failed: NoVolumeZoneConflict predicate mismatch, cannot put default/iconverse-mysql-1 on template-node-for-iconverse-worker-nodes-NodeGroup-1B0XF3VJS2O0J-4078860463823378974, reason: node(s) had no available volume zone
I0204 07:31:25.338118       1 scale_up.go:181] No pod can fit to iconverse-worker-nodes-NodeGroup-1B0XF3VJS2O0J
I0204 07:31:25.338130       1 scale_up.go:186] No expansion options
I0204 07:31:25.338177       1 static_autoscaler.go:322] Calculating unneeded nodes
I0204 07:31:25.338369       1 factory.go:33] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"iconverse-elasticsearch-0", UID:"11cdf1a9-2849-11e9-b3fc-06f1be837e28", APIVersion:"v1", ResourceVersion:"6211123", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added)
Member

bskiba commented Feb 4, 2019

Which Cluster Autoscaler version?
Can you share pod spec?
Does the cluster span multiple zones?

@aleksandra-malinowska

@bskiba bskiba added area/cluster-autoscaler area/provider/aws Issues or PRs related to aws provider labels Feb 4, 2019
Author

khteh commented Feb 4, 2019

https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml

which pod? My cluster is multi-zone.

Name:               cluster-autoscaler-59999c5dfd-h92xw
Namespace:          kube-system
Priority:           0
PriorityClassName:  <none>
Node:               ip-10-0-1-92.ap-southeast-1.compute.internal/10.0.1.92
Start Time:         Mon, 04 Feb 2019 11:24:26 +0800
Labels:             app=cluster-autoscaler
                    pod-template-hash=1555571898
Annotations:        <none>
Status:             Running
IP:                 10.0.1.98
Controlled By:      ReplicaSet/cluster-autoscaler-59999c5dfd
Containers:
  cluster-autoscaler:
    Container ID:  docker://e308f2e8a202f54d00c7d992f3b3b21aa6f75962dc90ed569a1290829a84c232
    Image:         k8s.gcr.io/cluster-autoscaler:v1.2.2
    Image ID:      docker-pullable://k8s.gcr.io/cluster-autoscaler@sha256:36a369ca4643542d501bce0addf8b903f2141ae9e2608662b77a3d24f01d7780
    Port:          <none>
    Host Port:     <none>
    Command:
      ./cluster-autoscaler
      --v=4
      --stderrthreshold=info
      --cloud-provider=aws
      --skip-nodes-with-local-storage=false
      --expander=least-waste
      --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/iconverse
    State:          Running
      Started:      Mon, 04 Feb 2019 11:24:27 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  300Mi
    Requests:
      cpu:     100m
      memory:  300Mi
    Environment:
      AWS_REGION:  ap-southeast-1
    Mounts:
      /etc/ssl/certs/ca-bundle.crt from ssl-certs (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from cluster-autoscaler-token-6sqjz (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  ssl-certs:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/ssl/certs/ca-bundle.crt
    HostPathType:  
  cluster-autoscaler-token-6sqjz:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cluster-autoscaler-token-6sqjz
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

@aleksandra-malinowska
Contributor

I0204 07:31:25.338105 1 scale_up.go:152] Scale-up predicate failed: NoVolumeZoneConflict predicate mismatch, cannot put default/iconverse-mysql-1 on template-node-for-iconverse-worker-nodes-NodeGroup-1B0XF3VJS2O0J-4078860463823378974, reason: node(s) had no available volume zone

This line suggests you may have a multi-zone ASG. If a pod is requesting a volume in zone A, but CA uses a template node from zone B in the same group, this predicate fails.

Basically, multi-zone instance groups are incompatible with the basic assumption that all nodes in a node group are identical. It's also not simply a matter of working around this assumption: if the Cluster Autoscaler can't control which zone a new instance will be created in, it can't effectively accommodate such a pod. To avoid this, please set up multiple single-zone node groups.
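For anyone landing here later, a rough sketch of what "multiple single-zone node groups" looks like with the auto-discovery flags used in this thread. The ASG names are hypothetical; substitute your own, one per AZ:

```shell
# Tag each single-zone ASG so the autoscaler's
# --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/iconverse
# setting discovers it. ASG names below are placeholders.
for asg in workers-1a workers-1b workers-1c; do
  aws autoscaling create-or-update-tags --tags \
    "ResourceId=${asg},ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true" \
    "ResourceId=${asg},ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/iconverse,Value=true,PropagateAtLaunch=true"
done
```

Each ASG should be pinned to a single subnet/AZ; the autoscaler then treats each one as an independent node group and can scale the right zone for a zonal volume.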

Author

khteh commented Feb 4, 2019

" if the Cluster Autoscaler can't control in which zone the new instance will be created" - what prevents the CA from controlling it then? My cluster has 3 availability zones. Do you have anything that works with multi-zone? After all, this is required for high availability.

@aleksandra-malinowska
Contributor

Do you have (1) a single ASG that spans multiple zones, or (2) multiple ASGs, each in one of the zones? Cluster Autoscaler supports a high-availability setup via (2). If you have (1), only the ASG controls where it starts a new instance; as far as I understand, there's no API that would let the user choose a zone for a new instance in this case.

Author

khteh commented Feb 4, 2019

I have (1). Does it mean that I cannot use the CA in this case? Thanks.

@aleksandra-malinowska
Contributor

It won't work as expected, especially if you're using features that require zone-aware scheduling. I would suggest migrating to (2) if possible.

Author

khteh commented Feb 4, 2019

"there's no API that would allow the user to choose a zone for a new instance in this case." - which cloud provider does the cluster autoscaler currently work with seamlessly in a multi-zone setup?

@aleksandra-malinowska
Contributor

That depends what you call "seamless"; for example, on GKE the multi-zonal node-pool is implemented as multiple single-zone instance groups, which are treated as independent node groups. This is essentially the same setup as (2).

You're not the first person to question this setup; it comes up every couple of months, at least since kubernetes-retired/contrib#1552 (comment). Without the ability to provision an instance in a specific zone, there's really not much the autoscaler can do. Even if it kept adding instances until one happened to land in the right zone, the same problem occurs on scale-down: delete too many empty nodes from one zone, and the ASG will "rebalance" itself, potentially breaking the setup.

Author

khteh commented Feb 4, 2019

Is there something we can do to get the cloud providers to offer such an API?

Author

khteh commented Feb 4, 2019

Is it possible to achieve (2) on AWS EKS?

@aleksandra-malinowska
Contributor

There are definitely users with this setup on AWS. I'm not sure how EKS provisions multi-zone node groups, so I can't answer here. cc @Jeffwan

Contributor

Jeffwan commented Feb 4, 2019

@khteh I'm not sure how you provisioned your EKS cluster.

If you use eksctl:

  1. You can add --node-zones to make the ASG only bring up nodes in a single availability zone:
     eksctl create cluster --name=sample-cluster --nodes=2 --node-type=m4.xlarge --ssh-access --region=us-west-2 --node-zones=us-west-2a
  2. If you need more ASGs, check here to add more nodegroups.
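A sketch of step 2 for a per-AZ setup, assuming a recent eksctl where `eksctl create nodegroup` accepts the same node flags as `create cluster` (check `eksctl create nodegroup --help`; names below are placeholders):

```shell
# One nodegroup per AZ, each backed by its own single-zone ASG.
eksctl create nodegroup --cluster=sample-cluster --name=ng-us-west-2a \
  --nodes=2 --node-type=m4.xlarge --node-zones=us-west-2a
eksctl create nodegroup --cluster=sample-cluster --name=ng-us-west-2b \
  --nodes=2 --node-type=m4.xlarge --node-zones=us-west-2b
```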

Author

khteh commented Feb 5, 2019

@Jeffwan, I use CloudFormation with AWS Console.

Contributor

Jeffwan commented Feb 5, 2019

Hi @khteh, then you need to make changes in CloudFormation: use a single AZ before you create the node group.

[screenshot attached]

Author

khteh commented Feb 5, 2019

I have deleted my cluster and recreated it using eksctl. How do I create a new ASG in a different availability zone?

Author

khteh commented Feb 5, 2019

And how do I find out which AZ each ASG runs in? eksctl get nodegroup --clustername=<foo> doesn't give much information.

Contributor

Jeffwan commented Feb 5, 2019

@khteh Can you follow the instructions I attached? They have all the information you need.

To get the AZ info for each ASG, I would recommend the AWS console.
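If you prefer the CLI to the console, this should also show which AZs each ASG spans (standard AWS CLI, nothing cluster-specific assumed):

```shell
# List every ASG in the region with the availability zones it covers.
aws autoscaling describe-auto-scaling-groups \
  --query 'AutoScalingGroups[].[AutoScalingGroupName,AvailabilityZones]' \
  --output table
```

A single-zone node group should show exactly one AZ in the second column.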

Author

khteh commented Feb 5, 2019

Let me rephrase my question: "How do I create a new ASG in a specific availability zone?" eksctl doesn't seem to have a command-line option for the AZ.

Contributor

Jeffwan commented Feb 5, 2019

@khteh

I think this is what you want; just add the --node-zones option:

eksctl create cluster --name=sample-cluster --nodes=2 --node-type=m4.xlarge --ssh-access --region=us-west-2 --node-zones=us-west-2a

Author

khteh commented Feb 5, 2019

It works now. Thanks!

@khteh khteh closed this as completed Feb 5, 2019