Cluster Autoscaler not working on the new AL2023 EKS optimised AMI #6963
/area cluster-autoscaler
Hi @adrianmoisey and @ashishrajora0808, do you have a solution for the above issue?
I haven't found any resolution to this. I'm working with AWS support, but no luck yet. Are you facing the same issue, @Arulaln-AR?
@ashishrajora0808, yes, I am also working on the same issue with AWS support. I do know of another working solution, described in the article below: attach an IAM role to a service account via an annotation, and reference that service account from the pod. https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html But I would rather not go that route either.
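For reference, the IRSA approach in that article amounts to annotating the cluster autoscaler's service account with a dedicated IAM role. A minimal sketch, assuming a hypothetical role name and account ID (the role must trust the cluster's OIDC provider):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  annotations:
    # Placeholder ARN: substitute your account ID and an IAM role
    # that trusts the cluster's OIDC identity provider
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/cluster-autoscaler-irsa
```

With this annotation in place, the pod obtains credentials via the projected web identity token rather than the node's instance metadata, which is why it sidesteps the IMDS problem entirely.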
@ashishrajora0808, if you don't want to follow the other solution provided above, simply change the node group AMI type from "AL2023_x86_64_STANDARD" back to "AL2_x86_64".
@Arulaln-AR Thanks, but that just reverts to the AL2 AMI. I want to use the new AL2023 AMI, as AL2 reaches end of life next year.
@Arulaln-AR The issue seems to be related to the node config, specifically the apiVersion: node.eks.aws/v1alpha1 block.
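For context, AL2023 nodes are bootstrapped by nodeadm from a NodeConfig document with that apiVersion. A minimal sketch, with the cluster name, endpoint, CA data, and CIDR as placeholders:

```yaml
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name: my-cluster                                               # placeholder
    apiServerEndpoint: https://EXAMPLE.gr7.us-west-2.eks.amazonaws.com  # placeholder
    certificateAuthority: <base64-encoded cluster CA>              # placeholder
    cidr: 172.20.0.0/16                                            # placeholder
```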
Not autoscaler related, so closing the case.
@ashishrajora0808, I heard back from AWS support: it is because of the IMDS token hop limit. It is set to 1 on AL2023, but on AL2 it was set to 2. We need to customize the launch template to make it work.
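Concretely, that means raising the IMDSv2 hop limit in the node group's launch template so pods using the node's instance role can reach the metadata service. A sketch of the relevant LaunchTemplateData fragment (e.g. for a new launch template version); values are illustrative:

```json
{
  "MetadataOptions": {
    "HttpTokens": "required",
    "HttpPutResponseHopLimit": 2
  }
}
```

A hop limit of 2 allows the extra network hop introduced by pods running in their own network namespace; with the AL2023 default of 1, the IMDSv2 response never reaches the pod, and the SDK's credential lookup times out.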
Which component are you using?: cluster-autoscaler
What version of the component are you using?:
Component version: v1.29.0 (registry.k8s.io/autoscaling/cluster-autoscaler:v1.29.0)
What k8s version are you using (kubectl version)?: 1.30
What environment is this in?: AWS
What did you expect to happen?: On trialling the Amazon Linux 2023 EKS optimised AMI, I just expected things to work, as the worker nodes in EKS have all the desired permissions for the cluster autoscaler to communicate with the ASG.
What happened instead?:
I am getting errors on startup of the cluster autoscaler which point to some sort of credentials or networking issue.
How to reproduce it (as minimally and precisely as possible):
I0621 15:22:13.971945       1 aws_manager.go:79] AWS SDK Version: 1.48.7
I0621 15:22:13.972068       1 auto_scaling_groups.go:396] Regenerating instance to ASG map for ASG names: []
I0621 15:22:13.972083       1 auto_scaling_groups.go:403] Regenerating instance to ASG map for ASG tags: map[k8s.io/cluster-autoscaler/enabled: k8s.io/cluster-autoscaler/qa-ore-blue:]
E0621 15:24:14.262752       1 aws_manager.go:128] Failed to regenerate ASG cache: RequestError: send request failed
caused by: Post "https://autoscaling.us-west-2.amazonaws.com/": dial tcp: lookup autoscaling.us-west-2.amazonaws.com: i/o timeout
F0621 15:24:14.262782       1 aws_cloud_provider.go:460] Failed to create AWS Manager: RequestError: send request failed
caused by: Post "https://autoscaling.us-west-2.amazonaws.com/": dial tcp: lookup autoscaling.us-west-2.amazonaws.com: i/o timeout
Anything else we need to know?: I updated the AWS VPC CNI plugin as part of the investigation, but it did not help:
amazon-k8s-cni-init:v1.18.2
The cluster autoscaler service account for the new EKS AL2023 AMI looks like it is not loading secrets. Not sure if this is the cause:
Name:                cluster-autoscaler-aws-cluster-autoscaler
Namespace:           kube-system
Labels:              app.kubernetes.io/instance=cluster-autoscaler
                     app.kubernetes.io/managed-by=Helm
                     app.kubernetes.io/name=aws-cluster-autoscaler
                     app.kubernetes.io/version=1.29.0
                     helm.sh/chart=cluster-autoscaler-9.35.0
Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::123456789987:role/*-eks-worker-role-ore
                     meta.helm.sh/release-name: cluster-autoscaler
                     meta.helm.sh/release-namespace: kube-system
Image pull secrets:
Mountable secrets:
Tokens:
Events: