Planning upgrade to EKS 1.29 #6253

Closed
jackstockley89 opened this issue Oct 9, 2024 · 10 comments

Comments

@jackstockley89
Contributor

Go through the EKS and Kubernetes release notes for version 1.29 and create a plan to upgrade our clusters

Things to consider:

  • Review EKS / Kubernetes changelogs & release notes
  • EKS Module supported at target upgrade version?
  • Are there any API deprecations & removals? (Check EKS insights)
  • Are there new components being added?
  • What changes are being introduced to current components?
  • Are there changes to core infra of the CP required? i.e. Are all our current components compatible with 1.29?
  • Are there changes users need to make?
  • Do we need to expand any of our smoke/integration testing?
  • Create additional tickets needed for any findings specific to this upgrade

Cluster upgrade Runbook:
https://runbooks.cloud-platform.service.justice.gov.uk/upgrade-eks-cluster.html

Related to: #6252

@jackstockley89
Contributor Author

Changelog since v1.28.0

Urgent Upgrade Notes

(No, really, you MUST read this before you upgrade)

  • Stopped accepting component configuration for kube-proxy and kubelet during kubeadm upgrade plan --config. This was a legacy behavior that was not well supported for upgrades and could be used only at the plan stage to determine if the configuration for these components stored in the cluster needs manual version migration. In the future, kubeadm will attempt alternative component config migration approaches. (kubeadm: Remove the support of configurable component configs kubernetes/kubernetes#120788, @chendave)
  • kubeadm: a separate "super-admin.conf" file is now deployed. The User in admin.conf is now bound to a new RBAC Group kubeadm:cluster-admins that has cluster-admin ClusterRole access. The User in super-admin.conf is now bound to the system:masters built-in super-powers / break-glass Group that can bypass RBAC. Before this change, the default admin.conf was bound to system:masters Group, which was undesired. Executing kubeadm init phase kubeconfig all or just kubeadm init will now generate the new super-admin.conf file. The cluster admin can then decide to keep the file present on a node host or move it to a safe location. kubeadm certs renew will renew the certificate in super-admin.conf to one year if the file exists; if it does not exist a "MISSING" note will be printed. kubeadm upgrade apply for this release will migrate this particular node to the two file setup. Subsequent kubeadm releases will continue to optionally renew the certificate in super-admin.conf if the file exists on disk and if renew on upgrade is not disabled. kubeadm join --control-plane will now generate only an admin.conf file that has the less privileged User. (kubeadm: add support for separate super-admin.conf kubeconfig file kubernetes/kubernetes#121305, @neolit123)

Changes by Kind

Deprecation

@jackstockley89
Contributor Author

Kubernetes 1.29

Kubernetes 1.29 is now available in Amazon EKS. For more information about Kubernetes 1.29, see the official release announcement.

Important

  • The deprecated flowcontrol.apiserver.k8s.io/v1beta2 API version of FlowSchema and PriorityLevelConfiguration is no longer served in Kubernetes v1.29. If you have manifests or client software that use the deprecated beta API group, you should change these before you upgrade to v1.29.

  • The .status.kubeProxyVersion field for node objects is now deprecated, and the Kubernetes project is proposing to remove that field in a future release. The deprecated field is not accurate and has historically been managed by kubelet - which does not actually know the kube-proxy version, or even whether kube-proxy is running. If you’ve been using this field in client software, stop - the information isn’t reliable and the field is now deprecated.

  • In Kubernetes 1.29, to reduce the potential attack surface, the LegacyServiceAccountTokenCleanUp feature labels legacy auto-generated secret-based tokens as invalid if they have not been used for a long time (1 year by default), and automatically removes them if use is not attempted for a long time after being marked as invalid (1 additional year by default). To identify such tokens, you can run:

kubectl get cm kube-apiserver-legacy-service-account-token-tracking -n kube-system
For the complete Kubernetes 1.29 changelog, see https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.29.md#changelog-since-v1280.
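
The "one year, then one additional year" timeline for legacy tokens described above can be made concrete with a small date calculation. This is only a minimal sketch: it assumes both periods are exactly 365 days and that the clock starts at the token's last recorded use. `token_timeline` and `token_state` are hypothetical helper names, not part of any Kubernetes tooling, and the real controller's bookkeeping may differ.

```python
from datetime import date, timedelta

# Assumption: the default period quoted in the release notes, taken as 365 days.
CLEANUP_PERIOD = timedelta(days=365)


def token_timeline(last_used: date, period: timedelta = CLEANUP_PERIOD) -> dict:
    """Return the earliest dates a legacy token could be marked invalid and removed."""
    invalid_from = last_used + period      # unused for one period -> may be labelled invalid
    removed_from = invalid_from + period   # still unused one period later -> may be removed
    return {"invalid_from": invalid_from, "removed_from": removed_from}


def token_state(last_used: date, today: date) -> str:
    """Classify a token by where 'today' falls on its (hypothetical) timeline."""
    timeline = token_timeline(last_used)
    if today >= timeline["removed_from"]:
        return "eligible for removal"
    if today >= timeline["invalid_from"]:
        return "eligible to be marked invalid"
    return "still valid"
```

For example, `token_state(date(2023, 1, 1), date(2024, 10, 9))` classifies a token last used at the start of 2023 as already past the first period but not yet past the second.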

@Matt-Alinosn added this to the EKS: Upgrade to 1.29 milestone on Oct 21, 2024
@jackstockley89
Contributor Author

EKS Insights: Deprecated APIs removed in Kubernetes v1.29

Deprecation details: /apis/flowcontrol.apiserver.k8s.io/v1beta2/flowschemas
Replaced with: /apis/flowcontrol.apiserver.k8s.io/v1beta3/flowschemas

Deprecation details: /apis/flowcontrol.apiserver.k8s.io/v1beta2/prioritylevelconfigurations
Replaced with: /apis/flowcontrol.apiserver.k8s.io/v1beta3/prioritylevelconfigurations
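
One way to act on the EKS Insights finding above is to scan manifest text for the removed API version before upgrading. The following is a minimal sketch, assuming manifests are available as plain YAML text; `find_removed_api_versions` is a hypothetical helper, and the mapping contains only the group/version pair listed in the finding. It uses a simple regular expression rather than a YAML parser, so it will not catch every possible layout.

```python
import re

# Mapping taken from the EKS Insights finding above (removed -> replacement).
REMOVED_IN_1_29 = {
    "flowcontrol.apiserver.k8s.io/v1beta2": "flowcontrol.apiserver.k8s.io/v1beta3",
}

# Matches lines such as `apiVersion: group/version`, with optional quotes.
API_VERSION_RE = re.compile(r"^\s*apiVersion:\s*[\"']?([^\s\"'#]+)", re.MULTILINE)


def find_removed_api_versions(manifest_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, found_api_version, suggested_replacement) tuples."""
    findings = []
    for match in API_VERSION_RE.finditer(manifest_text):
        api_version = match.group(1)
        if api_version in REMOVED_IN_1_29:
            # Line number of the apiVersion value itself (1-indexed).
            line_number = manifest_text.count("\n", 0, match.start(1)) + 1
            findings.append((line_number, api_version, REMOVED_IN_1_29[api_version]))
    return findings


SAMPLE_MANIFEST = """\
apiVersion: flowcontrol.apiserver.k8s.io/v1beta2
kind: FlowSchema
metadata:
  name: example
---
apiVersion: apps/v1
kind: Deployment
"""

for line_number, found, replacement in find_removed_api_versions(SAMPLE_MANIFEST):
    print(f"line {line_number}: {found} -> {replacement}")
```

The sample manifest is invented for illustration; only the first document in it uses the removed version.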

@jackstockley89
Contributor Author

Addons:
vpc-cni
coredns
kube-proxy

@jackstockley89
Contributor Author

jackstockley89 commented Oct 23, 2024

https://marcincuber.medium.com/amazon-eks-upgrade-journey-from-1-28-to-1-29-say-hello-to-mandala-858ae0579f4f

A couple of enhancements worth a mention:

  • Sidecar containers reached beta in Kubernetes v1.29. KEP #753 has graduated to beta and the SidecarContainers feature gate is enabled by default. The feature lets an init container keep running until the pod terminates (by setting restartPolicy: Always on that init container), effectively turning it into a sidecar container. This solves the problem of managing long-running auxiliary processes that need to run alongside the main containers in a pod. For example, if a pod has a main application container and a logging container that collects and forwards logs from the main application, the logging container can be defined as a sidecar container so that it keeps collecting and forwarding logs for as long as the main application container is running.
  • KEP #2799 has graduated to beta and the LegacyServiceAccountTokenCleanUp feature gate is enabled by default. This feature allows automatic cleanup of unused legacy service account tokens that are secret-based. Specifically, it labels legacy auto-generated secret-based tokens as invalid if they have not been used for a long time (1 year by default), and automatically removes them if use is not attempted for a long time after being marked as invalid (1 additional year by default). To check whether you have such unused tokens, run the same command given in the EKS release notes:

kubectl get cm kube-apiserver-legacy-service-account-token-tracking -n kube-system

@jackstockley89
Contributor Author

EKS Module supported at target upgrade version?
There is nothing in the release documents to suggest 1.29 is incompatible

@jackstockley89
Contributor Author

jackstockley89 commented Oct 23, 2024

Cluster Autoscaler

Kubernetes Version   CA Version   Chart Version
1.29.X               1.29.X       9.35.0+
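
The compatibility row above boils down to two checks: the Cluster Autoscaler minor version should be 1.29, and the Helm chart should be at least 9.35.0. A minimal sketch of that check follows; `parse_version` and `autoscaler_ok_for_1_29` are hypothetical helper names, and the sketch assumes plain dotted numeric version strings.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.29.4' or 'v9.35.0' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


TARGET_MINOR = (1, 29)                        # Kubernetes / CA minor from the row above
MIN_CHART_VERSION = parse_version("9.35.0")   # minimum chart version from the row above


def autoscaler_ok_for_1_29(ca_version: str, chart_version: str) -> bool:
    """True when the CA minor is 1.29 and the chart is 9.35.0 or newer."""
    ca_minor_matches = parse_version(ca_version)[:2] == TARGET_MINOR
    chart_new_enough = parse_version(chart_version) >= MIN_CHART_VERSION
    return ca_minor_matches and chart_new_enough
```

Comparing tuples of integers avoids the string-comparison trap where "9.4.0" would sort after "9.35.0".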

Descheduler

Current Version: https://github.com/kubernetes-sigs/descheduler/tree/release-1.28/charts/descheduler

New Version for 1.29: https://github.com/kubernetes-sigs/descheduler/tree/release-1.29/charts/descheduler

Ingress Controller

https://github.com/kubernetes/ingress-nginx
Ingress-NGINX version   k8s supported version          Alpine Version   Nginx Version   Helm Chart Version   Notes
v1.8.4                  1.27, 1.26, 1.25, 1.24         3.18.2           1.21.6          4.7.*                Current
v1.9.6                  1.29, 1.28, 1.27, 1.26, 1.25   3.19.0           1.21.6          4.9.1                Minimum Suggested
v1.11.0                 1.30, 1.29, 1.28, 1.27, 1.26   3.20.0           1.25.5          4.11.0               Minimum Supported
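
To read the table above as a lookup, the sketch below encodes its three rows and returns the controller versions that list a given Kubernetes minor version as supported. The data is copied from the table; `controllers_supporting` and `controllers_supporting_all` are hypothetical helper names.

```python
# Rows copied from the support table above: controller version -> supported k8s minors.
INGRESS_NGINX_SUPPORT = {
    "v1.8.4": {"1.27", "1.26", "1.25", "1.24"},
    "v1.9.6": {"1.29", "1.28", "1.27", "1.26", "1.25"},
    "v1.11.0": {"1.30", "1.29", "1.28", "1.27", "1.26"},
}


def controllers_supporting(k8s_minor: str) -> list[str]:
    """Controller versions whose row lists the given Kubernetes minor version."""
    return [
        version
        for version, supported in INGRESS_NGINX_SUPPORT.items()
        if k8s_minor in supported
    ]


def controllers_supporting_all(k8s_minors: list[str]) -> list[str]:
    """Controller versions that list every one of the given minor versions."""
    return [
        version
        for version, supported in INGRESS_NGINX_SUPPORT.items()
        if set(k8s_minors) <= supported
    ]


print(controllers_supporting("1.29"))
print(controllers_supporting_all(["1.28", "1.29"]))
```

Checking both the current and the target minor version at once is useful because a controller release that lists both can, in principle, be rolled out before the cluster upgrade itself.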

Kuberhealthy

version 104 ?

@jackstockley89
Contributor Author

NAME                                             NAMESPACE                            KIND                  VERSION                     REPLACEMENT            DEPRECATED   DEPRECATED IN   REMOVED   REMOVED IN   REPL AVAIL   REPL AVAIL IN
c100-application-cronjob-payments-production     c100-application-production          CronJob               batch/v1beta1               batch/v1               true         v1.21.0         true      v1.25.0      true         v1.21.0
c100-application-cronjob-production              c100-application-production          CronJob               batch/v1beta1               batch/v1               true         v1.21.0         true      v1.25.0      true         v1.21.0
poms-ingress                                     polygraph-offender-management        Ingress               networking.k8s.io/v1beta1   networking.k8s.io/v1   true         v1.19.0         true      v1.22.0      true         v1.19.0
c100-application-pdb-production                  c100-application-production          PodDisruptionBudget   policy/v1beta1              policy/v1              true         v1.21.0         true      v1.25.0      true         v1.21.0
parliamentary-questions-duplicate                parliamentary-questions-production   Certificate           cert-manager.io/v1alpha3    cert-manager.io/v1     true         v1.4.0          false     v1.6.0       false
track-a-query-production-certificate-duplicate   track-a-query-production             Certificate           cert-manager.io/v1alpha3    cert-manager.io/v1     true         v1.4.0          false     v1.6.0       false

@jackstockley89 moved this from 👀 Review/QA to 🥇 Done in Cloud Platform on Oct 29, 2024
@jackstockley89 closed this as completed by moving to 🥇 Done in Cloud Platform on Oct 29, 2024
@jaskaransarkaria
Contributor

deprecated apis from kubent

__________________________________________________________________________________________
>>> Deprecated APIs removed in 1.22 <<<
------------------------------------------------------------------------------------------
KIND      NAMESPACE                       NAME                                        API_VERSION                 REPLACE_WITH (SINCE)
Ingress   <undefined>                     hmpps-delius-interventions-event-listener   networking.k8s.io/v1beta1   networking.k8s.io/v1 (1.19.0)
Ingress   polygraph-offender-management   poms-ingress                                networking.k8s.io/v1beta1   networking.k8s.io/v1 (1.19.0)
Ingress   <undefined>                     hmpps-interventions-onboarding              networking.k8s.io/v1beta1   networking.k8s.io/v1 (1.19.0)
__________________________________________________________________________________________
>>> Deprecated APIs removed in 1.25 <<<
------------------------------------------------------------------------------------------
KIND                      NAMESPACE                     NAME                                           API_VERSION           REPLACE_WITH (SINCE)
HorizontalPodAutoscaler   <undefined>                   court-list-splitter                            autoscaling/v2beta1   autoscaling/v2 (1.23.0)
CronJob                   c100-application-production   c100-application-cronjob-production            batch/v1beta1         batch/v1 (1.21.0)
CronJob                   c100-application-production   c100-application-cronjob-payments-production   batch/v1beta1         batch/v1 (1.21.0)
CronJob                   <undefined>                   dlq-transfer-cronjob                           batch/v1beta1         batch/v1 (1.21.0)
PodDisruptionBudget       <undefined>                   court-list-splitter                            policy/v1beta1        policy/v1 (1.21.0)
PodDisruptionBudget       <undefined>                   pre-sentence-service                           policy/v1beta1        policy/v1 (1.21.0)
PodDisruptionBudget       <undefined>                   pre-sentence-service-gotenberg                 policy/v1beta1        policy/v1 (1.21.0)
PodDisruptionBudget       <undefined>                   pre-sentence-service-wproofreader              policy/v1beta1        policy/v1 (1.21.0)
PodDisruptionBudget       <undefined>                   hmpps-community-accommodation-wiremock         policy/v1beta1        policy/v1 (1.21.0)
PodDisruptionBudget       c100-application-production   c100-application-pdb-production                policy/v1beta1        policy/v1 (1.21.0)
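
A report like the one above is easier to triage once it is grouped, for example by the release in which each API was removed. The sketch below parses text in the layout shown above, assuming columns are separated by two or more spaces and each section starts with a `>>> Deprecated APIs removed in X <<<` banner. `parse_kubent` is a hypothetical helper and `SAMPLE` is a shortened excerpt of the output above.

```python
import re
from collections import Counter

SECTION_RE = re.compile(r">>> Deprecated APIs removed in (\S+) <<<")

# Shortened excerpt of the report above, kept in the same column layout.
SAMPLE = """\
>>> Deprecated APIs removed in 1.22 <<<
------------------------------------------------------------------------------------------
KIND      NAMESPACE                       NAME           API_VERSION                 REPLACE_WITH (SINCE)
Ingress   polygraph-offender-management   poms-ingress   networking.k8s.io/v1beta1   networking.k8s.io/v1 (1.19.0)
>>> Deprecated APIs removed in 1.25 <<<
------------------------------------------------------------------------------------------
KIND                  NAMESPACE                     NAME                                  API_VERSION      REPLACE_WITH (SINCE)
CronJob               c100-application-production   c100-application-cronjob-production   batch/v1beta1    batch/v1 (1.21.0)
PodDisruptionBudget   <undefined>                   court-list-splitter                   policy/v1beta1   policy/v1 (1.21.0)
"""


def parse_kubent(text: str) -> list[dict]:
    """Parse the report into one dict per resource row."""
    rows = []
    removed_in = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        section = SECTION_RE.search(line)
        if section:
            removed_in = section.group(1)
            continue
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line)
        if removed_in is None or len(fields) != 5 or fields[0] == "KIND":
            continue  # skip rulers, blank lines, and column headers
        kind, namespace, name, api_version, replace_with = fields
        rows.append({
            "removed_in": removed_in,
            "kind": kind,
            "namespace": namespace,
            "name": name,
            "api_version": api_version,
            "replace_with": replace_with,
        })
    return rows


rows = parse_kubent(SAMPLE)
print(Counter(row["removed_in"] for row in rows))
```

Rows whose namespace is reported as `<undefined>` are kept as-is, so they can be followed up separately.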

@jackstockley89
Contributor Author

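Reply about the two Certificates (parliamentary-questions-duplicate and track-a-query-production-certificate-duplicate) that will stop working after the upgrade: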

Hi Jack,

Regarding your emails about track-a-query-production-certificate-duplicate and parliamentary-questions-duplicate. 
These are not certificates that are in use as far as I am aware. If you are looking to remove them, then that is not a problem.

Thanks
Andrew
