Kubernetes support policy #8040

Closed
3 tasks done
sbueringer opened this issue Feb 1, 2023 · 7 comments · Fixed by #8189
Assignees
Labels
area/testing Issues or PRs related to testing kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@sbueringer
Member

sbueringer commented Feb 1, 2023

Context

Today every new Cluster API minor release adds support for additional Kubernetes minor releases. We rarely drop support for old Kubernetes minor releases.

Motivation

The support matrix for old Kubernetes releases has grown to a considerable 9 Kubernetes minor releases as of today; the oldest of those releases has been out of support since June 2021. The size of this matrix increases maintenance effort and infrastructure costs, and prevents Cluster API from using newer Kubernetes features.

Proposal

At the time of its “.0” release, a Cluster API minor release will support:

  • 4 Kubernetes minor releases for the management cluster
  • 6 Kubernetes minor releases for the workload cluster

When a new Kubernetes minor release becomes available, it will be supported in an upcoming Cluster API patch release.

Note: Support for a new Kubernetes minor release will only be backported to the latest supported Cluster API release.

E.g., Cluster API v1.4.0 would support Kubernetes versions:

  • v1.23.x to v1.26.x for the management cluster
  • v1.21.x to v1.26.x for the workload cluster
  • When Kubernetes 1.27 is released, it will be supported in v1.4.x

Details

The idea is to support a wide range of Kubernetes versions so our users are not forced to use the very latest Kubernetes releases, while not supporting versions so old that they compromise the maintainability of Cluster API.

We think 4 is a good number for management clusters, as it roughly matches the number of Kubernetes versions supported upstream by Kubernetes (xref: https://kubernetes.io/releases/).

We think 6 is a good number for workload clusters, as end users are usually slower to migrate workload clusters to newer Kubernetes versions, so we want to give them more time. It is also comparatively "cheap" for Cluster API to support more workload cluster versions, as Cluster API depends far more on the Kubernetes version in the management cluster (since the controllers run there).

We want to backport the support for a new Kubernetes minor release to the latest stable Cluster API release so users don't have to wait for the next Cluster API minor version to use the new Kubernetes release. We don't want to backport to all supported releases to reduce maintenance effort and to incentivize users to upgrade to the latest Cluster API version.

Tasks

/kind cleanup
/area testing

@k8s-ci-robot k8s-ci-robot added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. area/testing Issues or PRs related to testing needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 1, 2023
@sbueringer sbueringer self-assigned this Feb 1, 2023
@sbueringer
Member Author

cc @kubernetes-sigs/cluster-api-release-team

@killianmuldoon
Contributor

This seems like a really good idea to me vs. the current unpredictable cadence, which is driven by when a Kubernetes version becomes too hard to keep in the support matrix.

One thing to be aware of: given the CAPI support policy is n+1, there's a wider matrix of CAPI support for older Kubernetes versions across releases. That is, when 1.4 is released, 1.3 will stay in support, and 1.3 will support Kubernetes versions older than those supported by 1.4.

Additionally, we should consider putting hard checks on Kubernetes versions, e.g. in clusterctl and in webhooks, to enforce the support policy and inform users.

@fabriziopandini
Member

/triage accepted
+1 to moving forward with this effort; defining a rolling support window for Kubernetes versions is necessary for the health of the project.

WRT having safeguards in place, we already tried to give this effort a push in the past, see #7011 and #7010, but unfortunately no one volunteered for it (I assume because similar checks already exist in downstream products, but I'm not 100% sure).

Given that, I personally don't consider the safeguards implementation to be on the critical path for introducing the policy, but there is now definitely more reason to try to get them staffed, and it would be great if we can get them implemented in time.

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 2, 2023
@CecileRobertMichon
Contributor

Thank you @sbueringer for starting the conversation.

Overall I agree with reducing our maintenance + test matrix, just have two points I want to bring up:

  1. CAPI core itself should not care about the k8s version. It's the bootstrap and control plane providers that should define a support matrix and enforce it. This is important because different bootstrap providers may support new k8s versions at a different pace. For example, managed Kubernetes offerings have their own support policies for k8s and may not always support the latest 4 releases. Let's say a managed k8s service supports k8s versions 1.23-1.25 and CAPI changes its support matrix to support only k8s 1.24-1.27: then a user using CAPI with the managed cluster infra would be restricted to 1.24 and 1.25, which is only 2 versions. So with this in mind, I think the Kubernetes support policy for CAPI should only be "new versions", and blocking older versions should be done by bootstrap/control plane providers (e.g. KCP and CABPK).

  2. Brought this up a bit during office hours, but there is a difference between strongly discouraging (i.e. "there are no support guarantees around these older versions and we do not test/backport fixes for them") and enforcing/blocking users from using them. IMO we should document our support policy and "drop" support for old versions in the sense that we no longer test them/keep code around for them, but we should still give users an option to override that safeguard with some "unsafe" flag that lets them opt into using older or unsupported versions. This would allow CAPI to remain a flexible tool for developers that may need to build old clusters for dev and testing purposes.

@elmiko
Contributor

elmiko commented Feb 6, 2023

thanks for writing this up @sbueringer, in general i'm +1 to this proposal, although i do agree with the points that @CecileRobertMichon raised.

also,

Additionally, we should consider putting hard checks on Kubernetes versions, e.g. in clusterctl and in webhooks, to enforce the support policy and inform users.

i think this is a cool idea, but i would want them to be warnings only, for some of the reasons that @CecileRobertMichon pointed out. specifically, i really agree with this:

Brought this up a bit during office hours, but there is a difference between strongly discouraging (i.e. "there are no support guarantees around these older versions and we do not test/backport fixes for them") and enforcing/blocking users from using them. IMO we should document our support policy and "drop" support for old versions in the sense that we no longer test them/keep code around for them, but we should still give users an option to override that safeguard with some "unsafe" flag that lets them opt into using older or unsupported versions. This would allow CAPI to remain a flexible tool for developers that may need to build old clusters for dev and testing purposes.

@sbueringer
Member Author

sbueringer commented Feb 20, 2023

Thx for the feedback!

If I understand correctly, there are no objections against the proposed policy (i.e. which versions we want to support).

The discussion is mostly about how we could enforce versions:

  • We could have checks in clusterctl, webhooks, and controllers
  • Should we block versions in all controllers or only in CABPK & KCP?

I would like to continue this discussion on the issue which is specifically about this topic (#7010).

I think for the policy itself we can and should focus on which versions we actively support, test, and keep code paths around for.

I've opened a PR to document the policy: #8134

@sbueringer
Member Author

sbueringer commented Feb 27, 2023

Opened a PR to update the release tasks accordingly: #8189

Move auditing the code for version-specific handling to an issue for v1.5: #8190
