Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Version validation according Kubernetes Version Skew Policy #7011

Closed
chrischdi opened this issue Aug 4, 2022 · 19 comments
Closed

Version validation according Kubernetes Version Skew Policy #7011

chrischdi opened this issue Aug 4, 2022 · 19 comments
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@chrischdi
Copy link
Member

User Story

As a operator I would like to ensure that creating/updating a Cluster/KubeadmControlPlane/MachineDeployment/MachineSet/Machine does not violate the [Kubernetes Version Skew Policy] for staying in supported upgrade paths and not break running applications.

Detailed Description

This issue proposes to add additional validation against the [Kubernetes Version Skew Policy].

Assuming version v1.X is the desired kubernetes version:

  • For a KubeadmControlPlane or a ControlPlane Machine:
    • [...] the newest and oldest kube-apiserver instances must be within one minor version.

  • For a MachineDeployment, MachineSet or workload Machine
    • [...] must not be newer than kube-apiserver, and may be up to two minor versions older.

To-be-discussed:

  • Should the Kubernetes Version Skew Policy validations only get implemented for ClusterClass based Clusters / Validation on Cluster.spec.topology.version?
    • An arguments for that is: on ClusterClass based Clusters the topology and relations are well-defined with a top-down approach.

Anything else you would like to add:

  • There is already version validation at the ClusterClass based clusters (via .spec.topology.version). xref

    • The current implementation does not allow to downgrade patch versions, which would be allowed from the Kubernetes Version Skew policy side.
  • Kubeadm also does validation against the Kubernetes Version Skew Policy, but this may not map to the current documentation at [Kubernetes Version Skew Policy].

/kind feature

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 4, 2022
@neolit123
Copy link
Member

neolit123 commented Aug 4, 2022

Kubeadm also does validation against the Kubernetes Version Skew Policy, but this may not map to the current documentation at [Kubernetes Version Skew Policy].

the supported skew in kubeadm is documented here
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#version-skew-policy

that said it does not strictly follow some more controversial areas, such as the kube-proxy / kubelet skew.

@killianmuldoon
Copy link
Contributor

I think there's a lot of overlap with this issue #7010 and existing issues e.g. #4321, #6614, #6040 - there might be more.

It would be good to ensure that the older issues are all covered by the newer ones, close the older ones and continue the conversation on these more holistic issues.

@chrischdi
Copy link
Member Author

I agree for #6040 regarding this issue.

#4321 and #6614 are more related to #7010

@fabriziopandini
Copy link
Member

Thanks for looking for duplicates; I have consolidated everything in #7010 and in this issue

@sbueringer
Copy link
Member

sbueringer commented Aug 16, 2022

Q: Should the validation only consider the spec.version or should it also consider the current status?

E.g. if we only validate the spec it's possible to bump the version two minor versions before the rollout is finished.

@chrischdi
Copy link
Member Author

Q: Should the validation only consider the spec.version or should it also consider the current status?

E.g. if we only validate the spec it's possible to bump the version two minor versions before the rollout is finished.

Maybe yes, maybe no, also related here: #6651 .

@fabriziopandini
Copy link
Member

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 30, 2022
@fabriziopandini
Copy link
Member

/help

@k8s-ci-robot
Copy link
Contributor

@fabriziopandini:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Sep 30, 2022
@ramineni
Copy link
Contributor

/assign

@sbueringer
Copy link
Member

sbueringer commented Oct 19, 2022

Do we have a clear idea what exactly we want to implement and where?
(just asking, because it would be good to have consensus on that before we go back and forth on an implementation)

@fabriziopandini
Copy link
Member

+1 to nail down some details first
@ramineni is it ok for your to some comments here as soon as you make up your mind around this topic/before starting implementation?

@ramineni
Copy link
Contributor

Do we have a clear idea what exactly we want to implement and where? (just asking, because it would be good to have consensus on that before we go back and forth on an implementation)

Have not gone that much far, just started to explore the details . I'll sure discuss here or in slack before going for implementation.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 8, 2023
@chrischdi
Copy link
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 8, 2023
@ramineni ramineni removed their assignment Mar 21, 2023
@k8s-triage-robot
Copy link

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Mar 20, 2024
@fabriziopandini
Copy link
Member

/priority important-longterm

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Apr 12, 2024
@fabriziopandini
Copy link
Member

The Cluster API project currently lacks enough active contributors to adequately respond to all issues and PRs.

We mitigated the issue with machine set preflight checks and we highly reccommend users to turn them on.
/close

@k8s-ci-robot
Copy link
Contributor

@fabriziopandini: Closing this issue.

In response to this:

The Cluster API project currently lacks enough active contributors to adequately respond to all issues and PRs.

We mitigated the issue with machine set preflight checks and we highly reccommend users to turn them on.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Projects
None yet
Development

No branches or pull requests

8 participants