Merge pull request #4694 from pwschuurman/kep-3335-stable
KEP-3335: Update stable milestone to 1.31
k8s-ci-robot authored Jun 12, 2024
2 parents 1f17939 + c774fe9 commit fd25970
Showing 3 changed files with 44 additions and 14 deletions.
2 changes: 2 additions & 0 deletions keps/prod-readiness/sig-apps/3335.yaml
@@ -3,3 +3,5 @@ alpha:
  approver: "@wojtek-t"
beta:
  approver: "@wojtek-t"
stable:
  approver: "@wojtek-t"
51 changes: 39 additions & 12 deletions keps/sig-apps/3335-statefulset-slice/README.md
@@ -99,6 +99,7 @@ tags, and then generate with `hack/update-toc.sh`.
- [Graduation Criteria](#graduation-criteria)
- [Alpha](#alpha)
- [Beta](#beta)
- [GA](#ga)
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
- [Version Skew Strategy](#version-skew-strategy)
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
@@ -139,15 +140,15 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
- [X] (R) Design details are appropriately documented
- [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- [ ] e2e Tests for all Beta API Operations (endpoints)
- [ ] (R) Ensure GA e2e tests for meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
- [X] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [X] (R) Minimum Two Week Window for GA e2e tests to prove flake free
- [X] (R) Graduation criteria is in place
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [ ] (R) Production readiness review completed
- [ ] (R) Production readiness review approved
- [X] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [X] (R) Production readiness review completed
- [X] (R) Production readiness review approved
- [X] "Implementation History" section is up-to-date for milestone
- [X] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
- [X] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

<!--
**Note:** This checklist is iterative and should be reviewed and updated every time this enhancement is being considered for a milestone.
@@ -536,6 +537,11 @@ in back-to-back releases.
* Enable feature gate for e2e pipelines
* Add e2e tests

#### GA

* Real-world usage
  * [The LeaderWorkerSet API (LWS)](https://github.com/kubernetes-sigs/lws/blob/main/docs/setup/install.md)

### Upgrade / Downgrade Strategy

<!--
@@ -562,15 +568,19 @@ components? What are the guarantees? Make sure this is in the test plan.
Consider the following in developing a version skew strategy for this
enhancement:
- Does this enhancement involve coordinating behavior in the control plane and
in the kubelet? How does an n-2 kubelet without this feature available behave
when this feature is used?
- Does this enhancement involve coordinating behavior in the control plane and nodes?
- How does an n-3 kubelet or kube-proxy without this feature available behave when this feature is used?
- How does an n-1 kube-controller-manager or kube-scheduler without this feature available behave when this feature is used?
- Will any other components on the node change? For example, changes to CSI,
CRI or CNI may require updating that component before the kubelet.
-->

There are only `kube-controller-manager` changes involved (in addition to the apiserver changes for dealing with the new StatefulSet field). Node components are not involved so there is no version skew between nodes and the control plane.

An n-1 `kube-controller-manager` will have the same effect (when applicable) as rolling back
to a version where this feature is not enabled. See
[Rollout, upgrade and rollback planning](#rollout-upgrade-and-rollback-planning) for details.

## Production Readiness Review Questionnaire

<!--
@@ -729,7 +739,7 @@ Longer term, we may want to require automated upgrade/rollback tests, but we
are missing a bunch of machinery and tooling and can't do that now.
-->

Manual upgrade->downgrade->upgrade scenario (to be validated):
A manual upgrade->downgrade->upgrade scenario was performed:

- Create a cluster on a version that doesn't use this feature (eg: 1.26)
- Upgrade a cluster to a version that uses this feature (eg: 1.27)
@@ -784,7 +794,7 @@ and operation of this feature.
Recall that end users cannot usually observe component logs or access metrics.
-->

- [ ] Other (treat as last resort)
- [X] Other (treat as last resort)
- Details: The user can inspect the pods that are created by the StatefulSet
which match the StatefulSet's selector.
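
One way to make this check concrete is to compare the pod names returned for the StatefulSet's selector (for example, from `kubectl get pods -l <selector>`) against the ordinal range implied by `.spec.replicas` and `.spec.ordinals.start`. The following is a minimal sketch under that assumption; the helper name `compare_pods` and the sample pod names are hypothetical and not part of the KEP or of any Kubernetes client library.

```python
from typing import Iterable, Set, Tuple


def compare_pods(name: str, replicas: int, start: int,
                 observed: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (missing, unexpected) pod names for a StatefulSet.

    Expected pods are named <name>-<ordinal>, with ordinals running from
    `start` to `start + replicas - 1`.
    """
    expected = {f"{name}-{i}" for i in range(start, start + replicas)}
    seen = set(observed)
    return expected - seen, seen - expected


# Hypothetical example: replicas=3, start=5, so web-5..web-7 are expected.
missing, unexpected = compare_pods("web", 3, 5, ["web-5", "web-6", "web-0"])
print(sorted(missing))     # ['web-7']
print(sorted(unexpected))  # ['web-0']
```

An empty result in both sets suggests the pods match the configured ordinal range; a leftover low-ordinal pod such as `web-0` would show up as unexpected.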

@@ -959,6 +969,22 @@ This through this both in small and large cases, again with respect to the

No. Resource usage remains the same with this feature.

###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?

<!--
Focus not just on happy cases, but primarily on more pathological cases
(e.g. probes taking a minute instead of milliseconds, failed pods consuming resources, etc.).
If any of the resources can be exhausted, how this is mitigated with the existing limits
(e.g. pods per node) or new limits added by this KEP?
Are there any tests that were run/should be run to understand performance characteristics better
and validate the declared limits?
-->

No. This feature runs only on the control plane (the StatefulSet controller within `kube-controller-manager`).
It also doesn't increase node resource usage, as the number of expected StatefulSet replicas
remains constant whether or not this feature is in use (that is, whether or not `.spec.ordinals.start` is set).
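
To illustrate why the replica count is unaffected, here is a minimal sketch of how the ordinal range follows from `.spec.replicas` and `.spec.ordinals.start`. It is an illustration only, not the controller's actual Go implementation, and the function name `expected_pod_names` is hypothetical.

```python
from typing import List


def expected_pod_names(name: str, replicas: int, start: int = 0) -> List[str]:
    """Return the pod names a StatefulSet is expected to own.

    Ordinals run from `start` to `start + replicas - 1`, so the number of
    pods depends only on `replicas`, never on `start`.
    """
    if replicas < 0 or start < 0:
        raise ValueError("replicas and start must be non-negative")
    return [f"{name}-{ordinal}" for ordinal in range(start, start + replicas)]


# Default start ordinal (0) versus a start ordinal of 5:
print(expected_pod_names("web", 3))           # ['web-0', 'web-1', 'web-2']
print(expected_pod_names("web", 3, start=5))  # ['web-5', 'web-6', 'web-7']
```

Changing the start ordinal shifts the pod names but leaves the list length, and therefore the pod count per StatefulSet, unchanged.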

### Troubleshooting

<!--
@@ -1045,7 +1071,8 @@ Major milestones might include:

- 2022-06-02: KEP created.
- 2022-10-06: Alpha implementation.
- 2023-02-09: Beta implementation.
- 2023-02-09: Beta graduation.
- 2024-06-04: Stable graduation.

## Drawbacks

5 changes: 3 additions & 2 deletions keps/sig-apps/3335-statefulset-slice/kep.yaml
@@ -15,17 +15,18 @@ approvers:
- "@soltysh"

# The target maturity stage in the current dev cycle for this KEP.
stage: beta
stage: stable

# The most recent milestone for which work toward delivery of this KEP has been
# done. This can be the current (upcoming) milestone, if it is being actively
# worked on.
latest-milestone: "v1.27"
latest-milestone: "v1.31"

# The milestone at which this feature was, or is targeted to be, at each stage.
milestone:
  alpha: "v1.26"
  beta: "v1.27"
  stable: "v1.31"

# The following PRR answers are required at alpha release
# List the feature gate name and the components for which it must be enabled