diff --git a/keps/prod-readiness/sig-api-machinery/2839.yaml b/keps/prod-readiness/sig-api-machinery/2839.yaml new file mode 100644 index 00000000000..40eb40fdce5 --- /dev/null +++ b/keps/prod-readiness/sig-api-machinery/2839.yaml @@ -0,0 +1,6 @@ +# The KEP must have an approver from the +# "prod-readiness-approvers" group +# of http://git.k8s.io/enhancements/OWNERS_ALIASES +kep-number: 2839 +alpha: + approver: "@johnbelamaric" diff --git a/keps/sig-api-machinery/2839-in-use-protection/README.md b/keps/sig-api-machinery/2839-in-use-protection/README.md new file mode 100644 index 00000000000..db6c2d4cd54 --- /dev/null +++ b/keps/sig-api-machinery/2839-in-use-protection/README.md @@ -0,0 +1,740 @@ +# KEP-2839: In-use protection + + + + +- [Release Signoff Checklist](#release-signoff-checklist) +- [Summary](#summary) +- [Motivation](#motivation) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) +- [Glossary](#glossary) + - [User Stories (Optional)](#user-stories-optional) + - [Story 1](#story-1) + - [Story 2](#story-2) + - [Risks and Mitigations](#risks-and-mitigations) +- [Design Details](#design-details) + - [API Changes](#api-changes) + - [Other Design Considerations](#other-design-considerations) + - [Behavior with ownerReference](#behavior-with-) + - [Namespace Deletion](#namespace-deletion) + - [Block Adding Additional Liens while Deleting](#block-adding-additional--while-deleting) + - [Race of removing/adding liens](#race-of-removingadding-liens) + - [Unresolved issues](#unresolved-issues) + - [Test Plan](#test-plan) + - [Prerequisite testing updates](#prerequisite-testing-updates) + - [Unit tests](#unit-tests) + - [Integration tests](#integration-tests) + - [e2e tests](#e2e-tests) + - [Graduation Criteria](#graduation-criteria) + - [Alpha](#alpha) + - [Beta](#beta) + - [GA](#ga) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Production Readiness Review 
Questionnaire](#production-readiness-review-questionnaire) + - [Feature Enablement and Rollback](#feature-enablement-and-rollback) + - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) + - [Monitoring Requirements](#monitoring-requirements) + - [Dependencies](#dependencies) + - [Scalability](#scalability) + - [Troubleshooting](#troubleshooting) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) +- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) + + +## Release Signoff Checklist + + + +Items marked with (R) are required *prior to targeting to a milestone / release*. + +- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) +- [ ] (R) KEP approvers have approved the KEP status as `implementable` +- [ ] (R) Design details are appropriately documented +- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) + - [ ] e2e Tests for all Beta API Operations (endpoints) + - [ ] (R) Ensure GA e2e tests for meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) + - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free +- [ ] (R) Graduation criteria is in place + - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) +- [ ] (R) Production readiness review completed +- [ ] (R) Production readiness review approved +- [ ] "Implementation History" section is up-to-date for milestone +- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] +- [ ] Supporting documentation—e.g., additional design documents, 
links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes + + + +[kubernetes.io]: https://kubernetes.io/ +[kubernetes/enhancements]: https://git.k8s.io/enhancements +[kubernetes/kubernetes]: https://git.k8s.io/kubernetes +[kubernetes/website]: https://git.k8s.io/website + +## Summary + +This KEP proposes a generic feature to protect objects from deletion while they are marked as in-use. + +## Motivation + +Currently, there is no generic mechanism that prevents a resource from being deleted while it is in use. +Instead, each controller that needs such protection, like [pv-protection](https://github.com/kubernetes/enhancements/issues/499) and [pvc-protection](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/postpone-pvc-deletion-if-used-in-a-pod.md), implements its own logic to protect objects from being deleted while in use. +These controllers use [Finalizers](https://kubernetes.io/docs/concepts/overview/working-with-objects/finalizers/) to block deletion. +Finalizers block the _completion_ of a delete operation, but they do not prevent the deletion from starting. +Once started, a delete _will_ complete and cannot be aborted. + +Finalizers may not be appropriate for some resource protection use cases, because they don't prevent other controllers from executing their pre-deletion actions. +The order of execution of pre-deletion actions across finalizers is not defined. +As a result, other controllers can execute their pre-deletion actions while a finalizer for protection still exists. + +This KEP aims to provide a generic mechanism for protection that solves the above issues with Finalizers. The same need also arises for protecting Secrets and ConfigMaps in [KEP-2639](https://github.com/kubernetes/enhancements/pull/2640). 
+ +Potential use cases are: +- To guarantee proper order of deletion: + - Protect `PersistentVolume` while being used by `PersistentVolumeClaim` (Replace the background logic for pv-protection) + - Protect `PersistentVolumeClaim` while being used by `Pod` (Replace the background logic for pvc-protection) + - Protect `Secret` while being used by `Pod`, `PersistentVolume`, `VolumeSnapshotContent` (Secret protection above) + - Protect `ConfigMap` while being used by `Pod` + - Protect `StorageClass` while being used by `PersistentVolume` + - Protect `VolumeSnapshot` while being used by `VolumeSnapshotContent` + - Protect `VolumeSnapshotClass` while being used by `VolumeSnapshotContent` + - Protect `User`, `Group`, `ServiceAccount` and `Role` from `RoleBinding` + - Protect `User`, `Group`, `ServiceAccount` and `ClusterRole` from `ClusterRoleBinding` + - Protect dependent resources while deletion of their owner resources hasn't been requested +- To avoid accidental deletion of important resources: + - Protect resources that controllers or users marked as important + +### Goals + +- Provide a generic mechanism that prevents a resource, including CRD, from being deleted when it shouldn't be deleted +- Implement the feature as an advisory feature + +### Non-Goals + +- Provide a specific mechanism to decide which particular objects should be prevented from being deleted when other particular objects exist +- Provide a generic mechanism that prevents a resource from being updated when it shouldn't be updated +- Implement any of the above potential use cases + +## Proposal + +A new field, `Liens`, is introduced in object metadata to mark an object as not to be deleted. +Deletion requests for resources with non-empty `Liens` will be blocked by a newly introduced validation in api-server. + +## Glossary + +- Lien: An indication that some entity is relying on an API object. Deletion of objects with liens will be prevented until the interested party releases the lien. 
+ +### User Stories (Optional) + +#### Story 1 + +A separate controller that protects Secrets from deletion watches all Secrets and the objects that may use them, like `Pod`, `PersistentVolume`, and `VolumeSnapshotContent`. Once it finds that a Secret is used by one of these objects, it adds a lien to the Secret's `Liens`, which asks this mechanism to block deletion requests for the Secret. + +#### Story 2 + +A user knows this is an important object, and that Terraform likes to delete and recreate objects, so the user sets a lien so that an accidental deletion will fail. + + + +### Risks and Mitigations + + + +## Design Details + +Most of the design ideas come from [here](https://github.com/kubernetes/kubernetes/issues/10179). +Users or controllers can add a string to `Liens` to ask to protect a resource. +A newly introduced validation in api-server will reject deletion requests for resources with non-empty `Liens` and return ["409 Conflict" error code](https://datatracker.ietf.org/doc/html/rfc2616#section-10.4.10). + +`Liens` is defined as a slice of strings, like `Finalizers`. +The strings need to be namespaced keys. +Multiple users or controllers can add their own liens for their own purposes, and they can remove their own liens when they are no longer needed. +Liens should be added on a per-controller or per-user basis. +Deletion requests for the resource are blocked until the last lien on the resource is removed +(Just to be clear, the difference between `Liens` and `Finalizers` is that `Liens` blocks the deletion request itself, while `Finalizers` blocks the deletion from completing). + +A PoC implementation for in-use protection can be found [here](https://github.com/mkimuram/kubernetes/commits/lien). +An example of how it can be consumed by a Secret protection controller can be found [here](https://github.com/mkimuram/secret-protection/tree/lien). + +### API Changes + +`Liens` is added to `ObjectMeta`. 
+ +```go +Liens []string +``` + +Validation criteria of the field are as follows: +- Keys must be namespaced (example: `kubernetes.io/secret-protection`; `foo.example/bar`) +- Maximum length of each key is 253 characters +- Maximum number of keys is 32 + +### Other Design Considerations + +#### Behavior with `ownerReference` + +A lien itself only blocks deletion requests to the object it is added to. +However, it may not be clear how a lien behaves when used with [`ownerReference`](https://kubernetes.io/docs/concepts/architecture/garbage-collection/#owners-dependents). +Therefore, this section describes the behavior. + +In summary, a lien blocks cascading deletion, but does not block deletion of any dependent resources individually. + +For example, let's assume that we have Deployment A which manages ReplicaSet B which manages Pod C. +In this situation, `ownerReferences` are set from Pod C to ReplicaSet B and from ReplicaSet B to Deployment A. + +If a lien is set on Deployment A, only a deletion request to Deployment A is blocked. +Users can still request to delete ReplicaSet B and Pod C directly, but they can't request to delete them through Deployment A. + +If a lien is set on ReplicaSet B, only a deletion request to ReplicaSet B is blocked. +Users can still request to delete Deployment A and Pod C directly. +When deletion of Deployment A is requested, whether it completes immediately or not depends on the cascading policy and ReplicaSet B's `blockOwnerDeletion`. +The behaviors are as follows: + +- Foreground cascading deletion: + - `blockOwnerDeletion=true`: Deletion of Deployment A isn't completed until ReplicaSet B is deleted + - otherwise: Deployment A is deleted immediately +- Background cascading deletion: Deployment A is deleted immediately + +#### Namespace Deletion + +It may also not be clear how [namespace deletion](https://kubernetes.io/docs/tasks/administer-cluster/namespaces/#deleting-a-namespace) behaves with lien. 
+Therefore, this section describes the behavior. + +If a lien is set on a namespace, only a deletion request to the namespace is blocked. +Users can still request to delete each object in the namespace directly, but they can't request to delete it through the namespace deletion. + +If a lien is set on some of the objects in a namespace, a deletion request to the namespace isn't blocked. +This means that the resources with liens in the namespace will eventually be deleted by the namespace lifecycle controller. +Users can request to delete the namespace or other objects in the namespace. + +In addition, users may expect that the order of resource deletions inside a namespace is guaranteed even when requested through the namespace deletion. +For alpha, resources with liens are deleted in a nondeterministic order on the namespace deletion. +Handling order of deletion is a beta blocker and won't be included in the initial implementation. + +#### Block Adding Additional `Liens` while Deleting + +`Liens` shouldn't be added after `DeletionTimestamp` is non-nil, which means pre-deletion processes for finalizers are being handled. +Adding `Liens` during deletion does no harm to the cluster by itself, because the resource will be deleted without any further deletion request that `Liens` could block. +However, users may think that the successful addition of the `Liens` means that the resource won't be deleted until the `Liens` are removed, which isn't true. +To avoid such a misunderstanding, the API server must block any request that adds `Liens` to a resource with non-nil `DeletionTimestamp`. + +#### Race of removing/adding liens + +A cluster-admin who wants to delete an object with a lien has to remove the lien and then delete the object before a namespace editor is able to place the lien back. As a result, a namespace admin or editor can create an object that a cluster-admin or the namespace lifecycle controller cannot delete, which shouldn't be allowed. 
+ +To mitigate the risk, a new `IgnoreLiens` API option will be added to `DeleteOptions` to force deletion of a resource with liens. + +Only a limited set of users and groups should be allowed to specify the option; the api-server will therefore reject deletion requests that carry the option from users who are not allowed to use it. +Restricting the `IgnoreLiens` API option to specific users is a beta blocker and won't be included in the initial implementation. +Therefore, in the initial implementation, any user with delete permission will be able to delete resources with liens by specifying the `IgnoreLiens` API option. + +#### Unresolved issues + +- Decision: When you delete a namespace and an object in that namespace has a lien: + - Do the namespace and the object get deleted anyway? + - [x] Yes. + - Does the namespace deletion get blocked (i.e., the delete request fails) if any object in the namespace has a lien? + - [x] No. + - Do the other objects in the namespace get deleted, but not that object, preventing the namespace deletion from completing? + - [x] No. If you delete the namespace, the content will be removed. + - Does the namespace deletion not proceed until all objects in the namespace are free of liens? + - [ ] No. If you delete the namespace, the content will be removed. Namespace lifecycle cleanup will not honor liens. + - [x] No. If you delete the namespace, the content will be removed. Namespace lifecycle cleanup will honor liens only to guarantee the deletion order inside a namespace (guarantee of the deletion order is a beta blocker). + - [ ] Yes. Namespace lifecycle cleanup will honor liens. If you delete the namespace, it won't be deleted until all the content is deleted. 
+- Name: + - [x] "liens" is short and precise, but an unusual word + - [ ] Other candidates: InUse, deleteInhibitors, hold, lease, claim, deletionBlockers, protections, guards + +Please also see the original comments [here](https://github.com/kubernetes/enhancements/pull/2840#issuecomment-1023774538) and [here](https://github.com/kubernetes/enhancements/pull/2840#issuecomment-1024437024). + +### Test Plan + +[x] I/we understand the owners of the involved components may require updates to +existing tests to make this code solid enough prior to committing the changes necessary +to implement this enhancement. + +##### Prerequisite testing updates + + + +##### Unit tests + + + + + +- `staging/src/k8s.io/apiserver/pkg/registry/rest`: `2022/6/13` - `83.3%` +- `staging/src/k8s.io/kubectl/pkg/cmd/delete`: `2022/6/13` - `76.1%` +- `pkg/controller/namespace/deletion`: `2022/6/13` - `68.6%` + +##### Integration tests + + + +- ["test-lien"](https://github.com/kubernetes/kubernetes/blob/master/test/integration/apiserver/apply/apply_crd_test.go): +- ["test-namespace-conditions"](https://github.com/kubernetes/kubernetes/blob/master/test/integration/namespace/ns_conditions_test.go#L47): + +##### e2e tests + + + +- Verify immediate deletion of a secret with empty liens: +- Verify that setting liens field is blocked if key is not namespaced: +- Verify that setting liens field is blocked if the length of any keys is longer than 253: +- Verify that setting liens field is blocked if the number of the keys is more than 32: +- Verify that secret with non-empty liens is not removed immediately: +- Verify that foreground owner deletion isn't complete while dependent with blockOwnerDeletion=true and lien exists: +- Verify that namespace deletion completes while a resource with lien exists in the namespace: +- Verify that adding liens to non-nil DeletionTimestamp fails: + +### Graduation Criteria + +#### Alpha + +- Feature implemented behind a feature flag +- Initial e2e tests completed and 
enabled + +#### Beta + +- Gather feedback from developers and surveys +- Additional tests are in Testgrid and linked in KEP + +#### GA + +- Allowing time for feedback + +**Note:** Generally we also wait at least two releases between beta and +GA/stable, because there's no opportunity for user feedback, or even bug reports, +in back-to-back releases. + +**For non-optional features moving to GA, the graduation criteria must include +[conformance tests].** + +[conformance tests]: https://git.k8s.io/community/contributors/devel/sig-architecture/conformance-tests.md + +### Upgrade / Downgrade Strategy + + +- Upgrade: + - Method: Enable the InUseProtection feature gate + - Behavior: + - Setting lien field is allowed unless `DeletionTimestamp` is non-nil + - Deletion request for resources with non-empty lien is blocked +- Downgrade: + - Method: Disable the InUseProtection feature gate + - Behavior: + - Setting lien field is blocked + - Deletion request for resources with non-empty lien isn't blocked + +### Version Skew Strategy + + + +## Production Readiness Review Questionnaire + + + +### Feature Enablement and Rollback + + + +###### How can this feature be enabled / disabled in a live cluster? + +- [x] Feature gate (also fill in values in `kep.yaml`) + - Feature gate name: InUseProtection + - Components depending on the feature gate: kube-apiserver +- [ ] Other + - Describe the mechanism: + - Will enabling / disabling the feature require downtime of the control + plane? + - Will enabling / disabling the feature require downtime or reprovisioning + of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled). + +###### Does enabling the feature change any default behavior? + +Deletion requests for an object are blocked while its `Liens` field is non-empty. + +###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? + +Yes, by disabling the feature gates. 
+ +###### What happens if we reenable the feature if it was previously rolled back? + +Deletion requests for an object are again blocked while its `Liens` field is non-empty. + +###### Are there any tests for feature enablement/disablement? + +Tests covering feature enablement/disablement will be added prior to the Alpha release. + +### Rollout, Upgrade and Rollback Planning + + + +###### How can a rollout or rollback fail? Can it impact already running workloads? + + + +###### What specific metrics should inform a rollback? + + + +###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? + + + +###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.? + + + +### Monitoring Requirements + + + +###### How can an operator determine if the feature is in use by workloads? + + + +###### How can someone using this feature know that it is working for their instance? + + + +- [ ] Events + - Event Reason: +- [ ] API .status + - Condition name: + - Other field: +- [ ] Other (treat as last resort) + - Details: + +###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? + + + +###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? + + + +- [ ] Metrics + - Metric name: + - [Optional] Aggregation method: + - Components exposing the metric: +- [ ] Other (treat as last resort) + - Details: + +###### Are there any missing metrics that would be useful to have to improve observability of this feature? + + + +### Dependencies + + + +###### Does this feature depend on any specific services running in the cluster? + + + +### Scalability + + + +###### Will enabling / using this feature result in any new API calls? + +Not directly, but users or controllers may issue additional deletion requests as retries. + +###### Will enabling / using this feature result in introducing new API types? + +No. 
+ +###### Will enabling / using this feature result in any new calls to the cloud provider? + +No. + +###### Will enabling / using this feature result in increasing size or count of the existing API objects? + +Describe them, providing: + - API type(s): `ObjectMeta` + - Estimated increase in size: a slice of `Liens` of at most 8,096 bytes (32 keys * 253 characters) + - Estimated amount of new objects: a new slice of `Liens` for every existing `ObjectMeta` + +###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs? + + + +###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components? + + + +### Troubleshooting + + + +###### How does this feature react if the API server and/or etcd is unavailable? + +If the API server and/or etcd is unavailable, deletion cannot happen either, for any controller. +Therefore, it won't affect the protection. + +###### What are other known failure modes? + + +- The garbage collector keeps trying to delete a dependent resource to which a lien has been added + - Detection: + - Deletion of an owner of a resource isn't completed. + - A dependent resource isn't deleted. + - Mitigations: Delete the lien on the dependent resource. + - Diagnostics: + - Log messages: A [message](https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/garbagecollector/garbagecollector.go#L475) like the one below is logged repeatedly. + `I0908 22:00:47.800913 4063637 garbagecollector.go:475] "Processing object" object="default/nginx-deployment-66b6c48dd5" objectUID=d38e5bc2-1b10-4f08-8e54-6c6c9afbfe3c kind="ReplicaSet" virtual=false` + - Log Level: 2 + - Testing: E2E test `Verify that foreground owner deletion isn't complete while dependent with blockOwnerDeletion=true and lien exists` covers this failure mode. + +###### What steps should be taken if SLOs are not being met to determine the problem? 
+ +## Implementation History + + + +## Drawbacks + + + +## Alternatives + + +- Each controller for protecting objects implements its own logic. However, this requires many implementations of the same logic and potentially exposes inconsistent user interfaces to users, like many different finalizers and different ways to opt out. +- Implement a similar mechanism by using finalizers or an admission webhook to block deletion + +## Infrastructure Needed (Optional) + + diff --git a/keps/sig-api-machinery/2839-in-use-protection/kep.yaml b/keps/sig-api-machinery/2839-in-use-protection/kep.yaml new file mode 100644 index 00000000000..4559d47c174 --- /dev/null +++ b/keps/sig-api-machinery/2839-in-use-protection/kep.yaml @@ -0,0 +1,52 @@ +title: In-use protection +kep-number: 2839 +authors: + - "@mkimuram" +owning-sig: sig-api-machinery +participating-sigs: + - sig-api-machinery + - sig-storage +status: implementable +creation-date: 2021-07-26 +reviewers: + - "@liggitt" + - "@lavalamp" +approvers: + - "@deads2k" + - "@lavalamp" + - "@fedebongio" + +##### WARNING !!! ###### +# prr-approvers has been moved to its own location +# You should create your own in keps/prod-readiness +# Please make a copy of keps/prod-readiness/template/nnnn.yaml +# to keps/prod-readiness/sig-xxxxx/00000.yaml (replace with kep number) +#prr-approvers: + +see-also: + - "/keps/sig-storage/2639-secret-protection/kep.yaml" + +# The target maturity stage in the current dev cycle for this KEP. +stage: alpha + +# The most recent milestone for which work toward delivery of this KEP has been +# done. This can be the current (upcoming) milestone, if it is being actively +# worked on. +latest-milestone: "v1.26" + +# The milestone at which this feature was, or is targeted to be, at each stage. 
+milestone: + alpha: "v1.26" + beta: "v1.27" + stable: "v1.29" + +# The following PRR answers are required at alpha release +# List the feature gate name and the components for which it must be enabled +feature-gates: + - name: InUseProtection + components: + - kube-apiserver +disable-supported: true + +# The following PRR answers are required at beta release +metrics: