diff --git a/keps/sig-node/2008-forensic-container-checkpointing/README.md b/keps/sig-node/2008-forensic-container-checkpointing/README.md new file mode 100644 index 000000000000..c670fe5bb85a --- /dev/null +++ b/keps/sig-node/2008-forensic-container-checkpointing/README.md @@ -0,0 +1,256 @@ +# KEP-2008: Forensic Container Checkpointing + + +- [Release Signoff Checklist](#release-signoff-checklist) +- [Summary](#summary) +- [Motivation](#motivation) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) + - [Implementation](#implementation) + - [User Stories](#user-stories) + - [Risks and Mitigations](#risks-and-mitigations) +- [Design Details](#design-details) + - [Future Enhancements](#future-enhancements) + - [Test Plan](#test-plan) + - [Graduation Criteria](#graduation-criteria) + - [Alpha](#alpha) + - [Alpha to Beta Graduation](#alpha-to-beta-graduation) + - [Beta to GA Graduation](#beta-to-ga-graduation) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) + - [Feature Enablement and Rollback](#feature-enablement-and-rollback) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) + + +## Release Signoff Checklist + +Items marked with (R) are required *prior to targeting to a milestone / release*. 
+ +- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) +- [ ] (R) KEP approvers have approved the KEP status as `implementable` +- [ ] (R) Design details are appropriately documented +- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input +- [ ] (R) Graduation criteria is in place +- [ ] (R) Production readiness review completed +- [ ] Production readiness review approved +- [ ] "Implementation History" section is up-to-date for milestone +- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] +- [ ] Supporting documentation, e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes + +[kubernetes.io]: https://kubernetes.io/ +[kubernetes/enhancements]: https://git.k8s.io/enhancements +[kubernetes/kubernetes]: https://git.k8s.io/kubernetes +[kubernetes/website]: https://git.k8s.io/website + +## Summary + +Provide an interface to trigger a container checkpoint for forensic analysis. + +## Motivation + +Container checkpointing provides the functionality to take a snapshot of a +running container. The checkpointed container can be transferred to another +node, and the original container will never know that it was checkpointed. + +Restoring the container in a sandboxed environment provides a means to +forensically analyse a copy of the container to determine whether it might +have been a threat. As the analysis is happening on a copy of +the original container, a possible attacker of the original container +will not be aware of any sandboxed analysis. + +### Goals + +The goal of this KEP is to introduce *checkpoint* and *restore* to the CRI API. +This includes extending the *kubelet* API to support checkpointing single +containers with the forensic use case in mind. 
+ +### Non-Goals + +Although *checkpoint* and *restore* can be used to implement container +migration, this KEP is only about enabling the forensic use case. Checkpointing +a pod is not part of this proposal and is left for future enhancements. + +## Proposal + +### Implementation + +For the forensic use case we want to offer the functionality to checkpoint a +container out of a running Pod without stopping the checkpointed container or +letting the container know that it was checkpointed. + +The corresponding code changes for the forensic use case can be found in the +following pull request: + +* https://github.com/kubernetes/kubernetes/pull/104907 + +The goal is to introduce *checkpoint* and *restore* in a bottom-up approach. +As a first step we only want to extend the CRI API so that the container engine +can be asked to create a checkpoint, and to add the low-level primitives to the +*kubelet* to trigger it. The feature gate `ContainerCheckpointRestore` must be +enabled to be able to checkpoint containers. + +In the corresponding pull request a checkpoint is triggered using the *kubelet* +API: + +``` +curl -skv -X POST "https://localhost:10250/checkpoint/default/counters/wildfly" +``` + +For the first implementation we do not want to support restore in the +*kubelet*. With the focus on the forensic use case, the restore should happen +outside of Kubernetes. In this first step the restore is a container engine +only operation. + +The forensic use case is targeted to be part of the next (1.24) release. + +Although this KEP only adds checkpointing support to the kubelet, the +corresponding code pull request extends the CRI API to support both +*checkpoint* and *restore*. The reason to add *restore* to the CRI API without +implementing it in the kubelet is to make development and especially testing +easier on the container engine level. 
+ +### User Stories + +To analyze unusual activities in a container, the container should +be checkpointed without stopping it and without the container +knowing it was checkpointed. Using checkpointing it is possible to take +a copy of a running container for forensic analysis. The container will +continue to run without knowing a copy was created. This copy can then +be restored in another (sandboxed) environment in the context of another +container engine for detailed analysis of a possible attack. + +### Risks and Mitigations + +In its first implementation the risks are low, as the proposal is intended to +be a CRI API change with minimal changes to the kubelet, and it is gated by the +feature gate `ContainerCheckpointRestore`. + +## Design Details + +The feature gate `ContainerCheckpointRestore` will ensure that the API +graduation can be done in the standard Kubernetes way. + +A kubelet API to trigger the checkpointing of a container will be +introduced as described in [Implementation](#implementation). + +Also see https://github.com/kubernetes/kubernetes/pull/104907 for details. + +### Future Enhancements + +The initial implementation is only about checkpointing specific containers +out of a pod. In future versions we probably want to support checkpointing +complete pods. To checkpoint a complete pod, the container engine would be +expected to do a pod-level cgroup freeze before checkpointing the +containers in the pod. This ensures that all containers are checkpointed at the +same point in time and that no container keeps running while other +containers in the pod are being checkpointed. + +One possible result of being able to checkpoint and restore containers and pods +might be the possibility to migrate containers and pods in the future as +discussed in [#3949](https://github.com/kubernetes/kubernetes/issues/3949). 
+ +### Test Plan + +For alpha: +- Unit tests available + +For beta: +- CRI API changes need to be implemented by at least one + container engine +- Enable e2e testing + +### Graduation Criteria + +#### Alpha + +- [ ] Implement the new feature gate and kubelet implementation +- [ ] Ensure proper tests are in place +- [ ] Update documentation to make the feature visible + +#### Alpha to Beta Graduation + +At least one container engine has to have implemented the +corresponding CRI APIs so that e2e tests for checkpointing can be introduced. + +- [ ] Enable the feature by default +- [ ] No major bugs reported in the previous cycle + +#### Beta to GA Graduation + +TBD + +### Upgrade / Downgrade Strategy + +No changes are required on upgrade if the container engine supports +the corresponding CRI API changes. + +CRIU needs to be installed, but on most distributions it is already +a dependency of runc/crun. + +### Version Skew Strategy + +There is no explicit version skew strategy required because the feature acts as +a toggle switch. + +## Production Readiness Review Questionnaire + +### Feature Enablement and Rollback + +###### How can this feature be enabled / disabled in a live cluster? + +Using the feature gate `ContainerCheckpointRestore` the feature can be enabled. + +###### Does enabling the feature change any default behavior? + +No. + +###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? + +Yes, by disabling the feature gate `ContainerCheckpointRestore` again. + +###### What happens if we reenable the feature if it was previously rolled back? + +Checkpointing containers will be possible again. + +###### Are there any tests for feature enablement/disablement? + +Unit tests will temporarily enable the `ContainerCheckpointRestore` feature gate +so that the unit tests always run. 
+ +## Implementation History + +* 2020-09-16: Initial version of this KEP +* 2020-12-10: Opened pull request showing an end-to-end implementation of a possible use case +* 2021-02-12: Changed KEP to mention the *experimental* API as suggested in the SIG Node meeting 2021-02-09 +* 2021-04-08: Added section about Pod Lifecycle, Checkpoint Storage, Alternatives and Hooks +* 2021-07-08: Reworked structure and added missing details +* 2021-08-03: Added the forensic user story and highlighted the goal to implement it in small steps +* 2021-08-10: Added future work with information about pod level cgroup freezing +* 2021-09-15: Removed references to first proof of concept implementation +* 2021-09-21: Mention feature gate `ContainerCheckpointRestore` +* 2021-09-22: Removed everything which is not directly related to the forensic use case +* 2022-01-06: Reworked based on review + +## Drawbacks + +Not aware of any. + +## Alternatives + +Another possibility to use checkpoint restore would be, for example, to trigger +the checkpoint by a privileged sidecar container (`CAP_SYS_ADMIN`) and do the +restore through an Init container. + +The reason to integrate checkpoint restore directly into Kubernetes, rather +than relying on helpers such as sidecar and init containers, is that +checkpointing has been deeply integrated into multiple container runtimes and +engines for many years, and this integration has been reliable and well tested. +Going another way in Kubernetes would make the whole process much more +complicated and fragile. Using checkpoint and restore in Kubernetes without +going through the existing paths of runtimes and engines is not a well-known +approach and may not even be possible, because checkpointing and restoring +require much information that is only available by working closely with +runtimes and engines. 
diff --git a/keps/sig-node/2008-forensic-container-checkpointing/kep.yaml b/keps/sig-node/2008-forensic-container-checkpointing/kep.yaml new file mode 100644 index 000000000000..80fa6945668b --- /dev/null +++ b/keps/sig-node/2008-forensic-container-checkpointing/kep.yaml @@ -0,0 +1,43 @@ +title: Forensic Container Checkpointing +kep-number: 2008 +authors: + - "@adrianreber" +owning-sig: sig-node +participating-sigs: + - TBD +status: implementable +creation-date: 2020-09-16 +last-updated: 2022-01-17 +reviewers: + - "@mrunalp" + - "@elfinhe" +approvers: + - "@dchen1107" +prr-approvers: + - "@ehashman" + +# The target maturity stage in the current dev cycle for this KEP. +stage: alpha + +# The most recent milestone for which work toward delivery of this KEP has been +# done. This can be the current (upcoming) milestone, if it is being actively +# worked on. +latest-milestone: "v1.24" + +# The milestone at which this feature was, or is targeted to be, at each stage. +milestone: + alpha: "v1.24" + beta: "v1.25" + stable: "v1.27" + +# The following PRR answers are required at alpha release +# List the feature gate name and the components for which it must be enabled +feature-gates: + - name: ContainerCheckpointRestore + components: + - kubelet +disable-supported: true + +# The following PRR answers are required at beta release +metrics: + - "N/A"
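For illustration, the `feature-gates` entry in `kep.yaml` names `ContainerCheckpointRestore` for the kubelet component. A minimal sketch of enabling it through a kubelet configuration file might look like the following. Using a `KubeletConfiguration` file rather than the kubelet's `--feature-gates` command-line flag is an assumption of this example and is not prescribed by the KEP.

```yaml
# Sketch only: enables the alpha feature gate named in kep.yaml.
# Where this file lives and how the kubelet is pointed at it is
# deployment-specific and not covered by the KEP.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  ContainerCheckpointRestore: true
```

With the gate enabled and a container engine that implements the corresponding CRI API, the `POST` request to the kubelet `/checkpoint/...` endpoint shown in the README is expected to create a checkpoint. Per the README's rollback answer, disabling the gate again turns checkpointing off.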