Add Forensic Container Checkpointing KEP
Signed-off-by: Adrian Reber <areber@redhat.com>
adrianreber committed Jan 17, 2022
1 parent 6ec5481 commit 8f8cc15
Showing 3 changed files with 302 additions and 0 deletions.
3 changes: 3 additions & 0 deletions keps/prod-readiness/sig-node/2008.yaml
@@ -0,0 +1,3 @@
kep-number: 2008
alpha:
  approver: "@ehashman"
256 changes: 256 additions & 0 deletions keps/sig-node/2008-forensic-container-checkpointing/README.md
@@ -0,0 +1,256 @@
# KEP-2008: Forensic Container Checkpointing

<!-- toc -->
- [Release Signoff Checklist](#release-signoff-checklist)
- [Summary](#summary)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Proposal](#proposal)
- [Implementation](#implementation)
- [User Stories](#user-stories)
- [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
- [Future Enhancements](#future-enhancements)
- [Test Plan](#test-plan)
- [Graduation Criteria](#graduation-criteria)
- [Alpha](#alpha)
- [Alpha to Beta Graduation](#alpha-to-beta-graduation)
- [Beta to GA Graduation](#beta-to-ga-graduation)
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
- [Version Skew Strategy](#version-skew-strategy)
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
- [Feature Enablement and Rollback](#feature-enablement-and-rollback)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
<!-- /toc -->

## Release Signoff Checklist

Items marked with (R) are required *prior to targeting to a milestone / release*.

- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
- [ ] (R) Design details are appropriately documented
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
- [ ] (R) Graduation criteria is in place
- [ ] (R) Production readiness review completed
- [ ] Production readiness review approved
- [ ] "Implementation History" section is up-to-date for milestone
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

[kubernetes.io]: https://kubernetes.io/
[kubernetes/enhancements]: https://git.k8s.io/enhancements
[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
[kubernetes/website]: https://git.k8s.io/website

## Summary

Provide an interface to trigger a container checkpoint for forensic analysis.

## Motivation

Container checkpointing provides the functionality to take a snapshot of a
running container. The checkpointed container can be transferred to another
node and the original container will never know that it was checkpointed.

Restoring the container in a sandboxed environment provides a means to
forensically analyse a copy of the container to understand whether it
poses a threat. Because the analysis happens on a copy of the original
container, a possible attacker of the original container will not be
aware of any sandboxed analysis.

### Goals

The goal of this KEP is to introduce *checkpoint* and *restore* to the CRI API.
This includes extending the *kubelet* API to support checkpointing single
containers with the forensic use case in mind.

### Non-Goals

Although *checkpoint* and *restore* can be used to implement container
migration, this KEP is only about enabling the forensic use case. Checkpointing
a pod is not part of this proposal and is left for future enhancements.

## Proposal

### Implementation

For the forensic use case we want to offer the functionality to checkpoint a
container out of a running Pod without stopping the checkpointed container or
letting the container know that it was checkpointed.

The corresponding code changes for the forensic use case can be found in the
following pull request:

* https://github.com/kubernetes/kubernetes/pull/104907

The goal is to introduce *checkpoint* and *restore* in a bottom-up approach.
As a first step, we only want to extend the CRI API so that the container
engine can be asked to create a checkpoint, and to add the low-level
primitives to the *kubelet* that trigger it. The feature gate
`ContainerCheckpointRestore` must be enabled to be able to checkpoint containers.

In the corresponding pull request a checkpoint is triggered using the *kubelet*
API:

```
curl -skv -X POST "https://localhost:10250/checkpoint/default/counters/wildfly"
```
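
The path in the call above appears to follow the pattern
`/checkpoint/<namespace>/<pod>/<container>`. A minimal sketch of a wrapper
around that call could look as follows. The function names and the
`KUBELET_HOST` and `KUBELET_PORT` variables are hypothetical and not part of
this KEP, and a real kubelet will usually also require client credentials.

```shell
#!/bin/sh
# Hypothetical helper: build the kubelet checkpoint URL for one container.
# KUBELET_HOST and KUBELET_PORT are illustrative variables with defaults
# taken from the curl example above.
checkpoint_url() {
    namespace=$1
    pod=$2
    container=$3
    printf 'https://%s:%s/checkpoint/%s/%s/%s\n' \
        "${KUBELET_HOST:-localhost}" "${KUBELET_PORT:-10250}" \
        "$namespace" "$pod" "$container"
}

# Hypothetical wrapper: send the POST request shown above.
# -s hides progress output; -k skips TLS verification (test setups only).
checkpoint_container() {
    curl -sk -X POST "$(checkpoint_url "$1" "$2" "$3")"
}
```

With these definitions, `checkpoint_container default counters wildfly` would
send the same request as the curl command above.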

For the first implementation, we do not want to support restore in the
*kubelet*. With the focus on the forensic use case, the restore should happen
outside of Kubernetes. In this first step, the restore is an operation
performed only by the container engine.

The forensic use case is targeted to be part of the next (1.24) release.

Although this KEP only adds checkpointing support to the kubelet, the
corresponding pull request extends the CRI API to support both *checkpoint*
and *restore*. The reason to add *restore* to the CRI API without
implementing it in the kubelet is to make development, and especially testing,
easier at the container engine level.

### User Stories

To analyze unusual activity in a container, the container should be
checkpointed without stopping it and without the container knowing that it
was checkpointed. Checkpointing makes it possible to take a copy of a running
container for forensic analysis while the original continues to run. This copy
can then be restored in another (sandboxed) environment, in the context of
another container engine, for detailed analysis of a possible attack.

### Risks and Mitigations

The risks of the first implementation are low: it is intended to be a CRI API
change with minimal changes to the kubelet, and it is gated by the feature
gate `ContainerCheckpointRestore`.

## Design Details

The feature gate `ContainerCheckpointRestore` will ensure that the API
graduation can be done in the standard Kubernetes way.

A kubelet API to trigger the checkpointing of a container will be
introduced as described in [Implementation](#implementation).

Also see https://github.com/kubernetes/kubernetes/pull/104907 for details.

### Future Enhancements

The initial implementation is only about checkpointing specific containers
out of a pod. In future versions we probably want to support checkpointing
complete pods. To checkpoint a complete pod, the container engine would be
expected to do a pod-level cgroup freeze before checkpointing the containers
in the pod. This ensures that all containers are checkpointed at the same
point in time and that no container keeps running while other containers in
the pod are checkpointed.

One possible result of being able to checkpoint and restore containers and pods
might be the possibility to migrate containers and pods in the future as
discussed in [#3949](https://github.com/kubernetes/kubernetes/issues/3949).

### Test Plan

For alpha:
- Unit tests available

For beta:
- CRI API changes need to be implemented by at least one
container engine
- Enable e2e testing

### Graduation Criteria

#### Alpha

- [ ] Implement the new feature gate and kubelet implementation
- [ ] Ensure proper tests are in place
- [ ] Update documentation to make the feature visible

#### Alpha to Beta Graduation

At least one container engine has to have implemented the
corresponding CRI APIs to introduce e2e tests for checkpointing.

- [ ] Enable the feature by default
- [ ] No major bugs reported in the previous cycle

#### Beta to GA Graduation

TBD

### Upgrade / Downgrade Strategy

No changes are required on upgrade if the container engine supports
the corresponding CRI API changes.

CRIU needs to be installed, but on most distributions it is already
a dependency of runc/crun.

### Version Skew Strategy

There is no explicit version skew strategy required because the feature acts as
a toggle switch.

## Production Readiness Review Questionnaire

### Feature Enablement and Rollback

###### How can this feature be enabled / disabled in a live cluster?

The feature is enabled by setting the feature gate `ContainerCheckpointRestore`
on the kubelet and disabled by unsetting it.
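
As an illustration, the gate could be passed to the kubelet through its
`--feature-gates` flag. The snippet below assumes a kubeadm-managed node,
where an environment file defining `KUBELET_EXTRA_ARGS` is the usual place
for extra kubelet flags. Other setups pass the flag differently, and the file
location is an assumption, not something this KEP prescribes.

```shell
# Assumed location: /etc/default/kubelet (DEB) or /etc/sysconfig/kubelet (RPM)
# on a kubeadm-managed node; not part of this KEP.
KUBELET_EXTRA_ARGS="--feature-gates=ContainerCheckpointRestore=true"
```

The kubelet has to be restarted to pick up the change. Setting the value to
`false`, or removing the flag, disables the feature again.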

###### Does enabling the feature change any default behavior?

No.

###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

Yes, by disabling the feature gate `ContainerCheckpointRestore` again.

###### What happens if we reenable the feature if it was previously rolled back?

Checkpointing containers will be possible again.

###### Are there any tests for feature enablement/disablement?

Unit tests will temporarily enable the `ContainerCheckpointRestore` feature gate
to ensure that these tests always run.

## Implementation History

* 2020-09-16: Initial version of this KEP
* 2020-12-10: Opened pull request showing an end-to-end implementation of a possible use case
* 2021-02-12: Changed KEP to mention the *experimental* API as suggested in the SIG Node meeting 2021-02-09
* 2021-04-08: Added section about Pod Lifecycle, Checkpoint Storage, Alternatives and Hooks
* 2021-07-08: Reworked structure and added missing details
* 2021-08-03: Added the forensic user story and highlight the goal to implement it in small steps
* 2021-08-10: Added future work with information about pod level cgroup freezing
* 2021-09-15: Removed references to first proof of concept implementation
* 2021-09-21: Mention feature gate `ContainerCheckpointRestore`
* 2021-09-22: Removed everything which is not directly related to the forensic use case
* 2022-01-06: Reworked based on review

## Drawbacks

Not aware of any.

## Alternatives

Another possibility to use checkpoint/restore would be, for example, to trigger
the checkpoint from a privileged sidecar container (`CAP_SYS_ADMIN`) and to do
the restore through an init container.

The reason to integrate checkpoint/restore directly into Kubernetes, and not
to rely on helpers like sidecar and init containers, is that checkpointing has
been deeply integrated into multiple container runtimes and engines for many
years, and this integration has been reliable and well tested. Going another
way in Kubernetes would make the whole process much more complicated and
fragile. Using checkpoint and restore in Kubernetes without going through the
existing paths of runtimes and engines is not well understood and may not even
be possible, because checkpointing and restoring require much information that
is only available by working closely with runtimes and engines.
43 changes: 43 additions & 0 deletions keps/sig-node/2008-forensic-container-checkpointing/kep.yaml
@@ -0,0 +1,43 @@
title: Forensic Container Checkpointing
kep-number: 2008
authors:
- "@adrianreber"
owning-sig: sig-node
participating-sigs:
- TBD
status: implementable
creation-date: 2020-09-16
last-updated: 2022-01-17
reviewers:
- "@mrunalp"
- "@elfinhe"
approvers:
- "@dchen1107"
prr-approvers:
- "@ehashman"

# The target maturity stage in the current dev cycle for this KEP.
stage: alpha

# The most recent milestone for which work toward delivery of this KEP has been
# done. This can be the current (upcoming) milestone, if it is being actively
# worked on.
latest-milestone: "v1.24"

# The milestone at which this feature was, or is targeted to be, at each stage.
milestone:
  alpha: "v1.24"
  beta: "v1.25"
  stable: "v1.27"

# The following PRR answers are required at alpha release
# List the feature gate name and the components for which it must be enabled
feature-gates:
  - name: ContainerCheckpointRestore
    components:
      - kubelet
disable-supported: true

# The following PRR answers are required at beta release
metrics:
- "N/A"
