Add initial example with Kubernetes (#21)
* Add initial example with Kubernetes

This commit adds an example of container migration in a Kubernetes
cluster. The example consists of several components:

 - `http-server`: an HTTP server that responds with the hostname
   and IP address of the container it is currently running in.
 - `local-registry`: a set of scripts for setting up a local container
   registry that can be used to distribute container checkpoints
   across cluster nodes.
 - `build-image`: a script that builds an OCI container image from
   a checkpoint tar archive
 - `kubectl-plugin`: a plugin that adds a `checkpoint` command
   to kubectl
 - `manifests`: YAML files for deploying the HTTP server example and
   kube-router
 - `cni`: a CNI configuration for kube-router

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>

* local-registry: add predefined folders

The shell scripts for generating certificates and a password, and for
running a local registry, assume that the "auth/", "certs/", and "data/"
directories exist.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>

* local-registry: enable support for ubuntu/fedora

Installing a root CA certificate requires different commands on different
Linux distributions. This patch updates the script to support Ubuntu
in addition to Fedora/RHEL.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>

* Use fully-qualified container image references

It is highly recommended to always use fully-qualified image references
as short-name aliases can create ambiguity.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>

* readme: use sudo when running commands as root

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>

---------

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Co-authored-by: Stanislav Kosorin <stanokosorin4@gmail.com>
rst0git and stano45 authored Aug 16, 2024
1 parent 4846084 commit 998d1dc
Showing 20 changed files with 672 additions and 0 deletions.
102 changes: 102 additions & 0 deletions examples/container_migration_in_kubernetes/README.md
@@ -0,0 +1,102 @@
# Container migration in Kubernetes

In this example, an HTTP server runs in a Kubernetes Pod, and a BMv2 switch
is used for traffic load balancing: it dynamically reroutes packets to the
correct IP address after container migration.

This example assumes that the Kubernetes cluster has been configured with a
recent version of CRI-O that supports container checkpointing, and that the
Kubelet Checkpoint API has been enabled. To learn more about the container
checkpointing feature in Kubernetes, please refer to the following pages:

- https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/
- https://kubernetes.io/docs/reference/node/kubelet-checkpoint-api/
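
For orientation, the `kubectl-checkpoint` plugin added in this commit sends a
`POST` request to the kubelet on port 10250, using a path of the form
`/checkpoint/<namespace>/<pod>/<container>` with the namespace fixed to
`default`. The following is a minimal Python sketch of how that URL is put
together. The function name `checkpoint_url` and the pod name in the usage
line are hypothetical and not part of this example's code.

```python
from urllib.parse import quote


def checkpoint_url(pod: str, container: str, host: str = "localhost",
                   namespace: str = "default", port: int = 10250) -> str:
    """Build the kubelet checkpoint URL for one container.

    Hypothetical helper: it mirrors the URL used by the kubectl plugin
    in this example (port 10250, namespace "default").
    """
    # Percent-encode each path segment so unusual names cannot alter the path.
    segments = [quote(part, safe="") for part in (namespace, pod, container)]
    return f"https://{host}:{port}/checkpoint/" + "/".join(segments)


if __name__ == "__main__":
    # "http-server-abc12" is a made-up pod name used only for illustration.
    print(checkpoint_url("http-server-abc12", "http-server"))
```

The request itself still has to be authenticated, which the plugin does with
the kubelet client certificate; this sketch only covers the URL.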

## Running the example

1. Install CNI Plugins on each node

The CNI configuration file is expected at `/etc/cni/net.d/10-kuberouter.conf`:
```
sudo mkdir -p /etc/cni/net.d/
sudo cp cni/10-kuberouter.conf /etc/cni/net.d/
```

Install the `bridge` CNI plugin and the `host-local` IP address management plugin:

```
git clone https://github.com/containernetworking/plugins
cd plugins
git checkout v1.1.1
./build_linux.sh
sudo mkdir -p /opt/cni/bin
sudo cp bin/* /opt/cni/bin/
```

2. Deploy the kube-router DaemonSet
```
kubectl apply -f manifests/kube-router-daemonset.yaml
```

3. Set up a local container registry

```
cd local-registry/
./generate-password.sh <user>
./generate-certificates.sh <hostname>
./trust-certificates.sh
./run.sh
buildah login <hostname>:5000
```

4. Deploy an HTTP server

```
kubectl apply -f manifests/http-server-deployment.yaml
kubectl apply -f manifests/http-server-service.yaml
# Check the status of the deployment
kubectl get deployments
# Check the assigned IP address
kubectl get service http-server
```

5. Install the kubectl checkpoint plugin

```
sudo cp kubectl-plugin/kubectl-checkpoint /usr/local/bin/
```

6. Enable checkpoint/restore with established TCP connections
```
sudo mkdir -p /etc/criu/
echo "tcp-established" | sudo tee -a /etc/criu/runc.conf
```

7. Create a container checkpoint

```
kubectl checkpoint <pod> <container>
```

8. Build a checkpoint OCI image and push it to the registry

```
build-image/build-image.sh -a <annotations-file> -c <checkpoint-path> -i <hostname>:5000/<image>:<tag>
buildah push <hostname>:5000/<image>:<tag>
```

9. Restore the container from the checkpoint image

Replace the container `image` field in `http-server-deployment.yaml` with the
checkpoint OCI image `<hostname>:5000/<image>:<tag>` and apply the new deployment.

```
kubectl apply -f manifests/http-server-deployment.yaml
```
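
The `image` replacement in the last step can also be scripted. The following
is a minimal Python sketch that rewrites the `image:` value in a manifest
using plain text substitution. The function name `replace_image`, the sample
manifest, and the image names are all hypothetical; the real
`http-server-deployment.yaml` may be laid out differently, and the sketch
replaces the image of every container it finds.

```python
import re


def replace_image(manifest: str, new_image: str) -> str:
    """Return the manifest text with every container `image:` value replaced."""
    # Match an optional list dash, the `image:` key, and capture the prefix
    # so that the original indentation is preserved.
    pattern = re.compile(r"^([ \t]*(?:-[ \t]+)?image:[ \t]*)\S+", re.MULTILINE)
    # A function replacement keeps new_image from being interpreted as a
    # regex replacement template.
    return pattern.sub(lambda m: m.group(1) + new_image, manifest)


if __name__ == "__main__":
    # Hypothetical manifest fragment, used only for illustration.
    sample = (
        "spec:\n"
        "  containers:\n"
        "  - name: http-server\n"
        "    image: quay.io/example/http-server:latest\n"
        "    imagePullPolicy: Always\n"
    )
    print(replace_image(sample, "registry.example:5000/http-server:checkpoint"))
```

A YAML-aware tool is the more robust choice for real manifests; the regular
expression here only keeps the sketch free of third-party dependencies.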

@@ -0,0 +1 @@
io.kubernetes.cri-o.annotations.checkpoint.name=<container-name>
@@ -0,0 +1,78 @@
#!/bin/bash

set -euo pipefail

usage() {
    cat <<EOF
Usage: ${0##*/} -a ANNOTATIONS_FILE -c CHECKPOINT_PATH -i IMAGE_NAME
Create OCI image from a checkpoint tar file.
    -a path to the annotations file
    -c path to the checkpoint file
    -i name of the resulting image
EOF
    exit 1
}

annotationsFilePath=""
checkpointPath=""
imageName=""

while getopts ":a:c:i:" opt; do
    case ${opt} in
    a)
        annotationsFilePath=$OPTARG
        ;;
    c)
        checkpointPath=$OPTARG
        ;;
    i)
        imageName=$OPTARG
        ;;
    :)
        echo "Option -$OPTARG requires an argument."
        usage
        ;;
    \?)
        echo "Invalid option: -$OPTARG"
        usage
        ;;
    esac
done
shift $((OPTIND - 1))

if [[ -z $annotationsFilePath || -z $checkpointPath || -z $imageName ]]; then
    echo "All options (-a, -c, -i) are required."
    usage
fi

if ! command -v buildah &>/dev/null; then
    echo "buildah is not installed. Please install buildah before running this script."
    exit 1
fi

if [[ ! -f $annotationsFilePath ]]; then
    echo "Annotations file not found: $annotationsFilePath"
    exit 1
fi

if [[ ! -f $checkpointPath ]]; then
    echo "Checkpoint file not found: $checkpointPath"
    exit 1
fi

# Start from an empty image and add the checkpoint archive to it
newcontainer=$(buildah from scratch)

buildah add "$newcontainer" "$checkpointPath"

# Apply each "key=value" line as an image annotation. The extra test on
# $line also processes a final line that has no trailing newline.
while IFS= read -r line || [[ -n $line ]]; do
    # Skip blank lines
    if [[ -z $line ]]; then
        continue
    fi
    key=$(echo "$line" | cut -d '=' -f 1)
    value=$(echo "$line" | cut -d '=' -f 2-)
    buildah config --annotation "$key=$value" "$newcontainer"
done <"$annotationsFilePath"

buildah commit "$newcontainer" "$imageName"

buildah rm "$newcontainer"

echo "Checkpoint image created successfully: $imageName"
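
The annotation loop in the script above splits each line of the annotations
file at the first `=`, so a value may itself contain `=` characters. The
following minimal Python sketch shows the same splitting rule; the function
name `parse_annotations` is hypothetical and the sketch is not used by the
script. One difference, stated as an assumption of the sketch: a line without
any `=` yields an empty value here, whereas `cut` would repeat the whole line.

```python
def parse_annotations(text: str) -> dict[str, str]:
    """Parse `key=value` lines into a dictionary, splitting at the first `=`."""
    annotations = {}
    for line in text.splitlines():
        # Ignore blank lines, which carry no annotation.
        if not line.strip():
            continue
        # partition() splits once, so any later `=` stays in the value.
        key, _, value = line.partition("=")
        annotations[key] = value
    return annotations


if __name__ == "__main__":
    # "http-server" stands in for the <container-name> placeholder.
    sample = "io.kubernetes.cri-o.annotations.checkpoint.name=http-server\n"
    print(parse_annotations(sample))
```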
10 changes: 10 additions & 0 deletions examples/container_migration_in_kubernetes/cni/10-kuberouter.conf
@@ -0,0 +1,10 @@
{
    "cniVersion": "0.3.0",
    "name": "mynet",
    "type": "bridge",
    "bridge": "kube-bridge",
    "isDefaultGateway": true,
    "ipam": {
        "type": "host-local"
    }
}
@@ -0,0 +1,8 @@
FROM docker.io/library/python:3.11-slim

WORKDIR /app
COPY main.py /app/
EXPOSE 12345

CMD ["python3", "main.py"]

27 changes: 27 additions & 0 deletions examples/container_migration_in_kubernetes/http-server/README.md
@@ -0,0 +1,27 @@
# Simple HTTP Server with Request Counter

This Python program implements a simple HTTP server that keeps track of the
number of GET requests it has received. Each client receives an initial
response that includes the request count, server hostname, and IP address.
After the initial response, the server sends a dot (`.`) every second for 10
seconds.

## Usage

1. Run server

By default, the server listens on port 12345. To specify a different port, use
the `-p` option:

```
python3 main.py -p <port>
```

2. Send an HTTP request

You can use any HTTP client to make a GET request to the server. The following
script uses `curl` with the default port number:

```
./send_request.sh
```
118 changes: 118 additions & 0 deletions examples/container_migration_in_kubernetes/http-server/main.py
@@ -0,0 +1,118 @@
#!/usr/bin/env python3

import argparse
import logging
import socket
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# Initialize counter
counter = 0

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)


class SimpleHTTPRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        global counter
        # Increment the counter
        counter += 1
        # Log the current count
        logging.info(f"Request count: {counter}")

        # Determine the hostname and IP address
        hostname = socket.gethostname()
        ip_address = socket.gethostbyname(hostname)

        # Respond with the initial headers
        self.send_response(200)
        self.send_header("Content-type", "text/plain")
        # Indicate that the connection should be kept alive
        self.send_header("Connection", "keep-alive")
        # Use chunked transfer encoding
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()

        # Send the initial message with the counter, hostname, and IP address
        message_str = (
            f"Request [{counter}]: "
            f"Hostname: {hostname}, "
            f"IP Address: {ip_address}\n"
        )
        initial_message = message_str.encode('utf-8')

        self.send_chunk(initial_message)

        # Send a dot every second for 10 seconds
        self.send_dots_for_duration(10)

    def send_chunk(self, data):
        """Send a chunk of data."""
        try:
            self.wfile.write(f"{len(data):X}\r\n".encode('utf-8'))
            self.wfile.write(data)
            # End of the chunk
            self.wfile.write(b"\r\n")
            # Ensure the data is sent immediately
            self.wfile.flush()
        except ConnectionResetError as e:
            logging.warning("Connection reset by peer: %s", e)
            raise
        except BrokenPipeError as e:
            logging.warning("Error while sending data: %s", e)
            raise
        except Exception as e:
            logging.error("Unexpected error: %s", e)
            raise

    def send_dots_for_duration(self, duration):
        """Send a dot every second for the given duration."""
        end_time = time.time() + duration
        while time.time() < end_time:
            # Send a dot as a separate chunk
            dot = b"."
            try:
                self.send_chunk(dot)
            except (ConnectionResetError, BrokenPipeError) as e:
                logging.warning("Connection error: %s", e)
                break
            except Exception as e:
                logging.error("Unexpected error: %s", e)
                break
            # Wait for 1 second before sending another chunk
            time.sleep(1)
        else:
            # Send an empty chunk to indicate the end of the response
            try:
                self.send_chunk(b"")
            except (ConnectionResetError, BrokenPipeError) as e:
                logging.warning("Error while sending end of response: %s", e)
            except Exception as e:
                logging.error("Unexpected error: %s", e)


def run(
    server_class=HTTPServer,
    handler_class=SimpleHTTPRequestHandler,
    port=12345
):
    server_address = ('', port)
    httpd = server_class(server_address, handler_class)
    logging.info(f'Starting HTTP server on port {port}...')
    httpd.serve_forever()


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="A simple HTTP server with a request counter"
    )
    parser.add_argument(
        '-p', '--port', type=int, default=12345,
        help='Port number to run the server on (default: 12345)'
    )
    args = parser.parse_args()
    run(port=args.port)
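
The `send_chunk` method above frames each write as a hexadecimal length line,
the data, and a trailing CRLF, with an empty chunk marking the end of the
response. The following minimal Python sketch shows that framing and how a
client could undo it. The helper names `encode_chunk` and `decode_chunked`
are hypothetical and not part of `main.py`, and the decoder assumes
well-formed input without chunk extensions or trailers.

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame data as one chunk: hex length, CRLF, data, CRLF."""
    return f"{len(data):X}\r\n".encode("utf-8") + data + b"\r\n"


def decode_chunked(raw: bytes) -> bytes:
    """Decode a complete chunked body into its payload bytes."""
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with its size in hexadecimal, followed by CRLF.
        line_end = raw.index(b"\r\n", pos)
        size = int(raw[pos:line_end], 16)
        pos = line_end + 2
        if size == 0:
            # A zero-length chunk marks the end of the body.
            break
        body += raw[pos:pos + size]
        # Skip the chunk data and the CRLF that follows it.
        pos += size + 2
    return bytes(body)


if __name__ == "__main__":
    # An initial message, two dots, and the terminating empty chunk.
    raw = (encode_chunk(b"Request [1]\n") + encode_chunk(b".")
           + encode_chunk(b".") + encode_chunk(b""))
    print(decode_chunked(raw))
```

HTTP clients such as `curl` perform this decoding themselves, so the sketch is
only meant to make the wire format visible.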
@@ -0,0 +1,6 @@
#!/usr/bin/sh

# The -N option ensures that curl does not buffer the output and
# you should see the data as it is received from the server in real-time.
curl -N http://localhost:12345

@@ -0,0 +1,21 @@
#!/bin/bash

HOST="$1"
POD="$2"
CTR="$3"

# With two arguments, the checkpoint request is sent to the local kubelet
if [ "$#" -eq 2 ]; then
    HOST=localhost
    POD="$1"
    CTR="$2"
elif [ "$#" -ne 3 ]; then
    echo "Usage: $(basename "$0") [<host>] <pod> <container>"
    exit 1
fi

curl --insecure \
    --cert /var/lib/kubelet/pki/kubelet-client-current.pem \
    --key /var/lib/kubelet/pki/kubelet-client-current.pem \
    -X POST \
    "https://${HOST}:10250/checkpoint/default/${POD}/${CTR}"

@@ -0,0 +1,3 @@
auth/*
certs/*
data/*
Empty file.
Empty file.
Empty file.