Ks fyp v2 #62

Open
wants to merge 53 commits into
base: main
Commits
bc4bcdd
Manually deploy k8s cluster (#4)
csegarragonz Sep 15, 2023
48ec473
Code formatting checks (#5)
csegarragonz Sep 15, 2023
5df4d73
Install operator and CC-runtime (#6)
csegarragonz Sep 18, 2023
8421dec
Kata Hello World (#7)
csegarragonz Sep 19, 2023
0f8b6f1
CoCo Hello World (#8)
csegarragonz Sep 19, 2023
2a83bfe
Knative Hello World (#9)
csegarragonz Sep 20, 2023
40cca98
Standalone Installation (#11)
csegarragonz Sep 25, 2023
81614ab
Knative on CoCo (#12)
csegarragonz Sep 28, 2023
5345abc
Add task to calculate launch digest (#15)
csegarragonz Sep 29, 2023
397f808
Guest FW Attestation (#16)
csegarragonz Oct 6, 2023
41ba875
Update docs (#18)
csegarragonz Oct 9, 2023
c4ff04a
Sign (and verify) container images (#19)
csegarragonz Oct 9, 2023
d91e00d
Use encrypted container images (#20)
csegarragonz Oct 10, 2023
80cf84d
Plots: start-up latency (#22)
csegarragonz Oct 13, 2023
ab3378e
Include Image Pulling Time (#23)
csegarragonz Oct 16, 2023
a29e206
Include further events in breakdown costs (#24)
csegarragonz Oct 17, 2023
0536e75
Fix Kata baseline (#26)
csegarragonz Oct 18, 2023
bbda53a
Throughput-Latency Plot (#30)
csegarragonz Oct 19, 2023
3609535
Memory Size Plot (#31)
csegarragonz Oct 19, 2023
ace873a
Add attestation docs (#32)
csegarragonz Oct 19, 2023
ef6c867
VM Start Up Breakdown (#33)
csegarragonz Oct 27, 2023
e1ca652
Breakdown Image Pulling Costs (#36)
csegarragonz Oct 30, 2023
8ca8b29
Breakdown the Instantiation Throughput Costs (#41)
csegarragonz Nov 6, 2023
1a41da1
eval: add real-kata baseline (#43)
csegarragonz Nov 7, 2023
b5bfe6d
eval: stabilise memory size plot and add more data points (#45)
csegarragonz Nov 7, 2023
b67f960
eval(vm-detail): add further detail for ovmf plots (#46)
csegarragonz Nov 13, 2023
12f939f
eval(initrd-size): add plot to measure the impact of the initrd size …
csegarragonz Nov 13, 2023
0a2c199
eval(ovmf-detail): compare ovmf start-up latency on a per-task basis …
csegarragonz Nov 15, 2023
7fe3987
eval: add task to prune running pods (#54)
csegarragonz Nov 15, 2023
c9b8ea6
qemu: document --datadir flag and make it an optional parameter (#55)
csegarragonz Nov 15, 2023
d4e6ee2
containerd: make build and cli self-containerd (#56)
csegarragonz Nov 15, 2023
e202395
kata: make build and work-on self-contained (#58)
csegarragonz Nov 15, 2023
1641437
kbs: track as submodule and add docs (#59)
csegarragonz Nov 16, 2023
5849898
gc: add method and docs (#60)
csegarragonz Nov 16, 2023
134423e
task bug fixes
csegarragonz Feb 16, 2024
8e0de2f
bug fixes
csegarragonz Feb 16, 2024
f06d78a
nydus experiment
konsougiou Apr 5, 2024
7d01f8d
xput and external registry
konsougiou Apr 17, 2024
81d8f5d
experiment changes and task changes
konsougiou Jun 6, 2024
84316f8
gitgnore
konsougiou Jun 6, 2024
5bb96e3
gitgnore
konsougiou Jun 6, 2024
e20e06c
Stop tracking apps/tensorflow-serving
konsougiou Jun 6, 2024
240e865
plots
konsougiou Jun 16, 2024
0528876
remove redundent
konsougiou Jun 16, 2024
78d550a
Track nydus-image.tar with Git LFS
konsougiou Jun 19, 2024
809726e
startup
konsougiou Jun 19, 2024
ed57d32
eval fixes
konsougiou Jun 19, 2024
666f132
add deps install
konsougiou Jun 20, 2024
11652ad
readme
konsougiou Jun 20, 2024
51f8c8e
fix
konsougiou Jun 20, 2024
7348133
Update README.md
konsougiou Jun 20, 2024
99fcd88
Update README.md
konsougiou Jun 20, 2024
7414977
Update README.md
konsougiou Jun 21, 2024
1 change: 1 addition & 0 deletions .gitattributes
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
nydus-image.tar filter=lfs diff=lfs merge=lfs -text
19 changes: 19 additions & 0 deletions .github/workflows/tests.yml
@@ -0,0 +1,19 @@
name: "Integration tests"

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
    types: [opened, synchronize, reopened, ready_for_review]

jobs:
  python-format:
    if: github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    steps:
      - name: "Checkout code"
        uses: actions/checkout@v3
      # Formatting checks
      - name: "Code formatting check"
        run: ./bin/inv_wrapper.sh format-code
14 changes: 14 additions & 0 deletions .gitignore
@@ -1,6 +1,20 @@
# Installed binaries
cosign
crictl
kubeadm
kubectl
kubelet
k9s

# Kubernetes stuff
.config

# Python-related stuff
venv

# Templated files
templated

# app models
/apps/tensorflow-serving/
/apps/tensorflow-serving/*
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "components/simple-kbs"]
path = components/simple-kbs
url = https://github.com/csegarragonz/simple-kbs.git
143 changes: 142 additions & 1 deletion README.md
@@ -1,5 +1,146 @@
# CoCo Serverless

The goal of this project is to run Knative using confidential containers instead of plain Docker containers.
This project is an extension of the [coco-serverless](https://github.com/coco-serverless/coco-serverless/edit/main/README.md) repository. The original repository aims to deploy [Knative](https://knative.dev/docs/) on [CoCo](https://github.com/confidential-containers) and run some baseline benchmarks. This project extends that functionality by introducing our custom CoCo implementation with an improved image-pulling mechanism (CoCo-Hybrid). This repository hosts benchmarks for CoCo-Hybrid, providing a means to compare it against the previously established baselines.

Our CoCo-Hybrid mode makes adjustments to several of the CoCo components. The adjusted components are found in the following branches of our forked repositories:
* [Nydus-snapshotter](https://github.com/konsougiou/nydus-snapshotter/tree/ks-main-0.13.3)
* [kata-containers](https://github.com/coco-serverless/kata-containers/tree/ks-prod)
* [guest-components](https://github.com/coco-serverless/guest-components/tree/KS-prod)
* [nydus](https://github.com/konsougiou/nydus/tree/ks-prod)

All instructions in this repository assume that you have checked-out the source
code, and have activated the python virtual environment:

```bash
source ./bin/workon.sh

# List available tasks
inv -l
```

## Pre-Requisites

You will need CoCo's fork of containerd built and running. To this end, you
may run:

```bash
inv containerd.build
inv containerd.install
```

You also need all the kubernetes-related tooling: `kubectl`, `kubeadm`, and
`kubelet`:

```bash
inv k8s.install [--clean]
```

You may also want to install `k9s`, a kubernetes monitoring tool:

```bash
inv k9s.install-k9s
```

## Quick Start

Deploy a (single-node) kubernetes cluster using `kubeadm`:

```bash
inv kubeadm.create
```

Second, install both the operator and the CC runtime from the upstream tag.
We currently pin to version `v0.7.0` (see the [`COCO_RELEASE_VERSION` variable](
https://github.com/csegarragonz/coco-serverless/tree/main/tasks/util/env.py)).

```bash
inv operator.install
inv operator.install-cc-runtime
```

Third, update the `initrd` file to include our patched `kata-agent`:

```bash
inv kata.replace-agent
```

If this is the first time, you will have to manually build the agent following
[these instructions](./docs/kata.md#replacing-the-kata-agent).

Then, you are ready to run one of the supported apps:
* [Hello World! (Py)](./docs/helloworld_py.md) - simple HTTP server running in Python to test CoCo and Kata.
* [Hello World! (Knative)](./docs/helloworld_knative.md) - same app as before, but invoked over Knative.
* [Hello Attested World! (Knative + Attestation)](./docs/helloworld_knative_attestation.md) - same setting as the Knative hello world, but with varying levels of attestation configured.

If your app uses Knative, you will have to install it first:

```bash
inv knative.install
```
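
The quick-start order above can be sketched as a small dry-run helper. This is an illustrative sketch, not part of the repository: `quick_start_plan` and `run_plan` are hypothetical names, and only the `inv` task names come from the steps above.

```python
import shlex
import subprocess

# Task names come from the quick-start steps; the helper itself is hypothetical.
QUICK_START_STEPS = [
    ["inv", "kubeadm.create"],
    ["inv", "operator.install"],
    ["inv", "operator.install-cc-runtime"],
    ["inv", "kata.replace-agent"],
]


def quick_start_plan(with_knative=False):
    """Return the quick-start commands in the order they are listed above."""
    steps = [list(step) for step in QUICK_START_STEPS]
    if with_knative:
        # Knative is only needed by the apps that are invoked over it.
        steps.append(["inv", "knative.install"])
    return steps


def run_plan(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for step in steps:
        print(shlex.join(step))
        if not dry_run:
            # check=True stops at the first failing task.
            subprocess.run(step, check=True)


if __name__ == "__main__":
    run_plan(quick_start_plan(with_knative=True))
```

By default the helper only prints the commands, so the order can be reviewed before anything touches the cluster.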

## Setting Up CoCo-Hybrid

In order to enable the CoCo-Hybrid mode, the following configuration steps need to be taken:

Our customised nydus-snapshotter binary, Linux kernel and VM initrd image are necessary. These can be installed using the following command:

```bash
inv hybrid.install-cc-hybrid-deps
```

The Kata configs can then be adjusted to point to the new kernel and initrd using the following command:

```bash
inv hybrid.update-configs
```

Additionally, in order to configure the snapshotter to operate in our hybrid mode, the following commands should be run:

```bash
inv nydus-snapshotter.populate-host-sharing-config
inv nydus-snapshotter.toggle-mode --hybrid
```

Finally, the private Nydus image and the public blob-cache image can be generated using the command:

```bash
inv hybrid.generate-images --image-name your_image_name --workdir /path/to/workdir
```

This assumes that the workdir contains the Dockerfile for the whole image, and that the `public/` directory contains the Dockerfile that includes the public layers.

The script will push two images to the external registry, one with tag `unencrypted-nydus`, and the other with tag `blob-cache`.
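
As a rough illustration of the image-generation step, the sketch below builds the command line shown above and the two references one would expect to find in the registry afterwards. The `<registry>/<name>:<tag>` layout and the function names are assumptions made for this example; only the task name, its two flags and the two tags come from the text above.

```python
import shlex

# The two tags named above; everything else in this sketch is hypothetical.
HYBRID_TAGS = ("unencrypted-nydus", "blob-cache")


def generate_images_command(image_name, workdir):
    """Build the `inv hybrid.generate-images` invocation as a shell string."""
    return shlex.join([
        "inv", "hybrid.generate-images",
        "--image-name", image_name,
        "--workdir", workdir,
    ])


def expected_image_refs(registry, image_name):
    """Assumes references follow the common <registry>/<name>:<tag> layout."""
    return [f"{registry}/{image_name}:{tag}" for tag in HYBRID_TAGS]


if __name__ == "__main__":
    print(generate_images_command("my-app", "/path/to/workdir"))
    for ref in expected_image_refs("registry.example.com", "my-app"):
        print(ref)
```

`shlex.join` quotes arguments that contain spaces, which keeps a workdir path with spaces in it intact when the string is pasted into a shell.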

## Evaluation

The goal of the project is to measure the performance of Knative with CoCo,
and compare it to other isolation mechanisms using standardised benchmarks. To
this end, we provide a thorough evaluation in the [evaluation](./eval)
directory.

## Uninstall

In order to uninstall components for debugging purposes, you may un-install the CoCo runtime, and then the operator as follows:

```bash
inv operator.uninstall-cc-runtime
inv operator.uninstall
```

Lastly, you can completely remove the `k8s` cluster by running:

```bash
inv kubeadm.destroy
```

## Further Reading

For further documentation, you may want to check these other documents:
* [Attestation](./docs/attestation.md) - attestation particularities of CoCo and SEV(-ES).
* [Guest Components](./docs/guest_components.md) - patch `image-rs` or other guest components.
* [K8s](./docs/k8s.md) - documentation about configuring a single-node Kubernetes cluster.
* [Kata](./docs/kata.md) - instructions to build our custom Kata fork and `initrd` images.
* [Key Broker Service](./docs/kbs.md) - docs on using and patching the KBS.
* [Knative](./docs/knative.md) - documentation about Knative, our serverless runtime of choice.
* [Local Registry](./docs/registry.md) - configuring a local registry to store OCI images.
* [OVMF](./docs/ovmf.md) - notes on building OVMF and CoCo's OVMF boot process.
* [SEV](./docs/sev.md) - specific documentation to get the project working with AMD SEV machines.
* [Troubleshooting](./docs/troubleshooting.md) - tips to debug when things go sideways.
20 changes: 20 additions & 0 deletions apps/fio/Dockerfile
@@ -0,0 +1,20 @@

FROM alpine:latest

RUN apk update && apk add fio nano

COPY benchmark.sh .
COPY file_gen_?.sh .

RUN chmod +x benchmark.sh && chmod +x file_gen_1.sh && chmod +x file_gen_2.sh

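# Run the scripts with /bin/sh explicitly: bash is not among the packages
# installed above, so the scripts' #!/bin/bash shebang is not relied on.
RUN sh ./file_gen_1.sh

RUN sh ./file_gen_2.sh

RUN sh ./benchmark.sh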

#CMD find / \( -path /proc -o -path /sys \) -prune -o -type f -exec grep "test" {} +
#CMD tail -f /dev/null
#CMD while true; do sleep 3600; done
CMD fio fio_read_jobfile.fio
42 changes: 42 additions & 0 deletions apps/fio/benchmark.sh
@@ -0,0 +1,42 @@
#!/bin/bash

fio_job_file="fio_read_jobfile.fio"
target_file_name_1="random_75mb_file_1"
target_file_name_2="random_75mb_file_2"

echo "" > "$fio_job_file"

for dir in $(find / -maxdepth 1 -type d ! -name "proc" ! -name "sys" ! -name "tmp" ! -path "/")
do
    # Emit one sequential-read job for each target file present in this directory
    for target in "$target_file_name_1" "$target_file_name_2"
    do
        file_path="$dir/$target"
        if [ -f "$file_path" ]; then
            cat >> "$fio_job_file" <<EOF
[job_$dir]
filename=$file_path
rw=read
ioengine=libaio
iodepth=1
size=75m
direct=0
thinktime=200
blocksize=4k
numjobs=1
group_reporting
EOF
        fi
    done
done

echo "FIO job file creation complete."
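
For reference, the shape of one generated job section can be sketched in Python. This is an illustrative re-expression of what `benchmark.sh` writes, not code from the PR; `render_job` and `FIO_OPTIONS` are hypothetical names, while the option values are copied from the script.

```python
# Option values copied from benchmark.sh; the helper is a hypothetical sketch.
FIO_OPTIONS = {
    "rw": "read",
    "ioengine": "libaio",
    "iodepth": "1",
    "size": "75m",
    "direct": "0",
    "thinktime": "200",
    "blocksize": "4k",
    "numjobs": "1",
}


def render_job(directory, file_name):
    """Render one job section in the same order benchmark.sh echoes it."""
    lines = [f"[job_{directory}]", f"filename={directory}/{file_name}"]
    lines += [f"{key}={value}" for key, value in FIO_OPTIONS.items()]
    # group_reporting is a flag, so it is written without a value.
    lines.append("group_reporting")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_job("/usr", "random_75mb_file_1"), end="")
```

Note that the script names a job after its directory only, so the two files in one directory produce two sections with the same `[job_<dir>]` header.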
27 changes: 27 additions & 0 deletions apps/fio/deployment.yaml
@@ -0,0 +1,27 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coco-fio-benchmark
  labels:
    apps.coco-serverless/name: fio-benchmark
  annotations:
    io.containerd.cri.runtime-handler: kata-qemu-sev
spec:
  replicas: 1
  selector:
    matchLabels:
      apps.coco-serverless/name: fio-benchmark
  template:
    metadata:
      labels:
        apps.coco-serverless/name: fio-benchmark
        io.katacontainers.config.pre_attestation.enabled: "false"
    spec:
      runtimeClassName: kata-qemu-sev
      containers:
      - name: fio-benchmark
        image: registry.coco-csg.com/fio-benchmark-nydus:unencrypted #registry.coco-csg.com/fio-benchmark-nydus:unencrypted #registry.coco-csg.com/openjdk:unencrypted #localhost:5000/hello-world-flask-nydus:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
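
One property of this manifest that is easy to break when editing it is that the Deployment's selector must match the labels on its pod template. A minimal sketch of that check follows; `FIO_DEPLOYMENT` is a hand-written dictionary mirroring the relevant fields above and `selector_matches_template` is a hypothetical helper, neither is part of the PR.

```python
# Hand-written mirror of the relevant manifest fields (illustrative only).
FIO_DEPLOYMENT = {
    "spec": {
        "selector": {
            "matchLabels": {"apps.coco-serverless/name": "fio-benchmark"},
        },
        "template": {
            "metadata": {
                "labels": {
                    "apps.coco-serverless/name": "fio-benchmark",
                    "io.katacontainers.config.pre_attestation.enabled": "false",
                },
            },
            "spec": {"runtimeClassName": "kata-qemu-sev"},
        },
    },
}


def selector_matches_template(deployment):
    """True when every selector label also appears on the pod template."""
    selector = deployment["spec"]["selector"].get("matchLabels", {})
    labels = deployment["spec"]["template"]["metadata"].get("labels", {})
    return all(labels.get(key) == value for key, value in selector.items())


if __name__ == "__main__":
    print(selector_matches_template(FIO_DEPLOYMENT))
```

The template may carry extra labels, as it does here; only the labels named in the selector have to be present with the same values.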

18 changes: 18 additions & 0 deletions apps/fio/file_gen_1.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
#!/bin/bash

file_size_mb=75

block_count=$file_size_mb

for dir in $(find / -maxdepth 1 -type d ! -name "proc" ! -name "sys" ! -name "tmp" ! -name "dev" ! -name "run" ! -path "/")
do
    if [ -w "$dir" ]; then
        file_path="$dir/random_75mb_file_1"
        echo "Creating a 75MB random file in $dir"
        dd if=/dev/urandom of="$file_path" bs=1M count=$block_count status=none
    else
        echo "Skipping $dir, no write permission."
    fi
done

echo "File creation complete."
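
The `dd` call in this script writes `count` blocks of `bs` random bytes. A rough Python equivalent is sketched below for readers less familiar with `dd`; `write_random_file` is a hypothetical helper and is not used anywhere in the PR.

```python
import os
import tempfile


def write_random_file(path, block_count, block_bytes=1024 * 1024):
    """Write block_count blocks of block_bytes random bytes, like dd bs/count."""
    with open(path, "wb") as handle:
        for _ in range(block_count):
            # os.urandom reads from the operating system's random source.
            handle.write(os.urandom(block_bytes))
    return os.path.getsize(path)


if __name__ == "__main__":
    # Small demo in a temporary directory instead of the top-level directories.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "random_75mb_file_1")
        print(write_random_file(target, block_count=2))
```

With the default block size, `block_count=75` corresponds to the 75 MB files the script creates.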
18 changes: 18 additions & 0 deletions apps/fio/file_gen_2.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
#!/bin/bash

file_size_mb=75

block_count=$file_size_mb

for dir in $(find / -maxdepth 1 -type d ! -name "proc" ! -name "sys" ! -name "tmp" ! -name "dev" ! -name "run" ! -path "/")
do
    if [ -w "$dir" ]; then
        file_path="$dir/random_75mb_file_2"
        echo "Creating a 75MB random file in $dir"
        dd if=/dev/urandom of="$file_path" bs=1M count=$block_count status=none
    else
        echo "Skipping $dir, no write permission."
    fi
done

echo "File creation complete."
26 changes: 26 additions & 0 deletions apps/fio/fio-job-file.fio
@@ -0,0 +1,26 @@
; fio-job-file.fio
[global]
rw=randrw ; Random read and write
rwmixread=70 ; 70% read, 30% write
bs=4k ; Block size set to 4KB
size=150M ; Each file is 150MB
runtime=180 ; Limit total runtime to 3 minutes
direct=1 ; Use direct I/O
ioengine=libaio ; Asynchronous I/O
iodepth=4 ; Queue depth per job

[file1]
filename=file1.fio
stonewall

[file2]
filename=file2.fio
stonewall

[file3]
filename=file3.fio
stonewall

[file4]
filename=file4.fio
stonewall
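
To see how the `[global]` options combine with each job section, the job file can be read with Python's `configparser`. This is only a sketch: `JOB_FILE` is a trimmed copy of the file above, the function names are hypothetical, and the runtime estimate assumes that `runtime` caps each job separately and that `stonewall` makes the jobs run one after another.

```python
import configparser

# Trimmed copy of the job file above, for illustration.
JOB_FILE = """\
[global]
rw=randrw      ; Random read and write
size=150M
runtime=180

[file1]
filename=file1.fio
stonewall

[file2]
filename=file2.fio
stonewall
"""


def effective_jobs(text):
    """Merge the [global] options into every other section."""
    parser = configparser.ConfigParser(
        allow_no_value=True,            # flags such as `stonewall` have no value
        inline_comment_prefixes=(";",),
        interpolation=None,
    )
    parser.read_string(text)
    defaults = dict(parser["global"]) if parser.has_section("global") else {}
    jobs = {}
    for name in parser.sections():
        if name != "global":
            jobs[name] = {**defaults, **dict(parser[name])}
    return jobs


def worst_case_runtime_seconds(jobs):
    """Sum per-job runtime caps; assumes plain seconds and serialised jobs."""
    return sum(int(job["runtime"]) for job in jobs.values() if "runtime" in job)


if __name__ == "__main__":
    jobs = effective_jobs(JOB_FILE)
    print(sorted(jobs))
    print(worst_case_runtime_seconds(jobs))
```

Under those assumptions the four jobs in the full file could take up to four times the 180-second cap in total, although each job also ends once it has processed its 150 MB.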