docs: Add some description of container storage #503

Merged: 3 commits, May 2, 2024
6 changes: 6 additions & 0 deletions docs/book.toml
@@ -4,3 +4,9 @@ language = "en"
multilingual = false
src = "src"
title = "bootc"

[preprocessor.mermaid]
command = "mdbook-mermaid"

[output.html]
additional-js = ["mermaid.min.js", "mermaid-init.js"]
1 change: 1 addition & 0 deletions docs/src/SUMMARY.md
@@ -39,6 +39,7 @@
- [Image layout](bootc-images.md)
- [Filesystem](filesystem.md)
- [Filesystem: sysroot](filesystem-sysroot.md)
- [Container storage](filesystem-storage.md)

# More information

90 changes: 90 additions & 0 deletions docs/src/filesystem-storage.md
@@ -0,0 +1,90 @@
# Container storage

The bootc project uses [ostree](https://github.com/ostreedev/ostree/) and, specifically,
the [ostree-rs-ext](https://github.com/ostreedev/ostree-rs-ext/) Rust library,
which handles storage of container images on top of an ostree-based system.

## Architecture

```mermaid
flowchart TD
bootc --- ostree-rs-ext --- ostree-rs --- ostree
ostree-rs-ext --- containers-image-proxy-rs --- skopeo --- containers/image
```

Two high-level goals drove the design of the current system
architecture:

- Support seamless in-place migrations from existing ostree systems
- Avoid requiring deep changes to the podman stack

A simple way to explain the current architecture is that podman uses
two Go libraries:

- <https://github.com/containers/image>
- <https://github.com/containers/storage>

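bootc, by contrast, reaches `containers/image` only indirectly (through skopeo,
as shown in the diagram above) and stores images in ostree's own
container storage rather than in `containers/storage`.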

## Mapping container images to ostree

[OCI images](https://github.com/opencontainers/image-spec) are effectively
just a standardized format of tarballs wrapped with JSON - specifically
"layers" of tarballs.

The ostree-rs-ext project maps layers to OSTree commits. Each layer
is stored separately, under an ostree "ref" (like a git branch)
under the `ostree/container/` namespace:

```
$ ostree refs ostree/container
```

### Layers

The `ostree/container/blob` namespace tracks storage of a container layer
identified by its blob ID (sha256 digest).
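
As a rough illustration of how a layer digest could map onto a ref name,
here is a minimal sketch. The helper names (`escape_ref_component`,
`blob_ref_for_digest`) are hypothetical, and the `_XX_` hex escaping of
characters such as `:` is an assumption about the scheme; the
authoritative logic lives in ostree-rs-ext and may differ in detail.

```python
import re

BLOB_PREFIX = "ostree/container/blob"  # namespace named in the text above


def escape_ref_component(value: str) -> str:
    """Escape non-alphanumeric characters as _XX_ (hex). Assumed scheme."""
    return re.sub(
        r"[^A-Za-z0-9]",
        lambda m: "_{:02X}_".format(ord(m.group(0))),
        value,
    )


def blob_ref_for_digest(digest: str) -> str:
    """Hypothetical helper: build the ref under which a layer would be stored."""
    algo, _, hexpart = digest.partition(":")
    if algo != "sha256" or not re.fullmatch(r"[0-9a-f]{64}", hexpart):
        raise ValueError(f"not a sha256 digest: {digest!r}")
    return f"{BLOB_PREFIX}/{escape_ref_component(digest)}"
```

The point of the sketch is only that each layer gets exactly one ref,
derived deterministically from its digest, so a layer shared by several
images is stored once.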

### Images

At the current time, ostree always boots into a "flattened" filesystem
tree. This is generated as both a hardlinked checkout and
a composefs image.

The flattened tree is constructed and committed into the
`ostree/container/image` namespace. The commit metadata also includes
the OCI manifest and config objects.
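
To make "flattened" concrete, the following is a conceptual sketch of
merging layers in order, using plain dictionaries in place of real
filesystem trees. It is not the ostree-rs-ext implementation: the
function name `flatten_layers` is hypothetical, only the `.wh.` whiteout
convention from the OCI image specification is modeled, and opaque
whiteouts are not handled.

```python
WHITEOUT_PREFIX = ".wh."  # OCI marker: "<dir>/.wh.<name>" deletes "<dir>/<name>"


def flatten_layers(layers):
    """Merge layers (dicts of path -> content) in order into a single tree."""
    tree = {}
    for layer in layers:
        # Whiteouts only hide content from lower layers, so handle them first.
        for path in layer:
            parent, sep, name = path.rpartition("/")
            if not name.startswith(WHITEOUT_PREFIX):
                continue
            removed = f"{parent}{sep}{name[len(WHITEOUT_PREFIX):]}"
            for existing in list(tree):
                # Remove the path itself and anything beneath it.
                if existing == removed or existing.startswith(removed + "/"):
                    del tree[existing]
        # Then add or override regular entries from this layer.
        for path, content in layer.items():
            if not path.rpartition("/")[2].startswith(WHITEOUT_PREFIX):
                tree[path] = content
    return tree
```

For example, a base layer containing `/etc/conf` and a derived layer that
replaces it yield a single tree holding only the derived version, which is
the shape of what gets committed under `ostree/container/image`.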

This is implemented in the [ostree-rs-ext/container module](https://docs.rs/ostree-ext/latest/ostree_ext/container/index.html).

### SELinux labeling

A major wrinkle is supporting SELinux labeling. The labeling configuration
is defined as regular expressions included in `/etc/selinux/$policy/contexts/`.

The current implementation relies on the fact that SELinux labels for
base images were pre-computed. The first step is to check out the "ostree base"
layers for the base image.

All derived layers have labels computed from the base image policy. This
causes a known bug where derived layers can't include custom policy:
<https://github.com/ostreedev/ostree-rs-ext/issues/510>
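
The limitation can be pictured with a toy model of regex-based labeling.
Everything here is a simplification for illustration: the policy entries,
the "last match wins" rule, and the helper names are assumptions, and real
SELinux file context matching is more involved.

```python
import re

# Toy stand-in for the base image policy: (path regex, label) pairs.
BASE_POLICY = [
    (r"/.*", "system_u:object_r:default_t:s0"),
    (r"/usr(/.*)?", "system_u:object_r:usr_t:s0"),
    (r"/etc(/.*)?", "system_u:object_r:etc_t:s0"),
]


def label_for(path, policy):
    """Return the label of the last policy entry whose regex matches path."""
    label = None
    for pattern, candidate in policy:
        if re.fullmatch(pattern, path):
            label = candidate
    return label


def label_derived_layer(paths, base_policy):
    """Label a derived layer's files using only the base image policy."""
    return {path: label_for(path, base_policy) for path in paths}
```

In this model, a derived layer that ships `/opt/app/bin` together with its
own rule for `/opt/app` still gets the generic label, because only
`BASE_POLICY` is consulted. That mirrors the behavior described in the
linked issue.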

### Origin files

ostree has the concept of an `origin` file which defines the source
of truth for upgrades. The container image reference for each deployment
is included in its origin.
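
As a sketch of how tooling might read that reference back, the snippet
below parses a keyfile-style origin. The `[origin]` group, the
`container-image-reference` key, and the example value are assumptions
made for illustration; check a real deployment's origin file for the
exact names.

```python
import configparser

# Hypothetical origin file contents; the key name and value are assumptions.
EXAMPLE_ORIGIN = """\
[origin]
container-image-reference=ostree-unverified-registry:quay.io/example/os:latest
"""


def container_image_reference(origin_text):
    """Return the image reference recorded in an origin file, or None."""
    # Only "=" is a delimiter, so ":" inside the value is left untouched.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(origin_text)
    return parser.get("origin", "container-image-reference", fallback=None)
```

An upgrade would then start from whatever
`container_image_reference(EXAMPLE_ORIGIN)` returns, rather than from any
state kept outside the deployment.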

## Booting

A core aspect of this entire design is that once a container image is
fetched into the ostree storage, from there on it just appears as
an "ostree commit", and so all code built on top can work with it.

For example, the `ostree-prepare-root.service` which runs in
the initramfs is currently agnostic to whether the filesystem tree originated
from an OCI image or some other mechanism; it just targets a
prepared flattened filesystem tree.

This prepared tree is what the `ostree=` kernel command-line argument references.
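
For illustration, extracting that argument from a kernel command line can
be sketched as follows. The function name is hypothetical and the example
path in the usage note is made up; quoting rules for kernel arguments are
ignored.

```python
def ostree_karg(cmdline):
    """Return the value of the first ostree= argument, or None if absent."""
    for arg in cmdline.split():
        key, sep, value = arg.partition("=")
        if key == "ostree" and sep:
            return value
    return None
```

Given `"root=UUID=abcd rw ostree=/ostree/boot.1/default/0123abcd/0"`, the
function returns the path after `ostree=`; nothing in that value says
whether the tree came from an OCI image.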