README-internals.md

CoreOS Internals docs

This document is a dumping ground for brief descriptions of problem domains we've hit while building, delivering, and testing CoreOS-style systems.

Other important links:

Initramfs

This topic is big enough to have its own document: README-initramfs.md.

CPU microcode

rpm-ostree runs dracut on the server side, and dracut knows how to pick up CPU microcode and prepend it to the initramfs. Relevant bugs:

Entropy

We recently started enabling CONFIG_RANDOM_TRUST_CPU, which covers modern x86_64 systems, for example.
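To check whether a given kernel build has this option turned on, something like the following sketch may help. The config path `/usr/lib/modules/$(uname -r)/config` and the helper name `kconfig_enabled` are assumptions for illustration. Newer kernels may not expose this option at all, in which case the check reports it as not set.

```shell
#!/bin/sh
# Return 0 if the given kernel config file sets the option to "y".
# $1: option name, $2: path to a kernel config file.
kconfig_enabled() {
    grep -q "^$1=y\$" "$2"
}

# Assumed config location; adjust for your system.
config="/usr/lib/modules/$(uname -r)/config"
if [ -r "$config" ] && kconfig_enabled CONFIG_RANDOM_TRUST_CPU "$config"; then
    echo "CONFIG_RANDOM_TRUST_CPU is enabled"
else
    echo "CONFIG_RANDOM_TRUST_CPU is not set (or no config at $config)"
fi
```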

Networking

In this tracker issue, we decided to use NetworkManager. More recently, we also started using NetworkManager in the initramfs. Since then, things have been reworked so that afterburn can control initramfs networking on specific clouds.

Time synchronization

We use chrony, with some additional custom logic for specific clouds. See also DHCP propagation: coreos/fedora-coreos-config#412

Aleph version

rpm-ostree status will show admins the state of the ostree, but a few things live outside that and are not subject to in-place updates. Examples are the on-disk filesystem (xfs by default) and its specific layout, as well as the bootloader.

See this pull request which added /sysroot/.coreos-aleph-version.json that can be used to track the version of that data.
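As a rough sketch, the aleph record can be inspected next to the output of `rpm-ostree status` to compare the originally installed image with the currently deployed ostree. The helper name `show_aleph` is hypothetical. The file's fields are not described here, so the sketch simply prints the file as-is, and reading it may require root.

```shell
#!/bin/sh
# Print the aleph record if it is readable.
# $1: optional path; defaults to the location named above.
show_aleph() {
    aleph="${1:-/sysroot/.coreos-aleph-version.json}"
    if [ -r "$aleph" ]; then
        cat "$aleph"
    else
        echo "no readable aleph file at $aleph" >&2
        return 1
    fi
}

# Do not fail on systems that lack the file.
show_aleph || true
```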

ignition.platform.id

See https://docs.fedoraproject.org/en-US/fedora-coreos/platforms/

The design we have today is that each CoreOS system is the same OS content - the same OSTree commit, and beyond that the exact same bootloader version, etc.

The image formats differ per platform (VHD vs. qcow2 vs. raw, etc.). However, what's inside the disk image for each platform is almost the same.

A key difference between each image is the ignition.platform.id kernel argument. From the moment the system boots and the kernel loads the initramfs, our userspace code uses this to reliably know its target platform. As could be guessed from the name, https://github.com/coreos/ignition/ uses this, and it runs early on.

But there's other code which dynamically dispatches on the platform ID:

Notice in particular how the time synchronization code ends up reconfiguring chrony dynamically. For other operating systems which do "per cloud" disk images, it would have been more natural to just change /etc/chrony.conf per platform. But that would mean we have a different ostree commit checksum per platform, breaking our "image based" update model.
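The dispatch pattern described above can be sketched roughly as follows. This is not the actual FCOS code: the function names and the branch bodies are hypothetical placeholders, and only the `ignition.platform.id` kernel argument comes from the text.

```shell
#!/bin/sh
# Extract the value of ignition.platform.id from a kernel command line.
# $1: command line text (e.g. the contents of /proc/cmdline).
# The body runs in a subshell so that disabling globbing does not leak out.
platform_id_from_cmdline() (
    set -f
    for arg in $1; do
        case "$arg" in
            ignition.platform.id=*)
                printf '%s\n' "${arg#ignition.platform.id=}"
                return 0
                ;;
        esac
    done
    return 1
)

# Hypothetical dispatch: the branches are placeholders, not real FCOS logic.
configure_for_platform() {
    case "$1" in
        aws|azure|gcp) echo "apply cloud-specific settings for $1" ;;
        *)             echo "use defaults for $1" ;;
    esac
}

if [ -r /proc/cmdline ]; then
    id=$(platform_id_from_cmdline "$(cat /proc/cmdline)") || id="unknown"
    configure_for_platform "$id"
fi
```

Because the decision is made at runtime from the kernel argument, the OS content itself stays identical across platforms, which is the property the update model relies on.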

Multipath

Multipath differs from other storage configurations in one major respect: it is usually not configured by Ignition. If we mount an individual path for e.g. /sysroot, multipathd will not be able to take ownership afterwards. Furthermore, directly accessing individual paths before multipathd takes over is unsafe (e.g. it could be a non-optimized path). And since we need to mount /boot very early on, this naturally pushes multipath configuration into kernel arguments (and ideally soon, initramfs overlays).

What we ended up with is adding an rd.multipath=default kernel argument which triggers dracut to do "basic" automatic multipath setup in the stock initramfs: dracutdevs/dracut#780

By the nature of multipath, a tricky aspect is that e.g. the by-label/root symlink is valid both before and after multipathd takes ownership. In order to safely wait for the multipathed rootfs to show up, we have these udev rules which create, for example, by-label/dm-mpath-root:

https://github.com/coreos/fedora-coreos-config/blob/94e0daa567a658f023d48ac5929c72ed910792bd/overlay.d/05core/usr/lib/udev/rules.d/90-coreos-device-mapper.rules#L1

This is why we require the root=/dev/disk/by-label/dm-mpath-root kernel argument: it makes the mount generated by systemd-fstab-generator wait for the multipath version to show up instead of just mounting an individual path.
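As a quick sanity check, one could verify that a booted system carries both kernel arguments discussed in this section. The sketch below only inspects the command line and does not prove that multipathd actually owns the device. The helper names are hypothetical.

```shell
#!/bin/sh
# Return 0 if $2 appears as a whole argument in the command line $1.
has_karg() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Return 0 only if both multipath-related kargs are present.
multipath_kargs_present() {
    has_karg "$1" "rd.multipath=default" &&
        has_karg "$1" "root=/dev/disk/by-label/dm-mpath-root"
}

if [ -r /proc/cmdline ]; then
    cmdline=$(cat /proc/cmdline)
    if multipath_kargs_present "$cmdline"; then
        echo "multipath kargs present"
    else
        echo "multipath kargs missing"
    fi
fi
```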

Firstboot (day-1) support is usually done at coreos-installer time by doing:

coreos-installer install \
  --append-karg rd.multipath=default \
  --append-karg root=/dev/disk/by-label/dm-mpath-root \
  --append-karg rw \
  ...

The rw bit is necessary because systemd-fstab-generator will create a read-only mount by default (usually, rw is injected by rdcore rootmap for subsequent boots, but this does not happen if there is already a root karg).

That said, turning on multipath on a subsequent (day-2) boot is still supported if the multipath setup itself is compatible with this. This is done by appending the same kargs as above using e.g. rpm-ostree kargs. (Appending the kargs can also be done via ignition-kargs, though this still counts as "day-2" since on first boot we'd still access the boot partition directly.)
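For the day-2 case, a minimal sketch of the `rpm-ostree kargs` invocation might look like this. It only prints the command so it can be reviewed before running it as root. The helper name `build_kargs_cmd` is hypothetical, and the kargs are the same three used in the `coreos-installer` example.

```shell
#!/bin/sh
# Kargs taken from the coreos-installer example in this section.
MULTIPATH_KARGS="rd.multipath=default root=/dev/disk/by-label/dm-mpath-root rw"

# Print the rpm-ostree invocation instead of executing it.
build_kargs_cmd() {
    cmd="rpm-ostree kargs"
    for karg in $MULTIPATH_KARGS; do
        cmd="$cmd --append=$karg"
    done
    printf '%s\n' "$cmd"
}

build_kargs_cmd
```

The new kernel arguments only take effect after a reboot into the updated deployment.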

We don't yet document multipath for FCOS, but we do document this setup for OpenShift that has a kola test:

We also support multipath on an individual non-root partition. See the test above for how this works.

More links: