Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

35coreos-ignition: skip reboot if changed kargs match current boot #1409

Open
wants to merge 6 commits into
base: testing-devel
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -1,12 +1,65 @@
#!/bin/bash
set -euo pipefail


# There are a few cases we need to handle here. To illustrate this
# we'll use scenarios below where:
#
# - "booted": The kernel arguments from the currently booting system.
# - "ignition": The kernel arguments in the Ignition configuration.
# - "bls": The kernel arguments currently baked into the disk
# image BLS configs.
#
# The scenarios are:
#
# A.
# - Scenario:
# - booted: ""
# - ignition: "foobar"
# - bls: ""
# - Action: -> Update BLS configs, perform reboot
# B.
# - Scenario:
# - booted: "foobar"
# - ignition: "foobar"
# - bls: ""
# - Action: -> Update BLS configs, skip reboot
# C.
# - Scenario:
# - booted: ""
# - ignition: "foobar"
# - bls: "foobar"
# - Action: -> Skip update of BLS configs (they match already), perform reboot
#
# The logic here boils down to:
# if "ignition" != "booted"; then needreboot=1; fi
# if "ignition" != "bls"; then updatebls(); fi

# NOTE: we write info messages to kmsg here because stdout gets swallowed
# by Ignition if there is no failure. This forces the info into the
# journal, but sometimes the journal will miss these messages because
# of ratelimiting. We've decided to accept this limitation rather than
# add the systemd-cat or logger utlities to the initramfs.

# If the desired state isn't reflected by the current boot we'll need to reboot.
/usr/bin/rdcore kargs --current --create-if-changed /run/coreos-kargs-reboot "$@"
if [ -e /run/coreos-kargs-reboot ]; then
msg="Desired kernel arguments don't match current boot. Requesting reboot."
echo "$msg" > /dev/kmsg
Comment on lines +47 to +48
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: feels like we can collapse these here and elsewhere now that we're only outputting to /dev/kmsg?

fi

if is-live-image; then
/usr/bin/rdcore kargs --current --create-if-changed /run/coreos-kargs-changed "$@"
if [ -e /run/coreos-kargs-changed ]; then
# If we're in a live system and the kargs don't match then we must error.
Copy link
Member

@jlebon jlebon Mar 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: tab

if [ -e /run/coreos-kargs-reboot ]; then
# Since we exit with error here the stderr will get shown by Ignition
echo "Need to modify kernel arguments, but cannot affect live system." >&2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this go to /dev/kmsg for consistency?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. In the case there is an error Ignition will bubble that up the user so we need to write to stderr. We only need to write to kmsg when the messages are informational (i.e. Ignition swallows any I/O when there isn't an error).

exit 1
fi
else
/usr/bin/rdcore kargs --boot-device /dev/disk/by-label/boot --create-if-changed /run/coreos-kargs-reboot "$@"
# Update the BLS configs if they need to be updated.
/usr/bin/rdcore kargs --boot-device /dev/disk/by-label/boot --create-if-changed /run/coreos-kargs-changed "$@"
if [ -e /run/coreos-kargs-changed ]; then
msg="Kernel arguments in BLS config were updated."
echo "$msg" > /dev/kmsg
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, there is also the case where the kargs in the BLS config already match the Ignition config, but neither match the current boot. Before and after this patch, we don't reboot, but we likely should now that we care about firstboot kargs in the diskful path.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added another commit for this case. Untested right now. If the strategy looks good I'll try to add an ext test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not 100% sure how to add an ext test for this. Since we don't have any of the logs from the first boot it's hard to get that information. Any ideas?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we put a karg in appendFirstbootKernelArgs and in the Ignition config's shouldNotExist section and then verify that the karg isn't present in /proc/cmdline in the test? We don't have logs from first boot, but we can deduce that it worked since it was present on first boot.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I think the problem is that the implementation for appendFirstbootKernelArgs (the file that exists in /boot) doesn't get cleaned up until Ignition fully runs. So any karg we put there will be present.

We could add another reboot and test then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Offhand, I think one way to fix this is:

  1. store the fetched Ignition config in /boot
  2. store any NM keyfiles generated from kargs in /boot
  3. empty the stamp file but leave it there; it's still only deleted by firstboot complete

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like a lot of effort and maybe not worth the risk.

We could prevent the bootloop by telling ourselves that we ran once before either through a stamp file or by adding another entry to the BLS config (coreos.karg.ran), but that wouldn't prevent the karg booted versus expected mismatch. It would only prevent the bootloop, though then we could error if we wanted to.

Interested to see what @bgilbert thinks here. We could just drop 35coreos-ignition: handle the case where booted kargs don't match desired too if it's not worth it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I think caching the Ignition config in /boot is something we should probably do anyway. It's more efficient of course, but also because then config servers like the MCS can enforce a "one fetch only" rule as a way to protect secrets.

From there, adding the NM keyfiles and emptying the stamp file doesn't seem like much work. The underlying mechanism for getting NM keyfiles from /boot is already in place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing a use case for adding a karg to firstboot-kargs and also Ignition shouldNotExist: that would imply the user wants to remove a default karg except on the first boot, and I don't think we have any default kargs where that's obviously reasonable. And in general, I don't think we need to be especially robust to invalid user configuration. If we deterministically boot-loop in this case, the user will notice and can fix their config.

If we think caching the config is worthwhile, I'm okay with pursuing it, but I don't think it needs to be a blocker here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An easy way this can happen I think is users relying on the auto-forwarding kargs at install time, and then wanting a different configuration at first boot time (e.g. different location, or different NIC used for install vs run). But agreed we don't need to block on this.

fi
7 changes: 7 additions & 0 deletions tests/kola/ignition/kargs/basic/config.bu
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
variant: fcos
version: 1.4.0
kernel_arguments:
should_exist:
- foobar
should_not_exist:
- mitigations=auto,nosmt
1 change: 1 addition & 0 deletions tests/kola/ignition/kargs/basic/data/commonlib.sh
30 changes: 30 additions & 0 deletions tests/kola/ignition/kargs/basic/test.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
#!/bin/bash
# TODO: Doc

set -xeuo pipefail
# This test runs on all platforms and verifies Ignition kernel argument setting.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message of the commit introducing this hunk says "- Use ok/fatal from commonlib.sh" but that seems to already be the case now.


. $KOLA_EXT_DATA/commonlib.sh

kargchecks() {
if ! grep foobar /proc/cmdline; then
fatal "missing foobar in kernel cmdline"
fi
if grep mitigations /proc/cmdline; then
fatal "found mitigations in kernel cmdline"
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this similarly also check the BLS?

}

case "${AUTOPKGTEST_REBOOT_MARK:-}" in
"")
kargchecks
# Now reboot the machine and verify the kernel argument persists
/tmp/autopkgtest-reboot nextboot
;;
nextboot)
kargchecks
;;
*) fatal "unexpected mark: ${AUTOPKGTEST_REBOOT_MARK}";;
esac

ok "Ignition kargs"
9 changes: 0 additions & 9 deletions tests/kola/ignition/kargs/config.ign

This file was deleted.

5 changes: 5 additions & 0 deletions tests/kola/ignition/kargs/skipreboot/config.bu
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
variant: fcos
version: 1.4.0
kernel_arguments:
should_exist:
- foobar
1 change: 1 addition & 0 deletions tests/kola/ignition/kargs/skipreboot/data/commonlib.sh
43 changes: 43 additions & 0 deletions tests/kola/ignition/kargs/skipreboot/test.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
#!/bin/bash
set -xeuo pipefail
# kola: { "platforms": "qemu", "appendFirstbootKernelArgs": "foobar" }
# This test verifies that if a kernel argument that is set as "should_exist"
# in the Ignition config already exists on the kernel command line of the machine
# then we can skip the reboot when applying kernel arguments but we must still
# update the BLS configs to make it permanent. This is Scenario B from
# the documentation in 35coreos-ignition/coreos-kargs.sh.
#
# - platforms: qemu
# - appendFirstbootKernelArgs is only supported on qemu.
# - appendFirstbootKernelArgs: foobar
# - The kernel argument to apply transiently only on the first boot.

. $KOLA_EXT_DATA/commonlib.sh

kargchecks() {
if ! grep foobar /proc/cmdline; then
fatal "missing expected kernel arg in kernel cmdline"
fi
if ! grep foobar /boot/loader/entries/*.conf; then
fatal "missing expected kernel arg in BLS config"
fi
}

case "${AUTOPKGTEST_REBOOT_MARK:-}" in
"")
kargchecks
# If this file exists then reboot was skipped. See
# 35coreos-ignition/coreos-kargs.sh
if [ ! -e /run/coreos-kargs-changed ]; then
fatal "missing file that should exist if no reboot happened"
fi
# Now reboot the machine and verify the kernel argument persists
/tmp/autopkgtest-reboot nextboot
;;
nextboot)
kargchecks
;;
*) fatal "unexpected mark: ${AUTOPKGTEST_REBOOT_MARK}";;
esac

ok "Ignition kargs skip reboot"
14 changes: 0 additions & 14 deletions tests/kola/ignition/kargs/test.sh

This file was deleted.