How to enable kdump feature on Fedora CoreOS #622

Closed
k-keiichi-rh opened this issue Sep 14, 2020 · 38 comments
Labels
jira for syncing to jira

Comments

@k-keiichi-rh

The kdump feature, which collects kernel crash dumps, is useful for finding the root cause of a system failure.
Fedora already has a document describing how to enable it:
https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes

So we should be able to enable kdump on Fedora CoreOS as well, with minor adjustments, using the following steps:

  1. Install the kexec-tools package and its related packages using "rpm-ostree install".
    => kexec-tools, dracut-squash, snappy, ethtool and squashfs-tools are required.
  2. Update the kernel parameters to include crashkernel using "rpm-ostree kargs --append='crashkernel=256m'".
  3. Configure /etc/kdump.conf.
    => Change the default path from "path /var/crash" to "path /sysroot/ostree/deploy/rhcos/var/crash".

After completing the above steps, we should be able to collect kernel crash dumps on Fedora CoreOS.
If there are no issues, it would be helpful to add kdump instructions to the Troubleshooting section of the Fedora CoreOS docs.
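
A minimal sketch of the three steps as a script, split into a part that runs before a reboot and a part that runs after it. The split, the reboot, and the final "systemctl enable" are assumptions that are not spelled out in the steps above; the function names are hypothetical. Nothing is executed unless DRY_RUN=0 is set, so the sketch only prints what it would do.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them (needs root on a Fedora CoreOS host).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Phase 1 (before reboot): steps 1 and 2 from the list above.
kdump_phase1() (
    crashkernel="${1:-256m}"
    run rpm-ostree install kexec-tools
    run rpm-ostree kargs "--append=crashkernel=${crashkernel}"
    # Assumption: a reboot is needed so that the layered package and the
    # new kernel argument take effect.
    run systemctl reboot
)

# Phase 2 (after reboot): step 3, then turn the service on.
kdump_phase2() (
    dump_path="${1:-/var/crash}"
    # Rewrite the "path" directive in /etc/kdump.conf.
    run sed -i "s|^path .*|path ${dump_path}|" /etc/kdump.conf
    # Assumption: the service is not enabled by default.
    run systemctl enable --now kdump.service
)
```

For example, "kdump_phase1 256m" prints the install, kargs and reboot commands, and "kdump_phase2 /sysroot/ostree/deploy/rhcos/var/crash" prints the sed and systemctl commands for the dump path mentioned in step 3.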

However, I think there are several points to discuss, so I would like to go through them before proposing the doc.
I would really appreciate it if anyone could point out issues other than the following.

1. Interactions with rpm-ostree around kdump initramfs

The kdump service in kexec-tools loads a kernel image and the corresponding initramfs
into the reserved memory space. However, configuring kdump happens outside of rpm-ostree,
and there is a possibility that the initramfs won't be updated when the kernel is upgraded.
If the initramfs isn't updated, a mismatch between kernel and initramfs might occur.

As far as I can see, there were no problems when upgrading to a different stream, such as
stable=>testing and testing=>next, in Fedora CoreOS.
This is because a new /boot/ostree/fedora-coreos-XXXX is generated during the upgrade, and an initramfs that matches
the upgraded kernel is also generated during boot.
So it looks like the mismatch won't occur as long as the boot directory is recreated on every upgrade.

But I heard that some issues with this kind of interaction have been reported for OpenShift and Atomic Host.
Please correct me if my understanding is wrong or if I have overlooked something.

$ ls /boot/ostree/
fedora-coreos-08a921dd4b162a14798a5f8892dd0a41b99dbfafdabfcc40a9f95cd9fe9de506
$ sudo rpm-ostree rebase "fedora/x86_64/coreos/testing"
$ rpm-ostree status
State: idle
Deployments:
● ostree://fedora:fedora/x86_64/coreos/testing
                   Version: 32.20200824.2.0 (2020-08-24T20:43:54Z)
                BaseCommit: 9a816729728ff71c744670b36bdc4900a972157e6a58a66dfaca84746bb3f07c
              GPGSignature: Valid signature by 97A1AE57C3A2372CCA3A4ABA6C13026D12C944D0
             LocalPackages: kexec-tools-2.0.20-17.fc32.x86_64 'ethtool-2:5.8-1.fc32.x86_64' squashfs-tools-4.3-25.fc32.x86_64 snappy-1.1.8-2.fc32.x86_64
                            dracut-squash-050-61.git20200529.fc32.x86_64

  ostree://fedora:fedora/x86_64/coreos/stable
                   Version: 32.20200809.3.0 (2020-08-24T16:27:55Z)
                BaseCommit: 902c88022e834fb7998140734fae01b75f78b80f63364e9c5309d62d9a260b7c
              GPGSignature: Valid signature by 97A1AE57C3A2372CCA3A4ABA6C13026D12C944D0
             LocalPackages: kexec-tools-2.0.20-17.fc32.x86_64 'ethtool-2:5.8-1.fc32.x86_64' squashfs-tools-4.3-25.fc32.x86_64 snappy-1.1.8-2.fc32.x86_64
                            dracut-squash-050-61.git20200529.fc32.x86_64
[root@localhost yum.repos.d]# ls /boot/ostree/
fedora-coreos-08a921dd4b162a14798a5f8892dd0a41b99dbfafdabfcc40a9f95cd9fe9de506
fedora-coreos-b35a91a10a35bad06bb06eb0d42120d996f798fc48f004aa40447ec3a9d604ef
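
A small sketch for checking that every kernel under /boot/ostree has a kdump initramfs next to it. It assumes the kdump image is named initramfs-<version>kdump.img and lives in the same directory as vmlinuz-<version>, which is what kexec-tools produced on the system discussed in this issue; the function names are hypothetical. It only detects a missing image, not a stale one.

```shell
#!/bin/sh
# Map a kernel image path to the kdump initramfs expected next to it.
kdump_initrd_for() (
    kernel="$1"
    dir="$(dirname "$kernel")"
    version="$(basename "$kernel")"
    version="${version#vmlinuz-}"
    echo "${dir}/initramfs-${version}kdump.img"
)

# Report, for each kernel below a boot directory, whether the matching
# kdump initramfs exists. Returns non-zero if any is missing.
check_kdump_initrds() (
    boot_dir="${1:-/boot/ostree}"
    status=0
    for kernel in "$boot_dir"/*/vmlinuz-*; do
        [ -e "$kernel" ] || continue    # the glob matched nothing
        initrd="$(kdump_initrd_for "$kernel")"
        if [ -e "$initrd" ]; then
            echo "ok: $initrd"
        else
            echo "missing: $initrd"
            status=1
        fi
    done
    exit "$status"
)

check_kdump_initrds /boot/ostree || echo "at least one kernel has no kdump initramfs"
```

Running this after an upgrade and after the kdump service has started would show whether an image was generated for the new deployment.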

2. Upgrades might fail when the current streams move from Fedora 32 to Fedora 33

All of the current streams (stable, testing and next) are based on Fedora 32.
But when these streams move to Fedora 33 or later, the older kexec-tools may try to
replace the dracut package with an older one. In this case, the version mismatch might cause the
upgrade to fail.

One solution is to remove the kexec-tools package before upgrading.
However, that seems less than ideal.

I need to test whether the above problem occurs. I will do that.

@bgilbert
Contributor

Thanks for raising this. The crash initramfs is the same as the normal one, correct? Where do the kernel arguments for the crash kernel come from? If they're directly taken from the boot arguments to the running kernel, we may need to add some argument filtering, or some ConditionKernelCommandLines in initramfs units. We don't want the crash initramfs trying to rerun Ignition if we crash on the first boot.

@k-keiichi-rh
Author

Yes, it's the same as the normal one. The kernel arguments for the crash kernel basically come from /proc/cmdline, so they are the same as the running kernel's.

[ /usr/bin/kdumpctl in kexec-tools ]
load_kdump()
{
        KEXEC_ARGS=$(prepare_kexec_args "${KEXEC_ARGS}")
        KDUMP_COMMANDLINE=$(prepare_cmdline "${KDUMP_COMMANDLINE}" "${KDUMP_COMMANDLINE_REMOVE}" "${KDUMP_COMMANDLINE_APPEND}")
        => prepare_cmdline generates the kernel arguments for the crash kernel from /proc/cmdline in /usr/lib/kdump/kdump-lib.sh
        => Some arguments are removed based on KDUMP_COMMANDLINE_REMOVE in /etc/sysconfig/kdump.
              Some arguments are added based on KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump. 
        ...
        $KEXEC $KEXEC_ARGS $standard_kexec_args \
                --command-line="$KDUMP_COMMANDLINE" \
                --initrd=$TARGET_INITRD $kdump_kernel

If they're directly taken from the boot arguments to the running kernel, we may need to add some argument filtering, or some ConditionKernelCommandLines in initramfs units. We don't want the crash initramfs trying to rerun Ignition if we crash on the first boot.

Please let me confirm, just in case. There are some Ignition arguments that are added only on the first CoreOS boot. If we crash on the first boot, the crash kernel (2nd kernel) unintentionally reruns Ignition, because the 2nd kernel boots with those first-boot-only arguments. This is the situation we want to avoid.

If so, I think kexec-tools will be installed on the 2nd boot or later. In that case, it looks like Ignition won't be rerun unintentionally. Are there any other situations that trigger Ignition?

FYI, the following output is a sample from my environment.

o Fedora CoreOS boot parameters (1st kernel)
BOOT_IMAGE=(hd0,gpt1)/ostree/fedora-coreos-0d802306d9c022a7369a9deefb4a35e9720dda48cbceec9bb8907cfcc3d9b7d8/vmlinuz-5.7.16-200.fc32.x86_64 mitigations=auto,nosmt systemd.unified_cgroup_hierarchy=0 console=tty0 console=ttyS0,115200n8 ignition.platform.id=qemu ostree=/ostree/boot.0/fedora-coreos/0d802306d9c022a7369a9deefb4a35e9720dda48cbceec9bb8907cfcc3d9b7d8/0 crashkernel=256m
o Crash kernel boot parameters (2nd kernel)
BOOT_IMAGE=(hd0,gpt1)/ostree/fedora-coreos-0d802306d9c022a7369a9deefb4a35e9720dda48cbceec9bb8907cfcc3d9b7d8/vmlinuz-5.7.16-200.fc32.x86_64 mitigations=auto,nosmt systemd.unified_cgroup_hierarchy=0 console=tty0 console=ttyS0,115200n8 ignition.platform.id=qemu ostree=/ostree/boot.0/fedora-coreos/0d802306d9c022a7369a9deefb4a35e9720dda48cbceec9bb8907cfcc3d9b7d8/0 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 acpi_no_memhotplug transparent_hugepage=never nokaslr hest_disable novmcoredd disable_cpu_apicid=0 acpi_rsdp=0xf5a30 elfcorehdr=1883508K
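
To illustrate the kind of argument filtering being discussed, here is a minimal sketch of removing named arguments from a kernel command line. It is not the kexec-tools implementation; filter_cmdline is a hypothetical helper, and removing ignition.firstboot for the crash kernel is an untested assumption about what the filtering would need to do.

```shell
#!/bin/sh
# Remove every argument whose name appears in $2 from the command line in $1.
# A name matches both a bare flag ("name") and an assignment ("name=value").
filter_cmdline() (
    set -f                      # no glob expansion while word-splitting
    cmdline="$1"
    remove="$2"
    out=""
    for arg in $cmdline; do
        name="${arg%%=*}"
        keep=1
        for r in $remove; do
            if [ "$name" = "$r" ]; then
                keep=0
                break
            fi
        done
        if [ "$keep" = 1 ]; then
            out="${out:+$out }$arg"
        fi
    done
    echo "$out"
)

filter_cmdline \
    "console=ttyS0,115200n8 ignition.platform.id=qemu ignition.firstboot crashkernel=256m" \
    "ignition.firstboot crashkernel"
# prints: console=ttyS0,115200n8 ignition.platform.id=qemu
```

In kexec-tools itself, the list of names to drop is taken from KDUMP_COMMANDLINE_REMOVE in /etc/sysconfig/kdump, so an equivalent effect might be achievable by adding the Ignition first-boot argument there.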

@bgilbert
Contributor

Some arguments are removed based on KDUMP_COMMANDLINE_REMOVE in /etc/sysconfig/kdump.
Some arguments are added based on KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump.

That's useful.

There are some Ignition arguments that are added only on the first CoreOS boot. If we crash on the first boot, the crash kernel (2nd kernel) unintentionally reruns Ignition, because the 2nd kernel boots with those first-boot-only arguments. This is the situation we want to avoid.

Correct.

If so, I think kexec-tools will be installed on the 2nd boot or later.

Presumably we'd want to ship kexec-tools as part of Fedora CoreOS. We generally discourage package overlays (rpm-ostree install) though #401 may change that.

Are there any other situations that trigger Ignition?

None that are supported.

@k-keiichi-rh
Author

Presumably we'd want to ship kexec-tools as part of Fedora CoreOS. We generally discourage package overlays (rpm-ostree install) though #401 may change that.

Thanks for your explanation and the reference to #401. I understand your concern. There are two ways to install kexec-tools: the "side yum repo" approach and the "part of Fedora CoreOS" approach. In either case, we may need some argument filtering to avoid the problem you mentioned.

@cgwalters
Member

Presumably we'd want to ship kexec-tools as part of Fedora CoreOS.

Yeah, I'm leaning that way myself. The tools aren't large, there's basically never going to be users who want different versions of the tools, there's not much that would actually run inside a container image, etc.

@cgwalters
Member

This is because a new /boot/ostree/fedora-coreos-XXXX is generated during the upgrade, and an initramfs that matches
the upgraded kernel is also generated during boot.

Right, I really don't like regenerating the initramfs on boot. There's really a lot going on in kdump and how it interacts with ostree/rpm-ostree is quite potentially tricky. The fact that it has code running both on the host as a systemd service and in the initramfs and how those interact...

One high level point: we can't enable kdump by default, because not every user will be willing to pay the cost in RAM for file dumping, network dump location needs to be configurable...basically it needs to be off by default because it really requires configuration to be useful. I think then it's interesting to look at what the UX should feel like for enabling it. Will circle back to this.

@cgwalters
Member

@k-keiichi-rh can you test out patching the manifest to add kexec-tools by default and how that changes things?

In particular what I want to better understand is: if we take that path, do we need a separate kdump initramfs at all? Today rpm-ostree ships a generic initramfs - there's no configuration embedded. If we can configure the kdump initramfs solely via the kernel cmdline, that gets us out of a whole lot of problems.

If actually kdump needs more complex configuration (and I think it does for /etc/sysconfig/kdump) then this falls into a whole case we've been debating for a while about configuring the initramfs. The latest on that is in ostreedev/ostree#2155

@cgwalters
Member

@k-keiichi-rh btw if it helps feel free to schedule a realtime meeting on this!

@lucab
Contributor

lucab commented Sep 16, 2020

It looks like kexec-tools ships two monolithic vendor-config files under /etc (/etc/kdump.conf and /etc/sysconfig/kdump).
While other bigger things are getting sorted, @k-keiichi-rh can you please look into whether those could 1) be moved to /usr and 2) support some kind of user-overriding via fragments in /etc?
(See https://www.freedesktop.org/software/systemd/man/systemd.unit.html#Description for an example of this in action)

@k-keiichi-rh
Author

@k-keiichi-rh can you test out patching the manifest to add kexec-tools by default and how that changes things?

I will do that and share the result.

In particular what I want to better understand is: if we take that path, do we need a separate kdump initramfs at all? Today rpm-ostree ships a generic initramfs - there's no configuration embedded. If we can configure the kdump initramfs solely via the kernel cmdline, that gets us out of a whole lot of problems.
If actually kdump needs more complex configuration (and I think it does for /etc/sysconfig/kdump) then this falls into a whole case we've been debating for a while about configuring the initramfs. The latest on that is in ostreedev/ostree#2155

I understand that the most important thing is to avoid regenerating the initramfs for kdump, to keep things simple.
The following cases need discussion:

  1. Shipping a generic initramfs file for kdump in the Fedora CoreOS image.
    In this case, we need to discuss how to change the kdump configuration. I think there are the following options:
    a) Change the configuration using kernel parameters
    b) Change the configuration using a user-override approach like systemd unit files
    c) Have users modify the configuration manually
    => This option is not acceptable because it would require regenerating the initramfs.
  2. Shipping several initramfs files for kdump, depending on the situation, such as network dumping and file dumping.
    => lib/deploy: Add support for overlay initrds  ostreedev/ostree#2155 will be necessary and should be extended to cover the initramfs for kdump as well as the initramfs for the 1st kernel.

I believe the test I am going to run will help us move the discussion forward.

@k-keiichi-rh
Author

It looks like kexec-tools ships two monolithic vendor-config files under /etc (/etc/kdump.conf and /etc/sysconfig/kdump).

Yes, /etc/kdump.conf and /etc/sysconfig/kdump are required to enable kdump.

can you please look into whether those could

  1. be moved to /usr and

kdumpctl is triggered by kdump.service via systemd.
Both file paths are hardcoded in the kdumpctl script.
We would need to change kexec-tools to achieve that, and we won't be able to move these files to /usr without a good reason.

[ /usr/bin/kdumpctl in kexec-tools ]
KDUMP_CONFIG_FILE="/etc/kdump.conf"
MKDUMPRD="/sbin/mkdumprd -f"
DRACUT_MODULES_FILE="/usr/lib/dracut/modules.txt"
SAVE_PATH=/var/crash
SSH_KEY_LOCATION="/root/.ssh/kdump_id_rsa"
INITRD_CHECKSUM_LOCATION="/boot/.fadump_initrd_checksum"
DUMP_TARGET=""
DEFAULT_INITRD=""
DEFAULT_INITRD_BAK=""
TARGET_INITRD=""
FADUMP_REGISTER_SYS_NODE="/sys/kernel/fadump_registered"
#kdump shall be the default dump mode
DEFAULT_DUMP_MODE="kdump"
image_time=0

[[ $dracutbasedir ]] || dracutbasedir=/usr/lib/dracut
. $dracutbasedir/dracut-functions.sh
. /lib/kdump/kdump-lib.sh

standard_kexec_args="-p"

# Some default values in case /etc/sysconfig/kdump doesn't include
KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug"

if [ -f /etc/sysconfig/kdump ]; then
        . /etc/sysconfig/kdump
fi
  2. support some kind of user-overriding via fragments in /etc?

There is no support for user overrides.
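
For reference, a sketch of what such an override scheme could look like if kdumpctl were changed to support it: vendor defaults are read first, then *.conf fragments in lexical order, with later files winning. This is purely hypothetical; kexec-tools does not do this today, and the function name and any paths passed to it are made up for illustration.

```shell
#!/bin/sh
# Print the effective value of a variable after sourcing a vendor default
# file and then every *.conf drop-in from a directory, in lexical order.
load_layered_config() (
    vendor_file="$1"
    dropin_dir="$2"
    var_name="$3"
    [ -f "$vendor_file" ] && . "$vendor_file"
    for f in "$dropin_dir"/*.conf; do
        [ -f "$f" ] && . "$f"
    done
    eval "printf '%s\n' \"\${$var_name:-}\""
)

# Demonstration in a scratch directory (stand-ins for /usr and /etc).
demo="$(mktemp -d)"
mkdir -p "$demo/etc.d"
echo 'KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug"' > "$demo/vendor"
echo 'KDUMP_COMMANDLINE_REMOVE="$KDUMP_COMMANDLINE_REMOVE ignition.firstboot"' > "$demo/etc.d/50-ignition.conf"
load_layered_config "$demo/vendor" "$demo/etc.d" KDUMP_COMMANDLINE_REMOVE
# prints: hugepages hugepagesz slub_debug ignition.firstboot
rm -rf "$demo"
```

Because the fragments are plain shell, the same sourcing approach that kdumpctl already uses for /etc/sysconfig/kdump would carry over; /etc/kdump.conf has its own format and would need separate handling.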

@k-keiichi-rh
Author

@k-keiichi-rh btw if it helps feel free to schedule a realtime meeting on this!

Thank you for your help. I am a newcomer to the CoreOS area, so it would be really helpful for me.
In any case, I will sort out the items to discuss.
I was also able to test patching the manifest to add kexec-tools by default. I will summarize the result soon.

@k-keiichi-rh
Author

@k-keiichi-rh can you test out patching the manifest to add kexec-tools by default and how that changes things?

When kexec-tools is added to the OS image by default, the kdump service doesn't start automatically in Fedora CoreOS.
That depends on the systemd presets in /usr/lib/systemd/system-preset/*:
Fedora and Fedora CoreOS don't start it by default.
RHEL and RHCOS start it by default.

The following is my fedora-coreos-config change to add kexec-tools:

[root@localhost config]# pwd
/root/fcos/src/config
[root@localhost config]# git branch
* testing-devel
[root@localhost config]# git diff
diff --git a/manifests/fedora-coreos.yaml b/manifests/fedora-coreos.yaml
index c64e2dd..77cde7b 100644
--- a/manifests/fedora-coreos.yaml
+++ b/manifests/fedora-coreos.yaml
@@ -16,6 +16,7 @@ packages:
   - fedora-coreos-pinger
   # Updates
   - zincati
+  - kexec-tools

 etc-group-members:
   # Add the docker group to /etc/group

The following shows the result of adding kexec-tools by default:

[root@localhost ~]# rpm-ostree status
State: idle
Deployments:
● ostree://fedora:fedora/x86_64/coreos/testing-devel
                   Version: 32.20200916.dev.0 (2020-09-16T19:13:48Z)
                    Commit: 06cae29ae914fa7b9ed916b690a048f070d88600b89a4a8b2278d1e817be59b4
              GPGSignature: (unsigned)
[root@localhost ~]# rpm -qa | grep kexec-tools
kexec-tools-2.0.20-17.fc32.x86_64
[root@localhost ~]# systemctl status kdump
● kdump.service - Crash recovery kernel arming
     Loaded: loaded (/usr/lib/systemd/system/kdump.service; disabled; vendor preset: disabled)
     Active: inactive (dead)

[root@localhost ~]# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt1)/ostree/fedora-coreos-f7cc3302ff439b7b7f687ad9df4f6f4f9970063b9470016e1743ba1ffe93667d/vmlinuz-5.8.7-200.fc32.x86_64 mitigations=auto,nosmt systemd.unified_cgroup_hierarchy=0 console=tty0 console=ttyS0,115200n8 ignition.platform.id=qemu ignition.firstboot ostree=/ostree/boot.1/fedora-coreos/f7cc3302ff439b7b7f687ad9df4f6f4f9970063b9470016e1743ba1ffe93667d/0
[root@localhost ~]# ls -l /boot/ostree/fedora-coreos-f7cc3302ff439b7b7f687ad9df4f6f4f9970063b9470016e1743ba1ffe93667d/
total 83132
-rw-r--r--. 1 root root 73469126 Sep 16 19:16 initramfs-5.8.7-200.fc32.x86_64.img
-rwxr-xr-x. 1 root root 11653968 Sep 16 19:16 vmlinuz-5.8.7-200.fc32.x86_64
=> As expected, the initramfs for kdump is not generated.

As my next step, I will summarize what we can do at the moment:
Can we start kdump service automatically?
Which kdump configuration options can we change?

@cgwalters
Member

Can we start kdump service automatically?

Yes, to do that just add it to e.g. 40-coreos.preset (or really, enable it by default in the base Fedora presets).

But I think that's going to reveal a larger problem in that in this model our initramfs came pre-built with kdump code inside it - we just don't have configuration for it.

I am pretty sure however we do this it's going to involve patching kdump to better integrate with rpm-ostree. See e.g. https://github.com/coreos/coreos-assembler/blob/master/README-devel.md#using-overrides for how to test out modifications to the kdump source code.

I believe we also need to land coreos/rpm-ostree#2170 so that an administrator can configure kdump.

@cgwalters
Member

Taking a step back, let's walk through what the "administrator experience" for kdump on a CoreOS system would be. I'm writing this out in terms of shell commands to run, but I think what we really want is to make this ergonomic to enable via Ignition (fcct) - e.g. fcct could have high level sugar that would generate a systemd unit to run these commands.

$ vi /etc/sysconfig/kdump
...
$ rpm-ostree kargs --append=crashkernel=128M
$ rpm-ostree initramfs-etc --sync

The fcct sugar could look like:

kdump:
  crashkernel: 128M
  config: |
    net penguin.example.com:/export/cores

(BTW a higher level thing in this - for OpenShift 4 IMO we should make it completely trivial to have a cluster service collect kdump crashes - or even just have a cluster service detect a kernel crash, this is related to coreos/ignition#585 and also https://github.com/kubernetes/node-problem-detector - maybe even ship an operator via OLM for this. But anyways let's get the base mechanics working first)

@k-keiichi-rh
Author

k-keiichi-rh commented Sep 17, 2020

Taking a step back, let's walk through what the "administrator experience" for kdump on a CoreOS system would be. I'm writing this out in terms of shell commands to run, but I think what we really want is to make this ergonomic to enable via Ignition (fcct) - e.g. fcct could have high level sugar that would generate a systemd unit to run these commands.

I agree with your suggestion. I would like to discuss the shell commands part.

$ rpm-ostree initramfs-etc --sync

It looks like "rpm-ostree initramfs-etc" wouldn't help to configure kdump.
We may be able to share a general initramfs for 1st kernel with some changes using "rpm-ostree initramfs-etc".
But we can't share a general initramfs for kdump because it is created with machine-specific parameters by dracut.

The following output shows the dracut arguments on my Fedora CoreOS system:

# lsinitrd /boot/ostree/fedora-coreos-0d802306d9c022a7369a9deefb4a35e9720dda48cbceec9bb8907cfcc3d9b7d8/initramfs-5.7.16-200.fc32.x86_64kdump.img 
...
Arguments: --quiet --hostonly --hostonly-cmdline --hostonly-i18n --hostonly-mode 'strict' -o 'plymouth dash resume ifcfg earlykdump' --mount '/dev/disk/by-uuid/910678ff-f77e-4a7d-8d53-86f2ac47a823 /kdumproot/sysroot xfs rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota,nofail,x-systemd.before=initrd-fs.target' --no-hostonly-default-device -f
=> The dump target (--mount) is passed when executing dracut.
...
drwxr-xr-x root/root                35 2020-09-15 19:09 squashfs-root/etc/systemd/system/dev-disk-by\x2duuid-910678ff\x2df77e\x2d4a7d\x2d8d53\x2d86f2ac47a823.device.d
-rw-r--r-- root/root                46 2020-09-15 19:09 squashfs-root/etc/systemd/system/dev-disk-by\x2duuid-910678ff\x2df77e\x2d4a7d\x2d8d53\x2d86f2ac47a823.device.d/timeout.conf
lrwxrwxrwx root/root                78 2020-09-15 19:09 squashfs-root/etc/systemd/system/initrd.target.wants/dev-disk-by\x2duuid-910678ff\x2df77e\x2d4a7d\x2d8d53\x2d86f2ac47a823.device -> ../dev-disk-by\x2duuid-910678ff\x2df77e\x2d4a7d\x2d8d53\x2d86f2ac47a823.device
-rw-r--r-- root/root               146 2020-09-15 19:09 squashfs-root/usr/lib/dracut/hooks/emergency/80-\x2fdev\x2fdisk\x2fby-uuid\x2f910678ff-f77e-4a7d-8d53-86f2ac47a823.sh
-rw-r--r-- root/root                64 2020-09-15 19:09 squashfs-root/usr/lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2fdisk\x2fby-uuid\x2f910678ff-f77e-4a7d-8d53-86f2ac47a823.sh
=> The systemd device unit and dracut scripts for the dump device are also generated automatically, depending on the environment.

So it's difficult to avoid regenerating the initramfs for kdump on boot, and I don't have a solution to this problem yet.
I need to understand the interactions between rpm-ostree and the kdump initramfs more deeply.
Would it be enough to control when the kdump initramfs is generated, by implementing "rpm-ostree kdump ..." as a subcommand of rpm-ostree?

@travier travier added the jira for syncing to jira label Oct 5, 2020
@cgwalters
Member

It looks like "rpm-ostree initramfs-etc" wouldn't help to configure kdump.
We may be able to share a general initramfs for 1st kernel with some changes using "rpm-ostree initramfs-etc".
But we can't share a general initramfs for kdump because it is created with machine-specific parameters by dracut.

The idea behind the work in progress initramfs-etc command is that it allows generating a secondary initramfs from local configuration, distinct from the "golden" initramfs created by the CoreOS build process. The bootloaders we care about should support multiple initrds and concatenate them dynamically.

@kelvinfan001
Member

kelvinfan001 commented Oct 8, 2020

I tried to summarize what we know so far (mostly based off of the work that @k-keiichi-rh has done) and put together a WIP doc. Please correct me if I've misunderstood anything!

@kelvinfan001
Member

@cgwalters I'm also not completely sure how initramfs-etc will help with kdump. Currently, the kdump initrd is generated with the help of a mkdumprd script that essentially takes the user configuration for the initrd from kdump.conf, and generates a completely new initrd solely for kdump. Do you mean we can let rpm-ostree take over the role of mkdumprd?

@cgwalters
Member

Currently, the kdump initrd is generated with the help of a mkdumprd script that essentially takes the user configuration for the initrd from kdump.conf, and generates a completely new initrd solely for kdump.

I hadn't realized that. Well it certainly simplifies things then; you're right that ostree/rpm-ostree probably don't need to be involved.

@cgwalters
Member

That fact changes my "lean towards adding by default" stance. Since this needs manual configuration and a reboot anyways, it seems fine to leave as an extension.

@kelvinfan001
Member

kelvinfan001 commented Oct 13, 2020

@cgwalters If we follow the steps outlined in @k-keiichi-rh's first comment, kdump seems to work fine. So should our next step be making enabling kdump more ergonomic by adding some fcct sugar, so that all configuration can be done through Ignition configs?

@cgwalters
Member

Yeah, the deliverable here may just be changes to the FCOS docs describing the basics, and fcct sugar.

@cgwalters
Member

Though we should also probably write at least a basic test for this too.

@jlebon
Member

jlebon commented Oct 14, 2020

Since this needs manual configuration and a reboot anyways, it seems fine to leave as an extension.

Yeah agreed. And then in the future, we can keep optimizing it once we have kargs-via-Ignition and first-boot extensions.

@cgwalters
Member

I submitted a PR to add kexec-tools to rhcos extensions which I could link here if we merged openshift/os#413

@bgilbert
Contributor

If kdump is enough of a core FCOS/RHCOS feature that we're adding FCCT sugar for it, it seems pretty clear to me that we should make it part of the base OS. Is there any reason not to do so?

The reboot to add kdump kargs should probably be implemented using coreos/ignition#1051. It's still unsafe to reboot from a systemd unit during first boot after we've exited the initramfs.

@cgwalters
Member

If kdump is enough of a core FCOS/RHCOS feature that we're adding FCCT sugar for it,

Perhaps we just want sugar for kargs and package installs, then we wouldn't need special kdump sugar.

it seems pretty clear to me that we should make it part of the base OS. Is there any reason not to do so?

It drags in some new random dependencies (that are probably unnecessary actually) and it wouldn't be on by default, so if explicit action is required it might as well be to install it as well.

(To be clear I was leaning towards add by default originally, could still be convinced - at least kdump is something that applies universally across metal and cloud for example)

The reboot to add kdump kargs should probably be implemented using coreos/ignition#1051. It's still unsafe to reboot from a systemd unit during first boot after we've exited the initramfs.

Yep. For OpenShift though the MCO is already the single point of rebooting, so all that's needed there is to ship kexec-tools as an extension; we already support writing its config via Ignition and also support kernel arguments explicitly.
(But per discussion it would be good to drain that stuff into Ignition/FCOS)

@bgilbert
Contributor

It drags in some new random dependencies (that are probably unnecessary actually) and it wouldn't be on by default, so if explicit action is required it might as well be to install it as well.

Well, except that we're an image-based OS and we discourage package installs. It's not obvious to me that we should only ship things that are enabled by default. For example, we ship software to cloud platforms that only makes sense on bare metal, since we have a unified image for all use cases.

@cgwalters
Member

Well, except that we're an image-based OS and we discourage package installs.

I think everyone agrees with that but as stated that sounds like an argument against #401 entirely. IOW this discussion is more about the nuance of this specific case.

We can take this one perhaps to the next open discussion and do a vote?

@cgwalters cgwalters reopened this Oct 14, 2020
@kelvinfan001
Member

kelvinfan001 commented Oct 14, 2020

So the thing really blocking kdump from being enabled on FCOS is "It's still unsafe to reboot from a systemd unit during first boot after we've exited the initramfs.", right? If rebooting from a systemd unit after switch-root is not a problem, then we essentially just need to add a systemd unit through Ignition configs that performs the actions in #622 (comment) and reboot?

@bgilbert
Contributor

I think everyone agrees with that but as stated that sounds like an argument against #401 entirely.

Eh, not quite. People are obviously going to want to package layer, and it's good to avoid ~instantly breaking their systems when they do. But it's one thing to help folks who want to run FCOS for use cases at the margins of what we support, and it's another to push functionality that we actually support/recommend out into extensions. It's not clear to me that the latter is worth the usability and complexity tradeoffs.

@jlebon
Member

jlebon commented Oct 15, 2020

The reboot to add kdump kargs should probably be implemented using coreos/ignition#1051. It's still unsafe to reboot from a systemd unit during first boot after we've exited the initramfs.

This is technically true, but you could say that for anything which wants to reboot during the firstboot, even Zincati. While we can and should design better solutions for things which need a reboot (e.g. kargs-via-Ignition and "live" extensions), we can't address all use cases for which a user will want to reboot on first-boot.

We need a better solution here, which probably will involve some systemd changes (will look at filing an issue there; edit: found this issue and this PR), but meanwhile, I'd say doing After=multi-user.target and using your own stamp file instead of ConditionFirstBoot=true should work for now.
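
A rough sketch of the stamp-file approach, written as the script that such a unit (ordered After=multi-user.target) might run. The stamp path and function names are hypothetical, and it assumes kexec-tools is already present on the host. It is dry-run by default, so it only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Configure kdump once, guarded by a stamp file instead of
# ConditionFirstBoot=true, then reboot to apply the kernel argument.
setup_kdump_once() (
    stamp="${1:-/var/lib/kdump-setup.stamp}"    # hypothetical location
    crashkernel="${2:-256M}"
    if [ -e "$stamp" ]; then
        echo "already configured, nothing to do"
        exit 0
    fi
    run rpm-ostree kargs "--append=crashkernel=${crashkernel}" || exit 1
    run systemctl enable kdump.service || exit 1
    # Write the stamp before rebooting so the next boot skips this block.
    run touch "$stamp" || exit 1
    run systemctl reboot
)
```

Calling "setup_kdump_once" on every boot is then harmless: once the stamp exists, the function returns immediately without touching kargs or rebooting.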

@jlebon jlebon added the meeting topics for meetings label Oct 19, 2020
@jlebon
Member

jlebon commented Oct 21, 2020

This was discussed in the meeting today:

#agreed we will include kexec-tools in FCOS to make crash collection built in. we will add documentation
        and possibly FCCT sugar to standardize how it's configured. we will investigate dependencies 
        that it pulls in.

@jlebon jlebon removed the meeting topics for meetings label Oct 21, 2020
@jlebon
Member

jlebon commented Oct 21, 2020

Answering the questions from #641 (comment):

Can the package be run from a container?

It probably could run in a container, though it's not designed to. So there would be some maintenance burden involved.

Can the tool be helpful in debugging container runtime issues?

It could be useful if the bug is in the kernel.

Can the tool be helpful in debugging networking issues?

It could be useful if the bug is in the kernel.

Is it possible to layer the package onto the base OS as a day 2 operation?

It is, and it works fine as a layered package. Though it increases friction for something that we want to consider part of the "host API". E.g. if we want a kdump FCC sugar, it's hard to hide that reboot from the user as an implementation detail. Another issue is that it introduces a larger gap before kdump is active. (And ironically, once we support early rebooting from the initrd for kargs, it would result in two reboots total.)

Does the package have additional dependencies? (i.e. does it drag in Python, Perl, etc)

# rpm-ostree install kexec-tools
...
Added:
  dracut-squash-050-61.git20200529.fc32.x86_64
  ethtool-2:5.9-1.fc32.x86_64
  kexec-tools-2.0.20-17.fc32.x86_64
  snappy-1.1.8-2.fc32.x86_64
  squashfs-tools-4.3-25.fc32.x86_64

What is the size of the package and its dependencies?

# rpm -q dracut-squash ethtool kexec-tools snappy squashfs-tools --qf '%{SIZE}\t%{NAME}\n'
2806    dracut-squash
632932  ethtool
1263239 kexec-tools
64687   snappy
419606  squashfs-tools

So total 2.3M, though this includes docs, which we don't ship.
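
The total can be reproduced by summing the first column of the rpm query output. A small sketch, with sum_sizes as a hypothetical helper name:

```shell
#!/bin/sh
# Sum the first column (installed size in bytes) of "SIZE NAME" lines.
sum_sizes() {
    awk '{ total += $1 } END { printf "%d\n", total }'
}

printf '%s\n' \
    "2806 dracut-squash" \
    "632932 ethtool" \
    "1263239 kexec-tools" \
    "64687 snappy" \
    "419606 squashfs-tools" | sum_sizes
# prints: 2383270
```

That is 2383270 bytes, or about 2.27 MiB, which rounds to the 2.3M quoted above. On a live system the rpm -q command above could be piped straight into sum_sizes.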

Can the packaging be adjusted to just deliver binaries?

Not applicable. A core desirable part of the package is the systemd service.

What is the intended use case of the package?

Being able to collect crashed kernel cores for analysis.

Can the tool be used to do things we’d rather users not be able to do in FCOS? (E.g. can it be abused as a Turing complete interpreter?)

I'm not very familiar with kdump, though it seems focused in functionality. It does add /sbin/kexec, which can be used directly to boot a new kernel, which we likely don't want to support in FCOS (outside the context of kdump itself of course). The added ethtool dep also has some overlap with NetworkManager.

Does the tool have a history of CVEs?

Scanned Bodhi as well as the Fedora and RHEL RPM changelog for CVEs and didn't find any.

jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue Oct 23, 2020
jlebon added a commit to coreos/fedora-coreos-config that referenced this issue Oct 26, 2020
jlebon added a commit to jlebon/os that referenced this issue Oct 27, 2020
jlebon added a commit to jlebon/os that referenced this issue Oct 27, 2020
@travier
Member

travier commented Dec 7, 2020

Given that:

I think we can close this issue when the docs PR gets merged.

I will open an FCCT specific issue to track kdump sugar support. Edit: coreos/butane#175

@kelvinfan001
Member

Makes sense to me!

@kelvinfan001
Member

coreos/fedora-coreos-docs#198 is now merged.

kelvinfan001 pushed a commit to kelvinfan001/fedora-coreos-config that referenced this issue Dec 14, 2020