Introduce vmcore creation notification to kdump #20

Merged 1 commit on Sep 29, 2024

Commits on Sep 5, 2024

  1. Introduce vmcore creation notification to kdump

    Motivation
    ==========
    
    People may forget to recheck that kdump still works, so there is a
    possibility that no vmcore is generated after a real system crash. This is
    not what users expect from kdump.
    
    It is highly recommended to recheck kdump after any system
    modification, such as:
    
    a. after kernel patching or a full yum update, as it might break something
       on which kdump depends, for example by introducing a new bug.
    b. after any change at the hardware level, such as storage, networking, or
       firmware upgrades.
    c. after deploying any new application, such as one that involves 3rd party
       modules.
    
    Though these are beyond the scope of kdump, a simple vmcore creation
    status notification is good to have for now.
    
    Design
    ======
    
    On (re)start, kdump currently checks whether any related files,
    filesystems, or drivers have been modified before determining whether the
    initrd should be rebuilt. A rebuild is an indicator of such a modification,
    and kdump needs to be rechecked. A rebuild will therefore clear the vmcore
    creation status recorded in $VMCORE_CREATION_STATUS.
    
    The vmcore creation check happens at "kdumpctl (re)start/status" and
    reports the creation success/fail status to users. A "success" status
    indicates that a vmcore has previously been generated successfully in the
    current env, so it is more likely that a vmcore will be generated when a
    real crash happens. A "fail" status indicates that no vmcore has been
    generated, or that a vmcore creation has failed, in the current env. Users
    should check the 2nd kernel log or kexec-dmesg.log for the reason for the
    failure.
    
    $VMCORE_CREATION_STATUS is used to record the vmcore creation status of
    the current env. The format looks like:
    
       success 1718682002
    
    This means that a vmcore was generated successfully at this
    timestamp for the current env.
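
The status format above can be illustrated with a minimal sketch. This is not the code from this patch: the function names and the wording of the failure notice are hypothetical, the default path is the one mentioned in the changelog below, and `date -d "@<seconds>"` assumes GNU date.

```shell
#!/bin/bash
# Hypothetical helpers for illustration only; not taken from kdumpctl.

# Default path as given in the changelog; can be overridden by the caller.
VMCORE_CREATION_STATUS="${VMCORE_CREATION_STATUS:-/var/crash/vmcore-creation.status}"

# Write "<status> <epoch-seconds>" for the current environment,
# e.g. "success 1718682002".
record_vmcore_creation_status() {
    printf '%s %s\n' "$1" "$(date +%s)" > "$VMCORE_CREATION_STATUS"
}

# Read the status file and print a notice for the user.
report_vmcore_creation_status() {
    local status timestamp

    # A missing or empty file means no test has been run in this env.
    if [ ! -s "$VMCORE_CREATION_STATUS" ]; then
        echo "Notice: No vmcore creation test performed!"
        return
    fi

    read -r status timestamp < "$VMCORE_CREATION_STATUS"
    case "$status" in
    success)
        echo "Notice: Last successful vmcore creation on $(date -d "@$timestamp")"
        ;;
    *)
        # Assumed wording; the patch may print something different.
        echo "Notice: Last vmcore creation attempt failed on $(date -d "@$timestamp")"
        ;;
    esac
}
```

Keeping the file to a single "status timestamp" line means it can be parsed with one `read`, and removing the file is enough to return to the "no test performed" state after an initrd rebuild.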
    
    Usage
    =====
    
    [root@localhost ~]# kdumpctl restart
    kdump: kexec: unloaded kdump kernel
    kdump: Stopping kdump: [OK]
    kdump: kexec: loaded kdump kernel
    kdump: Starting kdump: [OK]
    kdump: Notice: No vmcore creation test performed!
    
    [root@localhost ~]# kdumpctl test
    
    [root@localhost ~]# kdumpctl status
    kdump: Kdump is operational
    kdump: Notice: Last successful vmcore creation on Tue Jun 18 16:39:10 CST 2024
    
    [root@localhost ~]# kdumpctl restart
    kdump: kexec: unloaded kdump kernel
    kdump: Stopping kdump: [OK]
    kdump: kexec: loaded kdump kernel
    kdump: Starting kdump: [OK]
    kdump: Notice: Last successful vmcore creation on Tue Jun 18 16:39:10 CST 2024
    
    The notification for kdumpctl (re)start/status can be disabled by
    setting VMCORE_CREATION_NOTIFICATION in /etc/sysconfig/kdump.
    
    ===
    
    v3 -> v2:
    Always mount the device of
    $VMCORE_CREATION_STATUS (/var/crash/vmcore-creation.status) for the
    2nd kernel, in case /var is on a separate device from the rootfs.
    
    v4 -> v3:
    Add "kdumpctl test" as the entry point for performing the kdump test.
    
    v5 -> v4:
    Fix the mounting failure issue in fadump.
    
    v6 -> v5:
    Add a new argument to add_mount/to_mount for a customized mount point.
    
    v7 -> v6:
    a. Code refactoring based on Philipp's suggestion.
    b. Only mount the device of
       $VMCORE_CREATION_STATUS (/var/crash/vmcore-creation.status) when needed.
    c. Add a "--force" option for "kdumpctl test", to support automated test
       scripts that QE may run.
    d. Add a check in "kdumpctl test" that $VMCORE_CREATION_STATUS can only be
       on a local drive.
    
    v8 -> v7:
    a. Rebased the patch on top of upstream commit e2b8463.
    b. Code refactoring based on Philipp's suggestion.
    c. Updated the "test" entry of kdumpctl.8.
    
    Signed-off-by: Tao Liu <ltao@redhat.com>
    liutgnu committed Sep 5, 2024
    6f6a48f