Introduce vmcore creation notification to kdump #20

Open
wants to merge 1 commit into base: main
Conversation

liutgnu
Collaborator

@liutgnu liutgnu commented Jul 2, 2024

Introduce vmcore creation notification to kdump

Motivation
==========

People may forget to recheck that kdump works, and as a result no vmcore
may be generated after a real system crash. That is unexpected for kdump.

It is highly recommended that people recheck kdump after any system
modification, such as:

a. after kernel patching or a whole yum update, as it might break something
   kdump depends on, e.g. through a newly introduced bug;
b. after any change at the hardware level, e.g. storage, networking or
   firmware upgrades;
c. after deploying any new application, e.g. one that involves 3rd-party
   modules.

Though these cases exceed the scope of kdump itself, a simple vmcore
creation status notification is good to have for now.

Design
======

On (re)start, kdump currently checks whether any related files/fs/drivers
have been modified in order to determine whether the initrd should be
rebuilt. A rebuild is an indicator of such a modification, meaning kdump
needs to be rechecked; this clears the vmcore creation status recorded in
$VMCORE_CREATION_STATUS.

The vmcore creation check happens at "kdumpctl (re)start/status" and
reports the creation success/fail status to users. A "success" status
indicates that a vmcore has previously been generated successfully in the
current env, so it is likely that a vmcore will also be generated when a
real crash happens. A "fail" status indicates that no vmcore has been
generated, or that a vmcore creation failed, in the current env; the user
should check the 2nd kernel log or kexec-dmesg.log for the reason of the
failure.

$VMCORE_CREATION_STATUS records the vmcore creation status of the current
env. The format looks like:

   success 1718682002

which means a vmcore was generated successfully at this timestamp in the
current env.
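
As a rough illustration only (this is not the patch code), a status line
with that format can be produced and consumed in shell like this:

   # record a successful creation with the current epoch time ...
   echo "success $(date +%s)" > "$VMCORE_CREATION_STATUS"
   # ... and read it back, converting the epoch to a human-readable date
   read -r _status _timestamp < "$VMCORE_CREATION_STATUS"
   date -d "@$_timestamp"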

Usage
=====

[root@localhost ~]# kdumpctl restart
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: kexec: loaded kdump kernel
kdump: Starting kdump: [OK]
kdump: Notice: No vmcore creation test performed!

[root@localhost ~]# kdumpctl test

[root@localhost ~]# kdumpctl status
kdump: Kdump is operational
kdump: Notice: Last successful vmcore creation on Tue Jun 18 16:39:10 CST 2024

[root@localhost ~]# kdumpctl restart
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: kexec: loaded kdump kernel
kdump: Starting kdump: [OK]
kdump: Notice: Last successful vmcore creation on Tue Jun 18 16:39:10 CST 2024

The notification for kdumpctl (re)start/status can be disabled via the
VMCORE_CREATION_NOTIFICATION setting in /etc/sysconfig/kdump.
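
For illustration, the setting could look like the following (the accepted
values were still under review, so treat the exact semantics as an
assumption rather than the final interface):

   # /etc/sysconfig/kdump
   # in the version posted here, a non-empty value enables the notification;
   # leaving it empty/unset disables it
   VMCORE_CREATION_NOTIFICATION=yes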

===

v3 -> v2:
Always mount $VMCORE_CREATION_STATUS(/var/crash/vmcore-creation.status)'s
device for the 2nd kernel, in case /var is a separate device from the
rootfs device.

v4 -> v3:
Add "kdumpctl test" as the entrance for performing the kdump test.

v5 -> v4:
Fix the mounting failure issue in fadump.

v6 -> v5:
Add new argument as customized mount point for add_mount/to_mount.

v7 -> v6:
a. Code refactoring based on Philipp's suggestion.
b. Only mount $VMCORE_CREATION_STATUS(/var/crash/vmcore-creation.status)'s
   device when needed.
c. Add "--force" option for "kdumpctl test", to support the automation test
   script QE may perform.
d. Add check in "kdumpctl test" that $VMCORE_CREATION_STATUS can only be on
   local drive.

Signed-off-by: Tao Liu <ltao@redhat.com>

@daveyoung
Contributor

daveyoung commented Jul 2, 2024

Hi Tao, it is weird that the email notification still keeps using the old success/fail description. Though this is v3, there is no way to reply inline, so let me put a comment on the web UI here:
"kdump: Vmcore creation notice: No vmcore creation test performed!"
For the above notice, it looks redundant to mention "vmcore creation" twice. I suggest just "Notice: no vmcore creation test performed!" or "Warning: no vmcore creation test performed".

And "notice: Last successful vmcore creation on Tue Jun ..." or "Last successful vmcore creation on Tue Jun ..."

Also, how about changing "vmcore creation" to "kdump test", ditto for the status file name? It sounds more straightforward and easier to understand.

@daveyoung
Contributor

Hi Tao, I also noticed that there is a special case for dump_to_rootfs in mkdumprd. So doing some testing and verification with fadump is also necessary.

@liutgnu
Collaborator Author

liutgnu commented Jul 2, 2024

Hi Tao, it is weird that the email notification still keeps using old description success/fail sentences. though it is V3 so no way to reply inline. Let me put a comment on web ui here: "kdump: Vmcore creation notice: No vmcore creation test performed!" For above notice, it looks redundant with two-time mentioning of "vmcore creation". I suggest just "notice: no vmcore creation test preformed!" Or "Warning: no vmcore creation test performed"

And "notice: Last successful vmcore creation on Tue Jun ..." or "Last successful vmcore creation on Tue Jun ..."

Also how about change "vmcore creation" to "kdump test", ditto about the status file name. It sounds more straightforward and easier to understand.

Hi Dave,

Agreed, just "Notice: Last successful vmcore ..." is enough; will get it improved in v4.

@liutgnu
Collaborator Author

liutgnu commented Jul 2, 2024

Hi Tao, otherwise I noticed that there is a special case for dump_to_rootfs in mkdumprd. So doing some testing and verification with fadump should be also necessary.

Yes, will do it. Currently there is an OOM issue in kexec-tools which has blocked my fadump testing, see RHEL-35516.

@coiby
Member

coiby commented Jul 2, 2024

Hi @liutgnu,

I'm curious why you prefer creating a new PR for each new version. From the perspective of a maintainer, one benefit of pushing a new version to the same PR is that I can compare the difference between two versions with one click, i.e. by clicking the "Compare" hyperlink:
Screenshot_2024-07-02_15-54-56

@liutgnu
Collaborator Author

liutgnu commented Jul 2, 2024

Hi @liutgnu,

I'm curious to ask why you prefer creating a new PR for a new version? For the perspective of maintainer, one benefit of pushing new version to the same PR is I can compare the difference between two versions with one-lick i.e. by clicking the "Compare" hyperlink,

Please tell me which way you prefer. On one hand, a "force push" is dangerous and should be avoided, so I guess a separate MR for each patch version (pushed from a different branch) is the safer option. On the other hand, "force push" against a single MR as you described may have the following issue:

1: patch v1 (MR created)
|
V
2: force push1 (to improve v1)
|
V
3: force push2 (to improve v1 again, now the v1 is OK for review)
|
V
4: force push3 (Received comments, drafting v2)
|
V
5: force push4 (to improve v2, now v2 is ready for review)
|
V
....

You see, there might be plenty of "force pushes". I guess what you want to view is the diff of 3 & 5. I don't know if you can select the commit hashes for diff comparison; what I know is that it can only show the diff of adjoining commits, please correct me if I'm wrong.

@coiby
Member

coiby commented Jul 3, 2024

Thanks for raising the concerns! If you are worried that a force-push may overwrite the history of this repo, then you needn't worry about it, because this repo doesn't allow force-pushes, which is also the default setting.

As for the case of comparing versions 3 & 5, you can simply send the new version of the patches in one single force-push to avoid it. Btw, you can also compare different commits directly, e.g. compare your v3 with v2.

So I prefer sending new versions of the patches to the same PR. Another benefit is that we can keep the same context in the same PR.

@liutgnu liutgnu force-pushed the kdump-test-v3 branch 2 times, most recently from 11ef6af to 2cbf93d on July 11, 2024 05:53
@liutgnu
Collaborator Author

liutgnu commented Jul 11, 2024

The v4 has been force-pushed to this PR; it mainly introduces "kdumpctl test" as the entry point for performing the kdump test. @daveyoung @baoquan-he

In addition, in v3 Dave mentioned combining /kdumproot and /sysroot in kdump-lib.sh:get_kdump_mntpoint_from_target(), in order to simplify the dir mounting for saving $VMCORE_CREATION_STATUS. However, I ran an experiment that showed negative results, see the following:

If we unify the two, and kdump dumps through nfs with failure dump to rootfs, the dracut mount arguments will look like:

Arguments: --add 'kdumpbase' --quiet --hostonly --hostonly-cmdline --hostonly-i18n --hostonly-mode 'strict' --hostonly-nics '' -o 'plymouth resume ifcfg earlykdump' --mount '192.168.1.65:/tmp/nfs /sysroot nfs defaults' --squash-compressor 'zstd' --mount '/dev/mapper/rhel_ibm--p8--kvm--03--guest--02-root /sysroot xfs rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota'

The original mount arguments look like:

Arguments: --add 'kdumpbase' --quiet --hostonly --hostonly-cmdline --hostonly-i18n --hostonly-mode 'strict' --hostonly-nics '' -o 'plymouth resume ifcfg earlykdump' --mount '192.168.1.65:/tmp/nfs /kdumproot nfs defaults' --squash-compressor 'zstd' --mount '/dev/mapper/rhel_ibm--p8--kvm--03--guest--02-root /sysroot xfs rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota'

When the nfs mount fails, kdump tries to mount sysroot via "systemctl start --no-block sysroot.mount", which mounts nfs again instead of the block device. So the mount fails again, and dump_to_rootfs fails, which is unexpected.

see the 2nd kernel log:

[ 130.266759] mount[563]: mount.nfs: Connection refused
[ 130.268887] systemd[1]: sysroot.mount: Mount process exited, code=exited, status=32/n/a
[FAILED] Failed to mount /sysroot.
See 'systemctl status sysroot.mount' for details.
[ 130.272324] systemd[1]: Failed to mount /sysroot.
...
[ 256.480662] kdump[700]: failed to dump to '/sysroot', it's not a mount point!
[FAILED] Failed to start Kdump Vmcore Save Service.
See 'systemctl status kdump-capture.service' for details.
...
[ 257.293836] systemd[1]: sysroot.mount: Directory /sysroot to mount over is not empty, mounting anyway.
[ 257.296748] systemd[1]: Mounting /sysroot...
[ 257.304670] mount[776]: mount.nfs: rpc.statd is not running but is required for remote locking.
...
[ 347.314093] kdump[981]: Failed to mount rootfs
[ 347.316864] kdump[983]: Executing final action systemctl reboot -f

@daveyoung
Contributor

The v4 has been force pushed to this PR, which mainly introduced the "kdumpctl test" as the entrance for performing kdump test. @daveyoung @baoquan-he

In addition, in v3 Dave mentioned the combination of /kdumproot and /sysroot in kdump-lib.sh:get_kdump_mntpoint_from_target(), in order to simplify the dir mounting for saving $VMCORE_CREATION_STATUS. However I made an experiment showing the negative results, see the following:

If we unify the 2, and kdump through nfs and failure dump to rootfs, the mounting status will be like:

Arguments: --add 'kdumpbase' --quiet --hostonly --hostonly-cmdline --hostonly-i18n --hostonly-mode 'strict' --hostonly-nics '' -o 'plymouth resume ifcfg earlykdump' --mount '192.168.1.65:/tmp/nfs /sysroot nfs defaults' --squash-compressor 'zstd' --mount '/dev/mapper/rhel_ibm--p8--kvm--03--guest--02-root /sysroot xfs rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota'

The original mounting status will be like:

Arguments: --add 'kdumpbase' --quiet --hostonly --hostonly-cmdline --hostonly-i18n --hostonly-mode 'strict' --hostonly-nics '' -o 'plymouth resume ifcfg earlykdump' --mount '192.168.1.65:/tmp/nfs /kdumproot nfs defaults' --squash-compressor 'zstd' --mount '/dev/mapper/rhel_ibm--p8--kvm--03--guest--02-root /sysroot xfs rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota'

When nfs mount failed, it will try to mount sysroot as "systemctl start --no-block sysroot.mount", it will mount nfs again, instead of mounting the block device. So the mount will fail again, and dump_to_rootfs fails, which is unexpected.

see the 2nd kernel log:

[ 130.266759] mount[563]: mount.nfs: Connection refused [ 130.268887] systemd[1]: sysroot.mount: Mount process exited, code=exited, status=32/n/a [FAILED] Failed to mount /sysroot. See 'systemctl status sysroot.mount' for details. [ 130.272324] systemd[1]: Failed to mount /sysroot. ... [ 256.480662] kdump[700]: failed to dump to '/sysroot', it's not a mount point! [FAILED] Failed to start Kdump Vmcore Save Service. See 'systemctl status kdump-capture.service' for details. ... [ 257.293836] systemd[1]: sysroot.mount: Directory /sysroot to mount over is not empty, mounting anyway. [ 257.296748] systemd[1]: Mounting /sysroot... [ 257.304670] mount[776]: mount.nfs: rpc.statd is not running but is required for remote locking. ... [ 347.314093] kdump[981]: Failed to mount rootfs [ 347.316864] kdump[983]: Executing final action systemctl reboot -f

Hi Tao, I finally found your reply here. Thanks for trying the idea. It is fair enough if you cannot use one mount point. Anyway, I still have some questions about the mounting; I added another comment in your new version.

@liutgnu
Collaborator Author

liutgnu commented Jul 15, 2024

Hi Dave, I just force-pushed the v5 a few hours ago. I tested the fadump case and mainly fixed its mounting errors. Please check out the added code comments in mkdumprd.

@daveyoung daveyoung changed the title from "[Patch v3] Introduce vmcore creation notification to kdump" to "Introduce vmcore creation notification to kdump" on Jul 15, 2024
Contributor

@daveyoung daveyoung left a comment

Hi Tao, let me put a new comment about add_mount.

Can you do the below?

Add a new argument $4 for the mount point, e.g. add_mount $1 $2 $3 $4.
In the add_mount() function you can use the mount point specified by the argument; if $4 is empty, then use the kdump mount point generated by the existing helper.

In this way you do not need special handling for the notifier argument; you only specify the notifier mount point when you call add_mount.
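
A minimal sketch of that suggestion (not the merged code; the meaning of $2/$3 and the exact to_mount call are assumptions based on the snippets quoted in this thread):

add_mount()
{
    local _target=$1 _fstype=$2 _options=$3 _mntpoint=$4

    # use the caller-supplied mount point if given, otherwise derive it
    # from the target as before
    _mntpoint=${_mntpoint:-$(get_kdump_mntpoint_from_target "$_target")}
    to_mount "$_target" "$_mntpoint" "$_fstype" "$_options"
}

# the status-file mount can then simply pass its own mount point, e.g.:
# add_mount "$status_target" "" "" "vmcore_status_target"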

mkdumprd Outdated
@@ -407,6 +413,9 @@ if ! is_fadump_capable; then
# so it doesn't affect the logic of check_dump_fs_modified().
is_dump_to_rootfs && add_mount "$(to_dev_name "$(get_root_fs_device)")"

status_target=$(get_target_from_path $(dirname "$VMCORE_CREATION_STATUS"))
add_mount "$status_target" "" "" "vmcore_status_target"

Contributor

@daveyoung daveyoung Jul 2, 2024

It would be better to only add the status target mount when we have not added it already, e.g. skip it in case it is exactly the dump target block device or the rootfs (with dump_to_rootfs set).
This can be done like below:
Add a check in add_mount: if the dev is the vmcore status target, then set a variable vmcore_status_mount_added=true. Later, before calling dracut, check it and call add_mount only when vmcore_status_mount_added is not true.
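
A small sketch of that guard, using the variable name from the comment above (illustrative only, not the code that was merged):

# inside add_mount(): remember when the status target has already been added
[[ $1 == "$status_target" ]] && vmcore_status_mount_added=true

# later, just before invoking dracut: add the extra mount only if still needed
if [[ $vmcore_status_mount_added != true ]]; then
    add_mount "$status_target" "" "" "vmcore_status_target"
fi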

Collaborator Author

Yes agreed, it would be better, I will get it improved in v6. Thanks!

Collaborator Author

Hi Dave, I force-pushed v6 to this MR to fix the above issue, please check it out. Thanks!

liutgnu referenced this pull request Jul 15, 2024
Motivation
==========

People may forget to recheck that kdump works, and as a result no vmcore
may be generated after a real system crash. That is unexpected for kdump.

It is highly recommended that people recheck kdump after any system
modification, such as:

a. after kernel patching or a whole yum update, as it might break something
   kdump depends on, e.g. through a newly introduced bug;
b. after any change at the hardware level, e.g. storage, networking or
   firmware upgrades;
c. after deploying any new application, e.g. one that involves 3rd-party
   modules.

Though these cases exceed the scope of kdump itself, a simple vmcore
creation status notification is good to have for now.

Design
======

On (re)start, kdump currently checks whether any related files/fs/drivers
have been modified in order to determine whether the initrd should be
rebuilt. A rebuild is an indicator of such a modification, meaning kdump
needs to be rechecked; this clears the vmcore creation status recorded in
$VMCORE_CREATION_STATUS.

The vmcore creation check happens at "kdumpctl (re)start/status" and
reports the creation success/fail status to users. A "success" status
indicates that a vmcore has previously been generated successfully in the
current env, so it is likely that a vmcore will also be generated when a
real crash happens. A "fail" status indicates that no vmcore has been
generated, or that a vmcore creation failed, in the current env; the user
should check the 2nd kernel log or kexec-dmesg.log for the reason of the
failure.

$VMCORE_CREATION_STATUS records the vmcore creation status of the current
env. The format looks like:

   success 1718682002

which means a vmcore was generated successfully at this timestamp in the
current env.

There is a possibility that $VMCORE_CREATION_STATUS is located on a
separate device rather than the rootfs device, so in this patch we always
mount $VMCORE_CREATION_STATUS's device for the 2nd kernel. It doesn't
matter if $VMCORE_CREATION_STATUS's device is the same as the rootfs
device; in that case the same device is simply mounted twice, on 2
different mount points.

Usage
=====

[root@localhost ~]# kdumpctl restart
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: kexec: loaded kdump kernel
kdump: Starting kdump: [OK]
kdump: Notice: No vmcore creation test performed!

[root@localhost ~]# kdumpctl test

[root@localhost ~]# kdumpctl status
kdump: Kdump is operational
kdump: Notice: Last successful vmcore creation on Tue Jun 18 16:39:10 CST 2024

[root@localhost ~]# kdumpctl restart
kdump: kexec: unloaded kdump kernel
kdump: Stopping kdump: [OK]
kdump: kexec: loaded kdump kernel
kdump: Starting kdump: [OK]
kdump: Notice: Last successful vmcore creation on Tue Jun 18 16:39:10 CST 2024

The notification for kdumpctl (re)start/status can be disabled via the
VMCORE_CREATION_NOTIFICATION setting in /etc/sysconfig/kdump.

===

v3 -> v2:
Always mount $VMCORE_CREATION_STATUS(/var/crash/vmcore-creation.status)'s
device for the 2nd kernel, in case /var is a separate device from the
rootfs device.

v4 -> v3:
Add "kdumpctl test" as the entrance for performing the kdump test.

v5 -> v4:
Fix the mounting failure issue in fadump.

Signed-off-by: Tao Liu <ltao@redhat.com>
Collaborator

@prudo1 prudo1 left a comment

Hi Tao,
here are some comments from my first round. I still need to think about a few things, e.g. I don't know if it really makes sense to store the status file on the rootfs. My first thought was that it would be better to store it on the target, next to the dumps instead. IMHO, that would make tracking the status easier, when you have a big fleet dumping to a remote host, e.g. via ssh. But I'm not entirely sure if we can assume that the target is always mounted/accessible. Especially when dumping to nfs.

{
_status=$1
_status_file=$2
_dir=${_status_file%%${_status_file##*/}}
Collaborator

This is bash specific and cannot be used in kdump-lib-initramfs. Best is to use _dir=$(dirname "$_status_file") instead.
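
For illustration, both forms strip the file name from the path; they differ only in a trailing slash (and, as noted in the reply below, dirname has to be installed into the initramfs):

_status_file=/var/crash/vmcore-creation.status
_dir=${_status_file%%${_status_file##*/}}   # -> "/var/crash/"  (form used in the hunk above)
_dir=$(dirname "$_status_file")             # -> "/var/crash"   (suggested replacement)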

Collaborator Author

OK agreed, we need to install dirname cmd into initramfs first though...

Collaborator

right... but I believe it's worth it...

Collaborator Author

OK agreed, will add it in next patch.

kdumpctl Outdated
Comment on lines 1664 to 1666
if [[ -z "$VMCORE_CREATION_NOTIFICATION" ]]; then
return
fi
Collaborator

@prudo1 prudo1 Jul 16, 2024

I really don't like this. In kdump.conf we already have two different ways to turn things on/off (yes/no and 1/0). Having a third one really doesn't help. Especially as turning it off by passing an empty string is extremely nonintuitive in my opinion as even when a user explicitly sets it to 'no' or 'false' it will be on. Personally I would simply check whether it's explicitly set to 'yes' and exit otherwise. I.e.

[[ ${VMCORE_CREATION_NOTIFICATION,,} == yes ]] || return 0

where ${,,} converts the string to lower case, so 'Yes', 'YES', etc. will also work.
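
A quick illustration of the behaviour (hypothetical values, just to show the ${,,} expansion):

VMCORE_CREATION_NOTIFICATION=YES
[[ ${VMCORE_CREATION_NOTIFICATION,,} == yes ]] && echo enabled    # prints "enabled"
VMCORE_CREATION_NOTIFICATION=no
[[ ${VMCORE_CREATION_NOTIFICATION,,} == yes ]] || echo disabled   # prints "disabled"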

Collaborator Author

OK agreed, will get it improved in v7.

Collaborator

Thanks

kdumpctl Outdated
_record=$(cat "$VMCORE_CREATION_STATUS")
_status=$(echo $_record | awk '{print $1}')
_timestamp=$(echo $_record | awk '{print $2}')
_status_date=$(echo "@"$_timestamp | xargs date -d 2> /dev/null)

Collaborator

This can be simplified to

read -r _status _date < "$VMCORE_CREATION_STATUS"
_date="$(date -d "@$_date")"
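
Putting the suggestion together, the reporting path could look roughly like this (a sketch only, assuming the file exists and uses the "status epoch" format from the cover letter; the message wording is illustrative):

if [[ -f "$VMCORE_CREATION_STATUS" ]]; then
    read -r _status _date < "$VMCORE_CREATION_STATUS"
    _date=$(date -d "@$_date")
    echo "vmcore creation status: $_status ($_date)"
fi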

Collaborator Author

Thanks, your suggestion is much better!

kdumpctl Outdated
Comment on lines 1762 to 1763
is_kernel_loaded "$DEFAULT_DUMP_MODE"
if [[ "$?" -ne 0 ]]; then
Collaborator

@prudo1 prudo1 Jul 16, 2024

You can check the return value directly via

if ! is_kernel_loaded "$DEFAULT_DUMP_MODE"; then

Collaborator Author

Agreed

kdumpctl Outdated
test)
is_kernel_loaded "$DEFAULT_DUMP_MODE"
if [[ "$?" -ne 0 ]]; then
derror "Kdump should be operational before test."
Collaborator

s/should/needs to/

Collaborator Author

Agreed

kdumpctl Outdated
Comment on lines 1761 to 1778
test)
is_kernel_loaded "$DEFAULT_DUMP_MODE"
if [[ "$?" -ne 0 ]]; then
derror "Kdump should be operational before test."
exit 1
fi
read -p "DANGER!!! Will perform a kdump test by crashing the system, proceed? (Y/N): " input
case $input in
[Yy] )
dinfo "Start kdump test..."
;;
* )
dinfo "Operation cancelled."
exit 0
;;
esac
set_vmcore_creation_status 'clear' "$VMCORE_CREATION_STATUS"
echo c > /proc/sysrq-trigger
Collaborator

I'd prefer if you moved this into a separate function rather than having it all in the switch case.

Collaborator Author

OK agreed, I had the same thought when coding...

Collaborator

Thanks

kdumpctl Outdated
derror "Kdump should be operational before test."
exit 1
fi
read -p "DANGER!!! Will perform a kdump test by crashing the system, proceed? (Y/N): " input
Collaborator

Two things:

  1. I prefer giving the different options as lower case letters and indicate the default as upper case letter. I.e. 's;Y/N;y/N;'
  2. I understand that you want to give users a chance to abort in case they have second thoughts. But that only makes sense when users trigger the test manually. When they use it in a test script this will cause issues. Does it make sense to add a --force option to disable this question?
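
Combining the review suggestions above (move the logic into a separate function, default the prompt to "N", and let a --force flag skip the question), a rough sketch could look like this; it is not the merged code, and the option handling is simplified:

do_kdump_test()
{
    local _force=$1

    if ! is_kernel_loaded "$DEFAULT_DUMP_MODE"; then
        derror "Kdump needs to be operational before test."
        exit 1
    fi

    if [[ $_force != "--force" ]]; then
        read -r -p "DANGER!!! Will perform a kdump test by crashing the system, proceed? (y/N): " input
        if [[ ${input,,} != y ]]; then
            dinfo "Operation cancelled."
            exit 0
        fi
    fi

    set_vmcore_creation_status 'clear' "$VMCORE_CREATION_STATUS"
    echo c > /proc/sysrq-trigger
}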

Collaborator Author

Yes, I agree with your "--force" option, it is better!

mkdumprd Outdated
Comment on lines 66 to 68
if [[ -z "$_new_mntpoint" ]]; then
_new_mntpoint=$(get_kdump_mntpoint_from_target "$_target")
fi
Collaborator

_new_mntpoint="${_new_mntpoint:-$(get_kdump_mntpoint_from_target "$_target")}"

is identical but shorter and better matches what is done for the other options right below.

Collaborator Author

Agreed

mkdumprd Outdated
if is_fadump_capable && \
[[ $(get_root_fs_device) == "$status_target" ]] && \
[[ $(cat /proc/cmdline) =~ $ro_regex ]]; then
add_mount "$status_target" "" "ro " "$new_mntpoint"
Collaborator

Does adding the "ro" here have any impact? In to_mount the ro should be converted back to rw anyway.

Collaborator Author

No it won't. You see it is "ro ", not "ro"; there is one extra space within, so it won't be converted to "rw" later in to_mount(). Personally I don't like the hack here, but I didn't think of a better approach...

Collaborator

Arghh, ok I missed the extra space... That is really hacky...

Can't we use a different approach instead? If I understand the comment above correctly, this hack is only needed when we dump to the rootfs. So we could add the new mount point only when it's not a dump to rootfs, and then check in dracut-kdump.sh whether the new mount point exists or not. If it exists we write the status there; otherwise it must be a dump to rootfs and we can simply add the status to the target.
This should work, shouldn't it?
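
A sketch of that idea on the dracut-kdump.sh side (helper availability and the path names here are assumptions, including that findmnt is present in the initramfs and that the dedicated mount point is /vmcore_status_target as in the hunk earlier in this thread):

_status_line="success $(date +%s)"
if findmnt /vmcore_status_target >/dev/null 2>&1; then
    # the dedicated mount point was added by mkdumprd -> not a dump to rootfs
    echo "$_status_line" > /vmcore_status_target/vmcore-creation.status
else
    # no dedicated mount point -> dump to rootfs, write it below /sysroot
    echo "$_status_line" > /sysroot/var/crash/vmcore-creation.status
fi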

Collaborator Author

OK, thanks for the suggestion, I will draft the code and see if it works as expected.

@liutgnu
Collaborator Author

liutgnu commented Jul 16, 2024

Hi Philipp,

Thanks a lot for your careful review!

Hi Tao, here are some comments from my first round. I still need to think about a few things, e.g. I don't know if it really makes sense to store the status file on the rootfs. My first thought was that it would be better to store it on the target, next to the dumps instead. IMHO, that would make tracking the status easier, when you have a big fleet dumping to a remote host, e.g. via ssh. But I'm not entirely sure if we can assume that the target is always mounted/accessible. Especially when dumping to nfs.

Yes, saving the status file onto the target was my first option, but I gave it up for 2 reasons: 1) It is difficult to identify 'who created the vmcore/status file'. Say we have a nfs/ssh server, and several machines dump their vmcore/status files to it. Since we need to verify whether the kdump test succeeded for one specific machine, we need to identify that the vmcore/status file was created by it, not by others. Frankly there is no stable way to do that; the ip+date string isn't a good method, e.g. in the following case: physical machine_a hosts vm1 and vm2 behind its NAT network, physical machine_b hosts vm3 and vm4 behind its NAT network, and physical machine_c is for vmcore hosting. So vm1 and vm3 can have the same ip address, e.g. 10.0.2.15. How can we know whether one vmcore/status file belongs to vm1 or vm3? My previous solution was to create a special string (a timestamp) in the kernel before triggering the kernel crash, and to check the vmcore-dmesg.txt file for that string, to identify which machine it belongs to.

  2. Though [1] is doable, it needs to access the vmcore server frequently (on each check of whether the kdump test succeeded), and QE/customers don't want us to access the vmcore server too often unless we are dumping a vmcore to it. In addition, accessing a remote server is slow, especially when there are hundreds of vmcores, which means we would need to find the special string within hundreds of vmcore-dmesg.txt files. It is slow, and the nfs server may not be mounted as well...

So saving the status file locally is fast, and it has no problem determining ownership, compared with saving it remotely.

@prudo1
Collaborator

prudo1 commented Jul 17, 2024

Hi Philipp,

Thanks a lot for your careful review!

Hi Tao, here are some comments from my first round. I still need to think about a few things, e.g. I don't know if it really makes sense to store the status file on the rootfs. My first thought was that it would be better to store it on the target, next to the dumps instead. IMHO, that would make tracking the status easier, when you have a big fleet dumping to a remote host, e.g. via ssh. But I'm not entirely sure if we can assume that the target is always mounted/accessible. Especially when dumping to nfs.

Yes, saving the status file onto the target was my first option, but I gave it up due to the 2 reasons: 1) It is difficult to identify 'who created the vmcore/status file'. Say if we have a nfs/ssh server, several machines will dump vmcore/status file to it. Since we need to verify if kdump test is successful for one specific machine, we need to identify the vmcore/status file is created by it, no by others. Frankly there is no stable way to do that, the ip+date string isn't a good method, e.g the following case: physical machine_a hosted vm1 and vm2 by its NAT network, physical machine_b hosted vm3 and vm4 by its NAT network, and physical machine_c is for vmcore hosting. So vm1 and vm3 can have the same ip address as 10.0.2.15. How can we know if one vmcore/status file belongs to vm1 or vm3? My previous solution is to create a special string(a timestamp) in kernel before trigger kernel crash, and check the vmcore-dmesg.txt file if the string exists, to identify its belonging.

Right, the ip+date is a pretty poor method to identify the system that dumped. My idea was to replace the ip with the machine-id. The machine-id is determined at the first boot after installation and then stays stable forever, at least for "normal" installations. I'm not 100% sure how it behaves for container images where the rootfs is ro; if I understand the man page correctly the machine-id is then determined on every boot, so it won't be stable. But in that scenario the ip probably wouldn't be stable either.
So we could switch from ip+date to machine-id+date for the dumps and machine-id.status for the status files. Then we also don't need to check all the vmcore-dmesg.txt files.

What do you think?
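
For illustration only (this naming was never settled in the thread), the idea amounts to something like:

_host_id=$(cat /etc/machine-id)
_dump_dir="${_host_id}-$(date +%Y-%m-%d-%H:%M:%S)"   # instead of ip+date
_status_file="${_host_id}.status"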

2. Though [1] is doable, but it need to access the vmcore server frenqently(each checking of kdump test success or not), QE/customers don't want us to access the vmcore server too much, unless we are dumping vmcore to it. In addition, accessing a remote server is slow, especially when there are hundreds of vmcores, which means we need to find the special string within hundreds vmcore-dmesg.txt files. It is slow, and nfs-server may not mounted as well...

True, but at least for ssh targets we already connect to them every time we start to check whether the network is up and running. So in that case there won't be too many additional connections. But then there is kdumpctl status where we don't know whether we can reach the target. And of course nfs targets...

So saving the status file locally is fast and have no problem to determine its belonging than saving remotely.

True, but it also means that we need to mount a disk locally for cases where we didn't need it before. Which will require additional memory and thus increases the chance that the kdump kernel runs oom...

@liutgnu
Collaborator Author

liutgnu commented Jul 18, 2024

Hi Philipp,
Thanks a lot for your careful review!

Hi Tao, here are some comments from my first round. I still need to think about a few things, e.g. I don't know if it really makes sense to store the status file on the rootfs. My first thought was that it would be better to store it on the target, next to the dumps instead. IMHO, that would make tracking the status easier, when you have a big fleet dumping to a remote host, e.g. via ssh. But I'm not entirely sure if we can assume that the target is always mounted/accessible. Especially when dumping to nfs.

Yes, saving the status file onto the target was my first option, but I gave it up due to the 2 reasons: 1) It is difficult to identify 'who created the vmcore/status file'. Say if we have a nfs/ssh server, several machines will dump vmcore/status file to it. Since we need to verify if kdump test is successful for one specific machine, we need to identify the vmcore/status file is created by it, no by others. Frankly there is no stable way to do that, the ip+date string isn't a good method, e.g the following case: physical machine_a hosted vm1 and vm2 by its NAT network, physical machine_b hosted vm3 and vm4 by its NAT network, and physical machine_c is for vmcore hosting. So vm1 and vm3 can have the same ip address as 10.0.2.15. How can we know if one vmcore/status file belongs to vm1 or vm3? My previous solution is to create a special string(a timestamp) in kernel before trigger kernel crash, and check the vmcore-dmesg.txt file if the string exists, to identify its belonging.

Right, the ip+date is a pretty poor method to identify the system that dumped. My idea was to replace the ip with the machine-id. The machine-id is determined at the first boot after installation and then stays stable for ever. At least for "normal" installations. I'm not a 100% sure how it behaves for container images where the rootfs is ro. If I understand the man page correctly the machine-id is then determined every time on boot. So it won't be stable. But in that scenario the ip probably wouldn't be stable as well. So we could switch from ip+date to machine-id+date for the dumps and machine-id.status for the status files. Then we also don't need to check all the vmcore-dmesg.txt.

What do you think?

I guess we cannot do that; the machine-id should be encrypted and shouldn't be exposed to the outside. See "man machine-id", as I quote some of it:

This ID uniquely identifies the host. It should be considered "confidential", and must not be exposed in untrusted environments, in particular on the network...

If we apply some hash/encryption approach before using the machine-id, I guess it will introduce extra complexity to the kdump function.

2. Though [1] is doable, but it need to access the vmcore server frenqently(each checking of kdump test success or not), QE/customers don't want us to access the vmcore server too much, unless we are dumping vmcore to it. In addition, accessing a remote server is slow, especially when there are hundreds of vmcores, which means we need to find the special string within hundreds vmcore-dmesg.txt files. It is slow, and nfs-server may not mounted as well...

True, but at least for ssh targets we already connect to them every time we start to check whether the network is up and running. So in that case there won't be too many additional connections. But then there is kdumpctl status where we don't know whether we can reach the target. And of course nfs targets...

Yeah, since CEE will use "kdumpctl test" often, and I guess nfs targets are their priority, we need to come up with a universal approach, not only for local/ssh, but also for nfs targets...

So saving the status file locally is fast and have no problem to determine its belonging than saving remotely.

True, but it also means that we need to mount a disk locally for cases where we didn't need it before. Which will require additional memory and thus increases the chance that the kdump kernel runs oom...

Yes agreed, but I guess it is OK to increase the crashkernel size a little bit in order to support this.

@prudo1
Collaborator

prudo1 commented Jul 19, 2024

Hi Philipp,
Thanks a lot for your careful review!

Hi Tao, here are some comments from my first round. I still need to think about a few things, e.g. I don't know if it really makes sense to store the status file on the rootfs. My first thought was that it would be better to store it on the target, next to the dumps instead. IMHO, that would make tracking the status easier, when you have a big fleet dumping to a remote host, e.g. via ssh. But I'm not entirely sure if we can assume that the target is always mounted/accessible. Especially when dumping to nfs.

Yes, saving the status file onto the target was my first option, but I gave it up due to the 2 reasons: 1) It is difficult to identify 'who created the vmcore/status file'. Say if we have a nfs/ssh server, several machines will dump vmcore/status file to it. Since we need to verify if kdump test is successful for one specific machine, we need to identify the vmcore/status file is created by it, no by others. Frankly there is no stable way to do that, the ip+date string isn't a good method, e.g the following case: physical machine_a hosted vm1 and vm2 by its NAT network, physical machine_b hosted vm3 and vm4 by its NAT network, and physical machine_c is for vmcore hosting. So vm1 and vm3 can have the same ip address as 10.0.2.15. How can we know if one vmcore/status file belongs to vm1 or vm3? My previous solution is to create a special string(a timestamp) in kernel before trigger kernel crash, and check the vmcore-dmesg.txt file if the string exists, to identify its belonging.

Right, the ip+date is a pretty poor method to identify the system that dumped. My idea was to replace the ip with the machine-id. The machine-id is determined at the first boot after installation and then stays stable for ever. At least for "normal" installations. I'm not a 100% sure how it behaves for container images where the rootfs is ro. If I understand the man page correctly the machine-id is then determined every time on boot. So it won't be stable. But in that scenario the ip probably wouldn't be stable as well. So we could switch from ip+date to machine-id+date for the dumps and machine-id.status for the status files. Then we also don't need to check all the vmcore-dmesg.txt.
What do you think?

I guess we cannot do that, the machine-id should be encrpted and shouldn't be exposed to outeside. See the "man machine-id", as I quote some of it:

This ID uniquely identifies the host. It should be considered "confidential", and must not be exposed in untrusted environments, in particular on the network...

Yeah, I read that as well. But I believe that for our use case we can assume that the dump target can be trusted even when it's a remote target. The reason is that if we cannot trust the dump target, we cannot write the dump there either. In the end a kernel dump will always contain a lot of confidential information, so if we write a dump to an untrusted target the machine-id is one of the smaller problems we have.

If we do using some hash/encrypt approach before using machine-id, I guess it will introduce extra complexity for kdump function.

Fully agree, but as said above IMHO that shouldn't be necessary.

2. Though [1] is doable, but it need to access the vmcore server frenqently(each checking of kdump test success or not), QE/customers don't want us to access the vmcore server too much, unless we are dumping vmcore to it. In addition, accessing a remote server is slow, especially when there are hundreds of vmcores, which means we need to find the special string within hundreds vmcore-dmesg.txt files. It is slow, and nfs-server may not mounted as well...

True, but at least for ssh targets we already connect to them every time we start to check whether the network is up and running. So in that case there won't be too many additional connections. But then there is kdumpctl status where we don't know whether we can reach the target. And of course nfs targets...

Yeah, since CEE will use "kdumpctl test" often, and I guess nfs targets are their priority. So we need to come up with a universal approach, not only for local/ssh, but also nfs target...

True...

So saving the status file locally is fast and have no problem to determine its belonging than saving remotely.

True, but it also means that we need to mount a disk locally for cases where we didn't need it before. Which will require additional memory and thus increases the chance that the kdump kernel runs oom...

Yes agreed, but I guess it is OK to increase the creashkernel size a little bit in order to support this.

Yes, I guess so...

@liutgnu
Collaborator Author

liutgnu commented Jul 22, 2024

Now the v7 is posted

@liutgnu
Collaborator Author

liutgnu commented Jul 22, 2024

Hi Philipp,
Thanks a lot for your careful review!

Hi Tao, here are some comments from my first round. I still need to think about a few things, e.g. I don't know if it really makes sense to store the status file on the rootfs. My first thought was that it would be better to store it on the target, next to the dumps instead. IMHO, that would make tracking the status easier, when you have a big fleet dumping to a remote host, e.g. via ssh. But I'm not entirely sure if we can assume that the target is always mounted/accessible. Especially when dumping to nfs.

Yes, saving the status file onto the target was my first option, but I gave it up due to the 2 reasons: 1) It is difficult to identify 'who created the vmcore/status file'. Say if we have a nfs/ssh server, several machines will dump vmcore/status file to it. Since we need to verify if kdump test is successful for one specific machine, we need to identify the vmcore/status file is created by it, no by others. Frankly there is no stable way to do that, the ip+date string isn't a good method, e.g the following case: physical machine_a hosted vm1 and vm2 by its NAT network, physical machine_b hosted vm3 and vm4 by its NAT network, and physical machine_c is for vmcore hosting. So vm1 and vm3 can have the same ip address as 10.0.2.15. How can we know if one vmcore/status file belongs to vm1 or vm3? My previous solution is to create a special string(a timestamp) in kernel before trigger kernel crash, and check the vmcore-dmesg.txt file if the string exists, to identify its belonging.

Right, the ip+date is a pretty poor method to identify the system that dumped. My idea was to replace the ip with the machine-id. The machine-id is determined at the first boot after installation and then stays stable for ever. At least for "normal" installations. I'm not a 100% sure how it behaves for container images where the rootfs is ro. If I understand the man page correctly the machine-id is then determined every time on boot. So it won't be stable. But in that scenario the ip probably wouldn't be stable as well. So we could switch from ip+date to machine-id+date for the dumps and machine-id.status for the status files. Then we also don't need to check all the vmcore-dmesg.txt.
What do you think?

I guess we cannot do that, the machine-id should be encrpted and shouldn't be exposed to outeside. See the "man machine-id", as I quote some of it:
This ID uniquely identifies the host. It should be considered "confidential", and must not be exposed in untrusted environments, in particular on the network...

Yeah, I read that as well. But I believe that for our use case we can assume that the dump target can be trusted even when it's a remote target. Reason is that, if we cannot trust the dump target we cannot write the dump there as well. In the end a kernel dump will always contain a lot of confidential information. So if we write a dump to an untrusted target the machine-id is one of the smaller problems we have.

OK, agreed. However, one thing I noticed from my fedora:37 container is that the systemd package is not installed:
[root@3c6570f17fba system]# rpm -qa|grep systemd
systemd-libs-251.10-588.fc37.x86_64

So there is no systemd-machine-id-commit.service to generate the machine-id; instead we get an empty file:
[root@3c6570f17fba system]# rpm -qf /etc/machine-id
file /etc/machine-id is not owned by any package
[root@3c6570f17fba system]# ls -ail /etc/machine-id
3046437 -rw-r--r--. 1 root root 0 Feb 5 2023 /etc/machine-id

I guess machine-id is not a more stable approach than ip+date for containers...

If we do using some hash/encrypt approach before using machine-id, I guess it will introduce extra complexity for kdump function.

Fully agree, but as said above IMHO that shouldn't be necessary.

2. Though [1] is doable, but it need to access the vmcore server frenqently(each checking of kdump test success or not), QE/customers don't want us to access the vmcore server too much, unless we are dumping vmcore to it. In addition, accessing a remote server is slow, especially when there are hundreds of vmcores, which means we need to find the special string within hundreds vmcore-dmesg.txt files. It is slow, and nfs-server may not mounted as well...

True, but at least for ssh targets we already connect to them every time we start to check whether the network is up and running. So in that case there won't be too many additional connections. But then there is kdumpctl status where we don't know whether we can reach the target. And of course nfs targets...

Yeah, since CEE will use "kdumpctl test" often, and I guess nfs targets are their priority. So we need to come up with a universal approach, not only for local/ssh, but also nfs target...

True...

So saving the status file locally is fast and have no problem to determine its belonging than saving remotely.

True, but it also means that we need to mount a disk locally for cases where we didn't need it before. Which will require additional memory and thus increases the chance that the kdump kernel runs oom...

Yes agreed, but I guess it is OK to increase the creashkernel size a little bit in order to support this.

Yes, I guess so...

@liutgnu
Collaborator Author

liutgnu commented Jul 31, 2024

Hi @prudo1, kindly ping. Could you please review the patch? If you have no time, then we can ask for others' help. Thanks!

@liutgnu
Collaborator Author

liutgnu commented Aug 5, 2024

Hi @prudo1 ,

Hi Philipp,
Thanks a lot for your careful review!

Hi Tao, here are some comments from my first round. I still need to think about a few things, e.g. I don't know if it really makes sense to store the status file on the rootfs. My first thought was that it would be better to store it on the target, next to the dumps instead. IMHO, that would make tracking the status easier, when you have a big fleet dumping to a remote host, e.g. via ssh. But I'm not entirely sure if we can assume that the target is always mounted/accessible. Especially when dumping to nfs.

Yes, saving the status file onto the target was my first option, but I gave it up due to the 2 reasons: 1) It is difficult to identify 'who created the vmcore/status file'. Say if we have a nfs/ssh server, several machines will dump vmcore/status file to it. Since we need to verify if kdump test is successful for one specific machine, we need to identify the vmcore/status file is created by it, no by others. Frankly there is no stable way to do that, the ip+date string isn't a good method, e.g the following case: physical machine_a hosted vm1 and vm2 by its NAT network, physical machine_b hosted vm3 and vm4 by its NAT network, and physical machine_c is for vmcore hosting. So vm1 and vm3 can have the same ip address as 10.0.2.15. How can we know if one vmcore/status file belongs to vm1 or vm3? My previous solution is to create a special string(a timestamp) in kernel before trigger kernel crash, and check the vmcore-dmesg.txt file if the string exists, to identify its belonging.

Right, the ip+date is a pretty poor method to identify the system that dumped. My idea was to replace the ip with the machine-id. The machine-id is determined at the first boot after installation and then stays stable for ever. At least for "normal" installations. I'm not a 100% sure how it behaves for container images where the rootfs is ro. If I understand the man page correctly the machine-id is then determined every time on boot. So it won't be stable. But in that scenario the ip probably wouldn't be stable as well. So we could switch from ip+date to machine-id+date for the dumps and machine-id.status for the status files. Then we also don't need to check all the vmcore-dmesg.txt.
What do you think?

I guess we cannot do that, the machine-id should be encrpted and shouldn't be exposed to outeside. See the "man machine-id", as I quote some of it:
This ID uniquely identifies the host. It should be considered "confidential", and must not be exposed in untrusted environments, in particular on the network...

Yeah, I read that as well. But I believe that for our use case we can assume that the dump target can be trusted even when it's a remote target. Reason is that, if we cannot trust the dump target we cannot write the dump there as well. In the end a kernel dump will always contain a lot of confidential information. So if we write a dump to an untrusted target the machine-id is one of the smaller problems we have.

OK, agreed. However one thing I noticed from my fedora:37 container, the systemd package is not installed: [root@3c6570f17fba system]# rpm -qa|grep systemd systemd-libs-251.10-588.fc37.x86_64

So there is no systemd-machine-id-commit.service to generate the machine-id, instead we got and empty file: [root@3c6570f17fba system]# rpm -qf /etc/machine-id file /etc/machine-id is not owned by any package [root@3c6570f17fba system]# ls -ail /etc/machine-id 3046437 -rw-r--r--. 1 root root 0 Feb 5 2023 /etc/machine-id

I guess machine-id is not a stable approach for containers than ip+date...

I re-thought the machine-id approach, and I think the above container case is not relevant: kdump is not supposed to run within a container. However, there are cases such as CoreOS, which uses ostree layers for the fs. Though CoreOS mounts /etc as writable, i.e. the machine-id can be generated on the first boot and stays there, I'm not sure if there are cases which mount /etc as read-only, or whether users can make such a customization. I'm a newbie to ostree :(

The ip+date string, though very poor, is better for human reading. E.g. /var/crash/10.0.2.15-2024-07-11-07:54:41 is much more readable than /var/crash/machine-id-1d4c642df43846/. The former indicates the machine is more likely to be a VM, so if we are searching for a vmcore dumped from a bare-metal machine, this one is easily skipped.

So IMHO I don't have very strong motivation for replacing ip+date with a machine-id string for vmcore folder naming for now.

Other than machine-id for vmcore labeling, Dave gave me a different solution: 1) before triggering the kernel panic, we generate a unique string and put it into a file; 2) mkdumprd includes the file in initramfs-kdump.img; 3) in the 2nd kernel, after creating the vmcore, cp the file next to the vmcore; 4) kdumpctl status can then check whether the string next to the vmcore is the expected one. The file can also be connected with kdump.status, if we define some format for the file content. This method is easy and doable, if we do want to save kdump.status on the remote machine.
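
A rough sketch of that flow (the file names here are made up for illustration):

# 1) before triggering the test: generate a unique token on the local host
echo "$(hostname)-$(date +%s)-$RANDOM" > /etc/kdump-test.id
# 2) mkdumprd includes /etc/kdump-test.id in initramfs-kdump.img
# 3) in the 2nd kernel, after saving the vmcore:
#        cp /etc/kdump-test.id "$_dump_dir/kdump-test.id"
# 4) "kdumpctl status" later compares the local token with the copy
#    found next to the vmcore on the dump target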

Personally I would prefer to save kdump.status on a local drive rather than a remote one, just because it makes "kdumpctl status" work faster and more reliably, since we don't need to care about the future network status after "kdumpctl test".

Any comments?

Thanks,
Tao Liu

If we do using some hash/encrypt approach before using machine-id, I guess it will introduce extra complexity for kdump function.

Fully agree, but as said above IMHO that shouldn't be necessary.

2. Though [1] is doable, but it need to access the vmcore server frenqently(each checking of kdump test success or not), QE/customers don't want us to access the vmcore server too much, unless we are dumping vmcore to it. In addition, accessing a remote server is slow, especially when there are hundreds of vmcores, which means we need to find the special string within hundreds vmcore-dmesg.txt files. It is slow, and nfs-server may not mounted as well...

True, but at least for ssh targets we already connect to them every time we start to check whether the network is up and running. So in that case there won't be too many additional connections. But then there is kdumpctl status where we don't know whether we can reach the target. And of course nfs targets...

Yeah, since CEE will use "kdumpctl test" often, and I guess nfs targets are their priority. So we need to come up with a universal approach, not only for local/ssh, but also nfs target...

True...

So saving the status file locally is fast and have no problem to determine its belonging than saving remotely.

True, but it also means that we need to mount a disk locally for cases where we didn't need it before. Which will require additional memory and thus increases the chance that the kdump kernel runs oom...

Yes agreed, but I guess it is OK to increase the creashkernel size a little bit in order to support this.

Yes, I guess so...
