[rawhide][ppc64le] ext.config.kdump.crash failure #1698

Closed
jbtrystram opened this issue Mar 25, 2024 · 7 comments
@jbtrystram
Contributor

jbtrystram commented Mar 25, 2024

Relevant console output:

[    4.583671] systemd[1]: Starting kdump-capture.service - Kdump Vmcore Save Service...
[    4.614645] kdump[462]: Kdump is using the default log level(3).
[    4.668561] kdump[497]: saving to /sysroot/ostree/deploy/fedora-coreos/var/crash/127.0.0.1-2024-03-24-13:37:43/
[    4.679853] kdump[502]: saving vmcore-dmesg.txt to /sysroot/ostree/deploy/fedora-coreos/var/crash/127.0.0.1-2024-03-24-13:37:43/
[    4.694130] kdump[508]: saving vmcore-dmesg.txt complete
[    4.695729] kdump[510]: saving vmcore
[    4.710297] kdump.sh[511]: Checking for memory holes                         : [  0.0 %] /
Checking for memory holes                         : [100.0 %] |
readpage_elf: Attempt to read non-existent page at 0xc000000000000.
[    4.710664] kdump.sh[511]: readmem: type_addr: 0, addr:c00c000000000000, size:16384
[    4.710795] kdump.sh[511]: __exclude_unnecessary_pages: Can't read the buffer of struct page.
[    4.710933] kdump.sh[511]: create_2nd_bitmap: Can't exclude unnecessary pages.
[    4.713611] kdump.sh[511]: The kernel version is not supported.
[    4.713745] kdump.sh[511]: The makedumpfile operation may be incomplete.
[    4.713863] kdump.sh[511]: makedumpfile Failed.
[    4.715342] kdump[513]: saving vmcore failed, exitcode:1
[    4.716640] kdump[515]: saving vmcore failed
[    4.731037] kdump[520]: saving the /run/initramfs/kexec-dmesg.log to /sysroot/ostree/deploy/fedora-coreos/var/crash/127.0.0.1-2024-03-24-13:37:43///
[    4.733652] systemd[1]: kdump-capture.service: Main process exited, code=exited, status=1/FAILURE
[    4.733887] systemd[1]: kdump-capture.service: Failed with result 'exit-code'.

Test log:

Mar 24 13:36:48 qemu0 kola-runext-test.sh[5465]: ++ kdumpctl estimate
Mar 24 13:36:49 qemu0 kola-runext-test.sh[5468]: kdump: Detected change in File System
Mar 24 13:36:49 qemu0 kola-runext-test.sh[5468]: kdump: Rebuilding /var/lib/kdump/initramfs-6.9.0-0.rc0.20240322git8e938e398669.14.fc41.ppc64lekdump.img
Mar 24 13:36:52 qemu0 kola-runext-test.sh[7563]: grep: /var/tmp/dracut.vQUkwN/initramfs/etc/systemd/system.conf*: No such file or directory
Mar 24 13:37:31 qemu0 kola-runext-test.sh[9101]: tail: error writing 'standard output': Broken pipe
Mar 24 13:37:32 qemu0 kola-runext-test.sh[9119]: tail: error writing 'standard output': Broken pipe
Mar 24 13:37:32 qemu0 kola-runext-test.sh[9124]: tail: error writing 'standard output': Broken pipe
Mar 24 13:37:32 qemu0 kola-runext-test.sh[9129]: tail: error writing 'standard output': Broken pipe
Mar 24 13:37:32 qemu0 kola-runext-test.sh[9134]: tail: error writing 'standard output': Broken pipe
Mar 24 13:37:32 qemu0 kola-runext-test.sh[5322]: + output='Reserved crashkernel:    512M
Mar 24 13:37:32 qemu0 kola-runext-test.sh[5322]: Recommended crashkernel: 512M
Mar 24 13:37:32 qemu0 kola-runext-test.sh[5322]: Kernel image size:   0M
Mar 24 13:37:32 qemu0 kola-runext-test.sh[5322]: Kernel modules size: 13M
Mar 24 13:37:32 qemu0 kola-runext-test.sh[5322]: Initramfs size:      68M
Mar 24 13:37:32 qemu0 kola-runext-test.sh[5322]: Runtime reservation: 64M
Mar 24 13:37:32 qemu0 kola-runext-test.sh[5322]: Large modules:
Mar 24 13:37:32 qemu0 kola-runext-test.sh[5322]:     xfs: 2752512
Mar 24 13:37:32 qemu0 kola-runext-test.sh[5322]:     sunrpc: 1048576'
Mar 24 13:37:32 qemu0 kola-runext-test.sh[5322]: + grep -q 'WARNING: Current crashkernel size is lower than recommended size'
Mar 24 13:37:32 qemu0 kola-runext-test.sh[5322]: + /tmp/autopkgtest-reboot-prepare aftercrash
Mar 24 13:37:32 qemu0 kola-runext-test.sh[5322]: + sleep 5
Mar 24 13:37:37 qemu0 kola-runext-test.sh[5322]: + echo 'Triggering sysrq'
Mar 24 13:37:37 qemu0 kola-runext-test.sh[5322]: Triggering sysrq
Mar 24 13:37:37 qemu0 kola-runext-test.sh[5322]: + sync
-- Boot ee11846cf9a544cb9d1d8eede6a27136 --
Mar 24 13:38:26 qemu0 kola-runext-test.sh[2626]: + . /var/opt/kola/extdata/commonlib.sh
Mar 24 13:38:26 qemu0 kola-runext-test.sh[2626]: ++ IFS=' '
Mar 24 13:38:26 qemu0 kola-runext-test.sh[2626]: ++ read -r -a cmdline
Mar 24 13:38:26 qemu0 kola-runext-test.sh[2626]: + case "${AUTOPKGTEST_REBOOT_MARK:-}" in
Mar 24 13:38:26 qemu0 kola-runext-test.sh[2636]: ++ find /var/crash -type f -name vmcore
Mar 24 13:38:26 qemu0 kola-runext-test.sh[2626]: + kcore=
Mar 24 13:38:26 qemu0 kola-runext-test.sh[2626]: + test -z ''
Mar 24 13:38:26 qemu0 kola-runext-test.sh[2626]: + fatal 'No kcore found in /var/crash'
Mar 24 13:38:26 qemu0 kola-runext-test.sh[2626]: + echo 'No kcore found in /var/crash'
Mar 24 13:38:26 qemu0 kola-runext-test.sh[2626]: No kcore found in /var/crash
Mar 24 13:38:26 qemu0 kola-runext-test.sh[2626]: + exit 1

How hard would it be to save kexec-dmesg.log from the test run?
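For illustration, here is a rough sketch of how the post-crash check could print the saved logs when no vmcore is found. `report_crash_dir` is a hypothetical helper, not something that exists in kola or in the test today; the file names are the ones that appear in the console output above, and the `find` call mirrors the one in the test log.

```shell
#!/bin/bash
# Sketch only: report_crash_dir is a hypothetical helper, not part of kola.
# It repeats the 'find ... -name vmcore' check from the test log and, when
# no vmcore exists, prints whatever logs kdump did manage to save.
report_crash_dir() {
    local crashdir="${1:-/var/crash}"
    local kcore
    kcore=$(find "$crashdir" -type f -name vmcore 2>/dev/null | head -n 1)
    if [ -n "$kcore" ]; then
        echo "vmcore found: $kcore"
        return 0
    fi
    echo "No vmcore found in $crashdir"
    # File names taken from the console output in this issue.
    find "$crashdir" -type f \
        \( -name 'kexec-dmesg.log' -o -name 'vmcore-dmesg.txt' \) 2>/dev/null |
    while IFS= read -r f; do
        echo "==== $f ===="
        cat "$f"
    done
    return 1
}
```

Calling `report_crash_dir /var/crash` after the reboot would then put the makedumpfile error into the test journal instead of only "No kcore found".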

@jbtrystram
Contributor Author

jbtrystram commented Mar 26, 2024

Looks like this is a known upstream issue: https://bugzilla.redhat.com/show_bug.cgi?id=2269991
So I'll denylist this test until it's fixed.

@jbtrystram
Contributor Author

Denylist PR: coreos/fedora-coreos-config#2922

jbtrystram added a commit to jbtrystram/fedora-coreos-config that referenced this issue Mar 26, 2024
Kdump currently fails on kernel-next, for ppc64le
This is a known upstream bug
See coreos/fedora-coreos-tracker#1698
jbtrystram added a commit to jbtrystram/fedora-coreos-config that referenced this issue Mar 26, 2024
Kdump currently fails on kernel-next, for ppc64le
This is a known upstream bug
See coreos/fedora-coreos-tracker#1698
Also see upstream bug : https://bugzilla.redhat.com/show_bug.cgi?id=2269991
dustymabe pushed a commit to coreos/fedora-coreos-config that referenced this issue Mar 26, 2024
aaradhak added a commit to aaradhak/fedora-coreos-config that referenced this issue Mar 29, 2024
aaradhak added a commit to coreos/fedora-coreos-config that referenced this issue Mar 29, 2024
@dustymabe
Member

Supposedly fixed by crash-8.0.4-5.fc41. @jbtrystram can you confirm, then remove the denylist entry and close this out?

@jbtrystram
Contributor Author

jbtrystram commented Apr 3, 2024

@dustymabe I tested this on a ppc64le machine today and it still fails.
crash is not included in FCOS by default, and the console log shows an issue with makedumpfile. If I understand correctly, crash is a tool you would use to analyse the dump files.

I did some digging and I think the issue is that this makedumpfile patch is not backported into kexec-tools (which packages makedumpfile):

makedumpfile -v
makedumpfile: version 1.7.4 (released on 6 Nov 2023)
lzo     enabled
snappy  enabled
zstd    enabled

See: https://bugzilla.redhat.com/show_bug.cgi?id=2269991#c7

@coiby could you backport the makedumpfile fix to kexec-tools?

Another option is to pin the kernel to a version earlier than v6.9-rc2.

@coiby

coiby commented Apr 8, 2024

Hi @jbtrystram, kexec-tools-2.0.28-7.fc41 now includes the makedumpfile fix, thanks for the reminder!
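For anyone who wants to check whether a given host already has that build, a minimal sketch follows. `version_at_least` and `check_kexec_tools` are hypothetical helper names, and `sort -V` is only an approximation of RPM version ordering (it is not rpm's own comparison), so treat the result as a hint rather than a definitive answer.

```shell
#!/bin/bash
# Sketch only: both functions are hypothetical helpers.
# version_at_least HAVE WANT succeeds when HAVE sorts at or after WANT.
# Assumption: sort -V is close enough to RPM ordering for these strings.
version_at_least() {
    local have="$1" want="$2"
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Compare the installed kexec-tools with the build named in this comment.
check_kexec_tools() {
    local have
    have=$(rpm -q --qf '%{VERSION}-%{RELEASE}' kexec-tools) || return 2
    if version_at_least "$have" "2.0.28-7.fc41"; then
        echo "kexec-tools $have should include the makedumpfile fix"
    else
        echo "kexec-tools $have predates 2.0.28-7.fc41"
        return 1
    fi
}
```

Running `check_kexec_tools` on the ppc64le test machine before re-enabling the test would show whether the compose has picked up the new package yet.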

@dustymabe
Member

The snooze for this was dropped in coreos/fedora-coreos-config@7857871

All should be good now.
