[rawhide]: In ppc64le, kdump fails to generate crash dump file after kernel crash #1588

Closed
gursewak1997 opened this issue Sep 29, 2023 · 7 comments

Comments

@gursewak1997
Member

gursewak1997 commented Sep 29, 2023

The kdump.crash test fails to generate a vmcore file after the crash is triggered on ppc64le in the latest rawhide builds. Build and console.txt:

Sep 27 14:43:22 qemu0 kola-runext-test.sh[4210]: Triggering sysrq
Sep 27 14:43:22 qemu0 kola-runext-test.sh[4210]: + sync
-- Boot 8098bc91f506424a859a558df7074270 --
Sep 27 14:44:05 qemu0 kola-runext-test.sh[4982]: + . /var/opt/kola/extdata/commonlib.sh
Sep 27 14:44:05 qemu0 kola-runext-test.sh[4982]: ++ IFS=' '
Sep 27 14:44:05 qemu0 kola-runext-test.sh[4982]: ++ read -r -a cmdline
Sep 27 14:44:05 qemu0 kola-runext-test.sh[4982]: + case "${AUTOPKGTEST_REBOOT_MARK:-}" in
Sep 27 14:44:05 qemu0 kola-runext-test.sh[4983]: ++ find /var/crash -type f -name vmcore
Sep 27 14:44:05 qemu0 kola-runext-test.sh[4982]: + kcore=
Sep 27 14:44:05 qemu0 kola-runext-test.sh[4982]: + test -z ''
Sep 27 14:44:05 qemu0 kola-runext-test.sh[4982]: + fatal 'No kcore found in /var/crash'
Sep 27 14:44:05 qemu0 kola-runext-test.sh[4982]: + echo 'No kcore found in /var/crash'
Sep 27 14:44:05 qemu0 kola-runext-test.sh[4982]: No kcore found in /var/crash
Sep 27 14:44:05 qemu0 kola-runext-test.sh[4982]: + exit 1
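
For anyone reproducing this locally, the check that fails in the trace above boils down to looking for a file named vmcore under /var/crash after the reboot. A minimal sketch of that check follows; the function names `find_vmcore` and `check_vmcore` are hypothetical and are not taken from the actual kola test script.

```shell
#!/bin/sh
# Hypothetical helpers that mirror the check visible in the xtrace output.
# They are NOT the real test code.

# find_vmcore DIR: print the path of every regular file named "vmcore" under DIR.
find_vmcore() {
    find "$1" -type f -name vmcore 2>/dev/null
}

# check_vmcore DIR: succeed only if at least one vmcore file exists under DIR.
check_vmcore() {
    kcore=$(find_vmcore "$1")
    if [ -z "$kcore" ]; then
        echo "No kcore found in $1"
        return 1
    fi
    echo "Found: $kcore"
}
```

On an affected system, `check_vmcore /var/crash` would be expected to print the "No kcore found" message and return non-zero, matching the log.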

From console.log

[    6.257286] systemd[1]: Starting dracut-pre-pivot.service - dracut pre-pivot and cleanup hook...
[    6.278742] systemd[1]: Finished dracut-pre-pivot.service - dracut pre-pivot and cleanup hook.
[    6.281623] systemd[1]: Starting kdump-capture.service - Kdump Vmcore Save Service...
[    6.305930] kdump[470]: Kdump is using the default log level(3).
[    6.357795] kdump[505]: saving to /sysroot/ostree/deploy/fedora-coreos/var/crash/127.0.0.1-2023-09-27-14:49:26/
[    6.418746] kdump[510]: saving vmcore-dmesg.txt to /sysroot/ostree/deploy/fedora-coreos/var/crash/127.0.0.1-2023-09-27-14:49:26/
[    6.469768] kdump[516]: saving vmcore-dmesg.txt complete
[    6.472054] kdump[518]: saving vmcore
[    6.491839] kdump.sh[519]: readpage_elf: Attempt to read non-existent page at 0x4000000000000000.
[    6.492143] kdump.sh[519]: readmem: type_addr: 0, addr:0, size:8
[    6.492338] kdump.sh[519]: get_vmemmap_list_info: Can't get vmemmap region addresses
[    6.492549] kdump.sh[519]: get_machdep_info_ppc64: Can't get vmemmap list info.
[    6.496798] kdump.sh[519]: makedumpfile Failed.
[    6.497445] kdump[521]: saving vmcore failed, exitcode:1
[    6.499651] kdump[523]: saving vmcore failed
[    6.521439] kdump[528]: saving the /run/initramfs/kexec-dmesg.log to /sysroot/ostree/deploy/fedora-coreos/var/crash/127.0.0.1-2023-09-27-14:49:26///
[    6.525735] systemd[1]: kdump-capture.service: Main process exited, code=exited, status=1/FAILURE
[    6.530241] systemd[1]: kdump-capture.service: Failed with result 'exit-code'.
[    6.530501] systemd[1]: Failed to start kdump-capture.service - Kdump Vmcore Save Service.
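
To spot the same failure quickly in other console logs, one option is to grep for the two failure messages seen above. This is only a sketch: `kdump_failures` is a hypothetical helper name, and the patterns are copied from this one log, so other failure modes may print different text.

```shell
#!/bin/sh
# kdump_failures FILE: print the lines of FILE that report a failed vmcore save.
# The patterns come from the console output in this issue; they are not an
# exhaustive list of kdump or makedumpfile error messages.
kdump_failures() {
    grep -E 'makedumpfile Failed|saving vmcore failed' "$1"
}
```

For example, `kdump_failures console.txt` would list the makedumpfile and kdump failure lines if the capture kernel hit this problem.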

The kernel version transition that appears to have introduced this is
kernel-6.6.0-0.rc0.20230829git1c59d383390f.59.fc40 -> kernel-6.6.0-0.rc0.20230830git6c1b980a7e79.1.fc40.

The kdump.crash test in Rawhide:
Passes with kernel-6.6.0-0.rc0.20230829git1c59d383390f.59.fc40
Fails with kernel-6.6.0-0.rc0.20230830git6c1b980a7e79.1.fc40

Build: [rawhide][ppc64le] ⚡ 40.20230927.91.0

@dustymabe
Member

Maybe one of these:

$ git log --oneline 1c59d383390f..6c1b980a7e79 arch/powerpc/include/
d68b4b6f307d Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
b96a3e9142fd Merge tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
9fee28baa601 powerpc: implement the new page table range API
603fd64dfa45 powerpc/book3s64/memhotplug: enable memmap on memory for radix
8d539b84f1e3 nmi_backtrace: allow excluding an arbitrary CPU
f2b79c0d7968 powerpc/book3s64/radix: add support for vmemmap optimization for radix
368a0590d954 powerpc/book3s64/vmemmap: switch radix to use a different vmemmap handling function
27af67f35631 powerpc/book3s64/mm: enable transparent pud hugepage
6bbd42e2df8f mmu_notifiers: call invalidate_range() when invalidating TLBs
8d05554dca2a powerpc: mm: convert to GENERIC_IOREMAP
016fec91013c mm: move is_ioremap_addr() into new header file
0b1f77e74b5a asm-generic/iomap.h: remove ARCH_HAS_IOREMAP_xx macros
32cc0b7c9d50 powerpc: add pte_free_defer() for pgtables sharing page
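
Since makedumpfile fails in get_vmemmap_list_info, one way to shorten the candidate list is to keep only the commits whose subject mentions vmemmap. This is just a heuristic sketch (the helper name `filter_vmemmap` is made up here) and it does not prove which commit is responsible.

```shell
#!/bin/sh
# filter_vmemmap: read `git log --oneline` output on stdin and keep only the
# commits whose subject line mentions "vmemmap" (case-insensitive).
filter_vmemmap() {
    grep -i 'vmemmap'
}

# Usage, from inside a kernel checkout:
#   git log --oneline 1c59d383390f..6c1b980a7e79 -- arch/powerpc/include/ | filter_vmemmap
```

Applied to the list above, this keeps f2b79c0d7968 and 368a0590d954.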

@gursewak1997
Member Author

BZ issue: https://bugzilla.redhat.com/show_bug.cgi?id=2241399

@dustymabe
Member

36e826b landed in v6.7-rc1, which should be in kernel-6.7.0-0.rc1.20231114git9bacdd8996c7.17.fc40, but for some reason our tests are still failing.

@dustymabe
Member

According to https://bugzilla.redhat.com/show_bug.cgi?id=2241399#c15 it looks like we are now blocked on #1598 before we can unpin kexec-tools in rawhide and verify this is fixed.

@dustymabe
Member

We are now unblocked on #1598

marmijo added a commit to marmijo/fedora-coreos-config that referenced this issue Dec 14, 2023
The fix for coreos/fedora-coreos-tracker#1588
landed upstream and this test is now passing so let's remove it from
the denylist.
@marmijo
Member

marmijo commented Dec 14, 2023

I tested ext.config.kdump.crash using kexec-tools-2.0.27-3.fc40 in a ppc64le debug pod and it passed successfully. I opened coreos/fedora-coreos-config#2768 to remove the denylist entry.

gursewak1997 pushed a commit to coreos/fedora-coreos-config that referenced this issue Dec 14, 2023
The fix for coreos/fedora-coreos-tracker#1588
landed upstream and this test is now passing so let's remove it from
the denylist.
@dustymabe
Member

Thanks!

aaradhak pushed a commit to aaradhak/fedora-coreos-config that referenced this issue Mar 18, 2024
The fix for coreos/fedora-coreos-tracker#1588
landed upstream and this test is now passing so let's remove it from
the denylist.