tests: add a remote kdump test #3829

Merged (1 commit) on Jul 10, 2024

Conversation

@jbtrystram jbtrystram commented Jul 4, 2024

This test sets up two machines to check whether kdump successfully exports the vmcore to an SSH destination.

Fixes coreos/fedora-coreos-tracker#1753


This is not fully functional yet, but I think it is ready for review.
The test setup works (I can see the logs created on the remote machine), but kola still fails the test:

[coreos-assembler]$ /mnt/cosa/bin/kola run kdump.crash.ssh --ssh-on-test-failure 
⏭  Skipping kola test pattern "fcos.internet":
  👉 https://github.com/coreos/coreos-assembler/pull/1478
⏭  Skipping kola test pattern "podman.workflow":
  👉 https://github.com/coreos/coreos-assembler/pull/1478
=== RUN   kdump.crash.ssh
/home/core/crash/10.0.2.15-2024-07-04-15:55:25/vmcore-dmesg.txt
/home/core/crash/10.0.2.15-2024-07-04-15:55:25/vmcore.flat
--- FAIL: kdump.crash.ssh (105.02s)
) on machine cac011c1-b98c-499a-9116-ba14fed5e45f consolered crash
FAIL, output in tmp/kola/qemu-2024-07-04-1554-32683
Error: harness: test suite failed
2024-07-04T15:56:09Z cli: harness: test suite failed

Is there a flag to say that the machine is expected to reboot?

openshift-ci bot commented Jul 4, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

travier commented Jul 5, 2024

> is there some flag to say the machine is expected to reboot?

There is one to say that the VM is expected to crash: https://github.com/coreos/fedora-coreos-config/blob/testing-devel/tests/kola/kdump/crash/test.sh#L7

tags: skip-base-checks

You can likely find the corresponding option in the code to set that for this test.
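
For readers unfamiliar with the tag, here is a minimal Go sketch of how a harness could honor an opt-out tag such as `skip-base-checks`. The `testSpec` type and the two helper functions are hypothetical stand-ins, not kola's actual API; only the tag string comes from the linked test.

```go
package main

import "fmt"

// skipBaseChecksTag mirrors the "skip-base-checks" tag quoted above.
const skipBaseChecksTag = "skip-base-checks"

// testSpec is a hypothetical, simplified stand-in for a test registration.
// It is NOT the real kola type.
type testSpec struct {
	Name string
	Tags []string
}

// hasTag reports whether the test carries the given tag.
func hasTag(t testSpec, tag string) bool {
	for _, got := range t.Tags {
		if got == tag {
			return true
		}
	}
	return false
}

// shouldRunBaseChecks returns false for tests that opt out of the
// harness's generic post-run checks, for example because the test
// crashes the kernel on purpose.
func shouldRunBaseChecks(t testSpec) bool {
	return !hasTag(t, skipBaseChecksTag)
}

func main() {
	crash := testSpec{
		Name: "kdump.crash.ssh",
		Tags: []string{"kdump", skipBaseChecksTag},
	}
	fmt.Println(crash.Name, "runs base checks:", shouldRunBaseChecks(crash))
}
```

In the real code base the equivalent would be set wherever the test is registered; the exact field and constant names should be checked against the kola sources.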

mantle/kola/tests/ignition/kdump.go: 3 review threads (outdated, resolved)
@jbtrystram

> tags: skip-base-checks

Yep, that was it. Thanks @travier!

@jbtrystram jbtrystram force-pushed the kdump_ssh_test branch 2 times, most recently from 980ece7 to 3efbf2a Compare July 8, 2024 22:07
@jbtrystram jbtrystram marked this pull request as ready for review July 8, 2024 22:08
jbtrystram commented Jul 8, 2024

I think this is now ready for review.
I added a couple of retry loops to give kdump time to generate the initramfs and write the logs.

The test now passes:

[coreos-assembler]$ /mnt/bin/kola run kdump.crash.ssh
=== RUN   kdump.crash.ssh
--- PASS: kdump.crash.ssh (68.66s)
PASS, output in tmp/kola/qemu-2024-07-08-2208-10301

@jbtrystram

The RHCOS test fails on the SSH command that crashes the kernel.
I suspect the run did not test the latest revision.
Anyway, kdump works:

[    6.910879] kdump[559]: Kdump is using the default log level(3).
[    7.136098] kdump[605]: saving to core@10.0.2.2:/home/core/crash/10.0.2.15-2024-07-08-23:59:44
[    8.007655] kdump[605]: saving to core@10.0.2.2:/home/core/crash/10.0.2.15-2024-07-08-23:59:44
[    7.317830] kdump[609]: saving vmcore-dmesg.txt to core@10.0.2.2:/home/core/crash/10.0.2.15-2024-07-08-23:59:44
[    8.189387] kdump[609]: saving vmcore-dmesg.txt to core@10.0.2.2:/home/core/crash/10.0.2.15-2024-07-08-23:59:44
[    7.486731] kdump.sh[611]: 159+1 records in
[    7.490292] kdump.sh[611]: 159+1 records out
[    7.494151] kdump.sh[611]: 81809 bytes (82 kB, 80 KiB) copied, 0.000764738 s, 107 MB/s
[    8.358288] kdump.sh[611]: 159+1 records in
[    8.361851] kdump.sh[611]: 159+1 records out
[    8.365709] kdump.sh[611]: 81809 bytes (82 kB, 80 KiB) copied, 0.000764738 s, 107 MB/s
[    7.647105] kdump[614]: saving vmcore-dmesg.txt complete
[    8.518662] kdump[614]: saving vmcore-dmesg.txt complete
[    7.657912] kdump[616]: saving vmcore
[    8.529470] kdump[616]: saving vmcore
[    9.665279] kdump.sh[617]: 
Checking for memory holes                         : [  0.0 %] /                  
Checking for memory holes                         : [100.0 %] |                  
Excluding unnecessary pages                       : [100.0 %] \                  
Checking for memory holes                         : [100.0 %] -                  
Checking for memory holes                         : [100.0 %] /                  
Excluding unnecessary pages                       : [100.0 %] |                  
Copying data                                      : [ 36.7 %] \           eta: 2s
Copying data                                      : [ 96.5 %] -           eta: 0s
Copying data                                      : [100.0 %] /           eta: 0s
Copying data                                      : [100.0 %] |           eta: 0s
[   10.536834] kdump.sh[617]: 
Checking for memory holes                         : [  0.0 %] /                  
Checking for memory holes                         : [100.0 %] |                  
Excluding unnecessary pages                       : [100.0 %] \                  
Checking for memory holes                         : [100.0 %] -                  
Checking for memory holes                         : [100.0 %] /                  
Excluding unnecessary pages                       : [100.0 %] |                  
Copying data                                      : [ 36.7 %] \           eta: 2s
Copying data                                      : [ 96.5 %] -           eta: 0s
Copying data                                      : [100.0 %] /           eta: 0s
Copying data                                      : [100.0 %] |           eta: 0s
[    9.734010] kdump.sh[618]: 114846+1727 records in
[    9.737953] kdump.sh[618]: 114846+1727 records out
[    9.741836] kdump.sh[618]: 59162607 bytes (59 MB, 56 MiB) copied, 1.9093 s, 31.0 MB/s
[   10.605567] kdump.sh[618]: 114846+1727 records in
[   10.609512] kdump.sh[618]: 114846+1727 records out
[   10.613395] kdump.sh[618]: 59162607 bytes (59 MB, 56 MiB) copied, 1.9093 s, 31.0 MB/s
[    9.900371] kdump[621]: saving vmcore complete
[   10.771928] kdump[621]: saving vmcore complete
[    9.913585] kdump[623]: saving the /run/initramfs/kexec-dmesg.log to core@10.0.2.2:/home/core/crash/10.0.2.15-2024-07-08-23:59:44/
[   10.785086] kdump[623]: saving the /run/initramfs/kexec-dmesg.log to core@10.0.2.2:/home/core/crash/10.0.2.15-2024-07-08-23:59:44/
[   10.109477] kdump[630]: Executing final action systemctl reboot -f
[   10.981034] kdump[630]: Executing final action systemctl reboot -f
[   10.126948] systemd[1]: Shutting down.
[   10.998506] systemd[1]: Shutting down.
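
The log shows kdump saving `vmcore-dmesg.txt` and the vmcore under `/home/core/crash/` on the remote host, and the earlier failing run listed `vmcore-dmesg.txt` and `vmcore.flat` there. A test could assert on such a listing roughly as sketched below. This is an illustration only: `missingDumpFiles` is a hypothetical helper, and the way the listing is obtained (for example, a `find` over SSH) is assumed rather than taken from the PR.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// missingDumpFiles returns the entries of wanted that do not appear as a
// file name (last path element) in listing, which holds one path per line.
func missingDumpFiles(listing string, wanted []string) []string {
	found := make(map[string]bool)
	for _, line := range strings.Split(listing, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		found[path.Base(line)] = true
	}
	var missing []string
	for _, name := range wanted {
		if !found[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Paths copied from the first kola run in this conversation.
	listing := `/home/core/crash/10.0.2.15-2024-07-04-15:55:25/vmcore-dmesg.txt
/home/core/crash/10.0.2.15-2024-07-04-15:55:25/vmcore.flat`

	missing := missingDumpFiles(listing, []string{"vmcore.flat", "vmcore-dmesg.txt"})
	if len(missing) > 0 {
		fmt.Println("missing dump files:", missing)
		return
	}
	fmt.Println("all expected dump files present")
}
```

Matching on the file name rather than the full path avoids depending on the timestamped directory name, which changes on every crash.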

@jlebon jlebon left a comment

A string of nits, but looks good overall!

mantle/kola/tests/ignition/kdump.go: 10 review threads (outdated, resolved)
@jbtrystram jbtrystram force-pushed the kdump_ssh_test branch 2 times, most recently from 3051876 to 6136cc8 Compare July 10, 2024 12:05
This test sets up two machines to check whether kdump successfully
exports the vmcore to an SSH destination.

This sets a fairly large timeout for kdump to become active,
as generating the initramfs can take a long time on slower systems.

Fixes coreos/fedora-coreos-tracker#1753

@jlebon jlebon left a comment

Nice work, LGTM!

@jlebon jlebon merged commit b8958ed into coreos:main Jul 10, 2024
5 checks passed
jbtrystram added a commit to jbtrystram/fedora-coreos-config that referenced this pull request Jul 16, 2024
This test was recently added in coreos/coreos-assembler#3829
but hit the same issue as `ext.kdump.crash`.
This should be fixed in kexec-tools-2.0.28-12.fc40, so extending the snooze
for 5 days only as it should hit stable soon.
jbtrystram added a commit to coreos/fedora-coreos-config that referenced this pull request Jul 16, 2024
This test was recently added in coreos/coreos-assembler#3829
but hit the same issue as `ext.kdump.crash`.
This should be fixed in kexec-tools-2.0.28-12.fc40, so extending the snooze
for 5 days only as it should hit stable soon.
Development

Successfully merging this pull request may close these issues.

Add testing for kdump over network