mantle/kola: Auto enable rerun success in some failure scenarios #3494

dustymabe · 2023-05-30T20:56:00Z

This PR automatically enables rerun success for some failure scenarios, including timeouts and some console checks.

This tag will be used when the test framework detects tests failing in specific ways that we've decided we want to allow rerun success for.

This will allow tests that get aborted due to a timeout to not fail the overall run if a rerun is attempted and it passes. One example of where this is useful is a test that times out on initialization and never reaches the machine via SSH (which we see regularly). In that case, is the failure an issue with the platform bringing up the machine or a fundamental issue with the software inside FCOS/RHCOS? If it's an issue with the platform and the rerun succeeds, then we don't want to see a failure. If it's a fundamental issue with the software inside (e.g. Ignition fails), then the rerun will fail anyway and this will do no harm.
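The rerun rule described above can be sketched roughly as follows. This is a minimal illustration, not the kola implementation: the `testResult` type, `hasTag`, and `overallSuccess` are hypothetical names, and only the tag string `allow-rerun-success` comes from the commit title.

```go
package main

import "fmt"

// allowRerunSuccessTag mirrors the tag name from the commit title. How kola
// actually stores and attaches it is an assumption of this sketch.
const allowRerunSuccessTag = "allow-rerun-success"

// testResult is a hypothetical record of one test's first-run outcome.
type testResult struct {
	name   string
	passed bool
	tags   []string
}

// hasTag reports whether want is present in tags.
func hasTag(tags []string, want string) bool {
	for _, t := range tags {
		if t == want {
			return true
		}
	}
	return false
}

// overallSuccess reports whether the whole run should pass. A first-run
// failure is forgiven only if the test carries the tag and its rerun passed.
func overallSuccess(firstRun []testResult, rerunPassed map[string]bool) bool {
	for _, r := range firstRun {
		if r.passed {
			continue
		}
		if hasTag(r.tags, allowRerunSuccessTag) && rerunPassed[r.name] {
			continue
		}
		return false
	}
	return true
}

func main() {
	firstRun := []testResult{
		{name: "basic", passed: true},
		// Timed out before SSH came up, so the framework tagged it.
		{name: "ssh-timeout", passed: false, tags: []string{allowRerunSuccessTag}},
	}
	// Case 1: the rerun of the tagged test passed.
	fmt.Println(overallSuccess(firstRun, map[string]bool{"ssh-timeout": true}))
	// Case 2: the rerun of the tagged test failed as well.
	fmt.Println(overallSuccess(firstRun, map[string]bool{"ssh-timeout": false}))
}
```

A failed test without the tag still fails the run even if its rerun passes, which matches the intent that rerun success is only granted for failure modes the framework has explicitly opted in.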

This patch enables us to have console checks that are non-fatal: they print a log message to the screen and nothing more, which was stated as desirable in [1]. This commit also reworks the implementation of the console/journal checks in runTest() to deduplicate code. It has the side effect of making SkipConsoleWarnings apply to journal checks too, but I think that's actually a benefit and not a negative. [1] coreos#3450 (comment)

This lets us define console checks that, if configured to do so, mark a test as allowed to pass overall when its rerun succeeds.

Now that we have warnOnly and allowRerunSuccess capabilities in our consoleChecks let's add back in the kernel soft lockup check that was removed in 7283c89.

HasString() conveniently exists, so let's use it for checking if a tag is set.

Since we call runProvidedTests() for both the first run and the rerun let's not call the variable firstRunErr since in the nested call that actually won't be accurate. Let's just call it runErr instead.

When the platform fails to start a machine, it's usually the platform having some internal failure. If it succeeds in the rerun then just be happy with that.

gursewak1997

Looks pretty good overall.

dustymabe added 8 commits May 30, 2023 16:49

mantle/kola: Add internal "allow-rerun-success" tag

0624455

This tag will be used when the test framework detects tests failing in specific ways that we've decided we want to allow rerun success for.

mantle/kola: allow consoleChecks that allow rerun success

75dd1ae

This lets us define console checks that, if configured to do so, mark a test as allowed to pass overall when its rerun succeeds.

mantle/kola: add back in kernel soft lockup consolecheck

abc48de

Now that we have warnOnly and allowRerunSuccess capabilities in our consoleChecks let's add back in the kernel soft lockup check that was removed in 7283c89.

mantle/kola: use HasString() for checking if tag is set

2d36ff8

This function conveniently exists so let's use it.

mantle/kola: rename variable

9722e33

Since we call runProvidedTests() for both the first run and the rerun let's not call the variable firstRunErr since in the nested call that actually won't be accurate. Let's just call it runErr instead.

mantle/kola: allow rerun success for platform machine start failure

10dc540

In this case it's usually the platform having some internal failure. If it succeeds in the rerun then just be happy with that.

dustymabe requested review from gursewak1997 and marmijo May 31, 2023 15:47

This was referenced May 31, 2023

mantle/kola: add detection for a kernel soft lockup #3450

Merged

Revert "mantle/kola: add detection for a kernel soft lockup" #3481

Merged

gursewak1997 approved these changes May 31, 2023


dustymabe merged commit 8f7c06a into coreos:main May 31, 2023
2 checks passed

dustymabe deleted the dusty-more-rerun-success branch May 31, 2023 17:42
