Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZTS fails to detect killed processes. #9003

Merged
merged 1 commit into from
Jul 10, 2019

Conversation

AttilaFueloep
Copy link
Contributor

Motivation and Context

If a process gets killed by a signal ksh(1) returns 256+signum. log_mustnot was expecting a return value of 128+signum, failing to detect killed processes. This resulted in test passing where they should not.

Description

How Has This Been Tested?

Verified that tests that cause segfaults passed without this change and failed, as they should, with this change.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

@AttilaFueloep
Copy link
Contributor Author

The tests failing is expected. Once #9001 is merged they should pass again.

By the way, how can I see which tests failed, there is no obvious link on the "Details" page.

@behlendorf
Copy link
Contributor

Thanks for running this down. I'd wondered about this in the past but never investigated further. According the ksh man pages for both illumos and Linux, 256+signum is the expected for abnormal termination. So I'm not sure why this was ever set to 128.

       If it terminates abnormally, its value is 256+signum. The name of the signal
       corresponding  to  the exit  status can be obtained by way of the -l option of
       the kill built-in utility.

By the way, how can I see which tests failed, there is no obvious link on the "Details" page.

You're going to want to click on the summary link for the ZTS results. This is a summarized version of the full log which lists which tests failed and which (if any) of those failures were unexpected. It will also include full extracts for those specific test failures. For example:

Tests with results other than PASS that are unexpected:
    FAIL cli_root/zfs_send/zfs_send-b (expected PASS)
    FAIL cli_root/zfs_send/zfs_send_004_neg (expected PASS)
    FAIL slog/slog_replay_volume (expected PASS)

Once #9001 is merged they should pass again.

That makes sense. Once #9001 is merged will want to get this change rebased to verify all is well.

@behlendorf behlendorf requested a review from jwk404 July 8, 2019 17:16
@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Jul 8, 2019
@@ -199,11 +199,11 @@ function log_neg_expect
print -u2 $($out)
_printerror "$@" "unexpectedly exited $status (File not found)"
# bus error - core dump
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: let's make it explicit where 263 (and 267) come from in the comments. Perhaps something like this.

# bus error - core dump (256+signal, SIGBUS=7)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

@AttilaFueloep
Copy link
Contributor Author

So I'm not sure why this was ever set to 128.

Maybe this was run by sh or bash formerly. They use 128+signum.

Thanks for the link! I will address your review and rebase once #9001 is merged.

@behlendorf
Copy link
Contributor

#9001 was merged so feel free to rebase this when convenient.

@AttilaFueloep
Copy link
Contributor Author

AttilaFueloep commented Jul 8, 2019

Squashed and rebased. Let's see how the tests go.

@richardelling
Copy link
Contributor

For the archives, ksh was the de-facto shell at Sun at the time. bash was not guaranteed to be available and sh is simply too limited, left there for backwards compatibility.

That said, it was well known that 256 + errno is expected, thanks for tracking this down.

log_neg_expect was using the wrong exit status to detect if a process
got killed by SIGSEGV or SIGBUS, resulting in false positives.

Signed-off-by: Attila Fülöp <attila@fueloep.org>
@AttilaFueloep
Copy link
Contributor Author

Sorry, I've to force push again to fix the commit message. There is a 73 characters wide line and a typo.

@codecov
Copy link

codecov bot commented Jul 9, 2019

Codecov Report

Merging #9003 into master will increase coverage by 0.21%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9003      +/-   ##
==========================================
+ Coverage    78.5%   78.72%   +0.21%     
==========================================
  Files         401      401              
  Lines      120145   120145              
==========================================
+ Hits        94322    94580     +258     
+ Misses      25823    25565     -258
Flag Coverage Δ
#kernel 79.49% <ø> (+0.1%) ⬆️
#user 66.57% <ø> (+0.34%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1d20b76...7a8b9a2. Read the comment docs.

@behlendorf
Copy link
Contributor

The test run encountered a surprising number of of rare test cases failures (4 total, one per bot). While unlikely, we have seen each of the failures at least once before. I've resubmitted those builds to repeat the testing.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Jul 9, 2019
@AttilaFueloep
Copy link
Contributor Author

With the exception of the kernel.org build failure, which I think is unrelated, everything else seems to be fine now.

@behlendorf
Copy link
Contributor

@AttilaFueloep looks good, I'll get this merged. The kernel.org failures are due to the 5.3 merge window now being open, so we'll need to see what's changed and sort that out.

@behlendorf behlendorf merged commit c3fba90 into openzfs:master Jul 10, 2019
TulsiJain pushed a commit to TulsiJain/zfs that referenced this pull request Jul 20, 2019
log_neg_expect was using the wrong exit status to detect if a process
got killed by SIGSEGV or SIGBUS, resulting in false positives.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes openzfs#9003
TulsiJain pushed a commit to TulsiJain/zfs that referenced this pull request Jul 20, 2019
log_neg_expect was using the wrong exit status to detect if a process
got killed by SIGSEGV or SIGBUS, resulting in false positives.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes openzfs#9003
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 13, 2019
log_neg_expect was using the wrong exit status to detect if a process
got killed by SIGSEGV or SIGBUS, resulting in false positives.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes openzfs#9003
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 22, 2019
log_neg_expect was using the wrong exit status to detect if a process
got killed by SIGSEGV or SIGBUS, resulting in false positives.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes openzfs#9003
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 23, 2019
log_neg_expect was using the wrong exit status to detect if a process
got killed by SIGSEGV or SIGBUS, resulting in false positives.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes openzfs#9003
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Sep 17, 2019
log_neg_expect was using the wrong exit status to detect if a process
got killed by SIGSEGV or SIGBUS, resulting in false positives.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes openzfs#9003
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Sep 18, 2019
log_neg_expect was using the wrong exit status to detect if a process
got killed by SIGSEGV or SIGBUS, resulting in false positives.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes openzfs#9003
tonyhutter pushed a commit that referenced this pull request Sep 26, 2019
log_neg_expect was using the wrong exit status to detect if a process
got killed by SIGSEGV or SIGBUS, resulting in false positives.

Reviewed-by: loli10K <ezomori.nozomu@gmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #9003
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants