Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit an error message before MMP suspends pool #7048

Merged
merged 1 commit into from
Jan 17, 2018
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions module/zfs/mmp.c
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
#include <sys/mmp.h>
#include <sys/spa.h>
#include <sys/spa_impl.h>
#include <sys/time.h>
#include <sys/vdev.h>
#include <sys/vdev_impl.h>
#include <sys/zfs_context.h>
Expand Down Expand Up @@ -429,6 +430,10 @@ mmp_thread(void *arg)
*/
if (!suspended && mmp_fail_intervals && multihost &&
(start - mmp->mmp_last_write) > max_fail_ns) {
cmn_err(CE_WARN, "MMP writes to pool '%s' have not "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should mention that the pool will be suspended in this message. This will help connect the dots if the system does end up panicking due to the failure mode.

How about something like MMP writes to pool 'name' have not succeeded in over Xs. The pool will be suspended.?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, regardless of whether the system panics or not, zio_suspend() will emit a message that the pool is being suspended. But I can add what you suggest to the new message if that's what you want.

I'm not sure how to address the style check failure. Running 'make checkstyle' locally passes. The errors listed http://build.zfsonlinux.org/builders/Ubuntu%2017.10%20x86_64%20%28STYLE%29/builds/1998/steps/shell_2/logs/stdio don't seem to have anything to do with this change. Any hints?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhammond-intel I think this makes sense as a quick fix. Longer term I think it would be nice to pass a reason to zio_suspend() so it can print a more descriptive error message.

Go ahead and ignore the style check failures for this one. They're not your fault. I updated the VM running the checks last week and pulled in a new version of shellcheck which appears to be upset about new issue which haven't yet been addressed/suppressed. See #7040 (which still needs work).

"succeeded in over %llus; suspending pool",
spa_name(spa),
NSEC2SEC(start - mmp->mmp_last_write));
zio_suspend(spa, NULL);
}

Expand Down