Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[2.1.12] Probe vdevs before marking removed #14897

Merged

Conversation

behlendorf
Copy link
Contributor

Motivation and Context

Backport of #14861.

Description

Before allowing the ZED to mark a vdev as REMOVED due to a hotplug event confirm that it is non-responsive with probe. Any device which can be successfully probed should be left ONLINE to prevent a healthy pool from being incorrectly SUSPENDED. This may occur for at least the following two scenarios.

  1. Drive expansion (zpool online -e) in VMware environments.
    If, during the partition resize operation, a partition is
    removed and re-created then udev will send a removed event.

  2. Re-scanning the namespaces of an NVMe device (nvme ns-rescan)
    may result in a udev remove and add event being delivered.

Finally, update the ZED to only kick in a spare when the removal was successful.

Before allowing the ZED to mark a vdev as REMOVED due to a
hotplug event confirm that it is non-responsive with probe.
Any device which can be successfully probed should be left
ONLINE to prevent a healthy pool from being incorrectly
SUSPENDED.  This may occur for at least the following two
scenarios.

1) Drive expansion (zpool online -e) in VMware environments.
   If, during the partition resize operation, a partition is
   removed and re-created then udev will send a removed event.

2) Re-scanning the namespaces of an NVMe device (nvme ns-rescan)
   may result in a udev remove and add event being delivered.

Finally, update the ZED to only kick in a spare when the
removal was successful.

Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#14859
Closes openzfs#14861
@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label May 25, 2023
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels May 26, 2023
@behlendorf behlendorf merged commit e2176f1 into openzfs:zfs-2.1.12-staging May 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants