
Rescan enclosure sysfs path on import #12095

Merged: 1 commit merged into openzfs:master on Oct 4, 2021

Conversation

tonyhutter (Contributor)

Motivation and Context

Fix vdev_enc_sysfs_path not getting updated on import

Description

When you create a pool, ZFS writes vd->vdev_enc_sysfs_path with the enclosure sysfs path used for the fault LEDs, like:

  vdev_enc_sysfs_path = /sys/class/enclosure/0:0:1:0/SLOT8

However, this enclosure path doesn't get updated on successive imports, even if the enclosure path to the disk changes. This patch fixes the issue.

Fixes: #11950

How Has This Been Tested?

On a disk enclosure, I used a fake vdev_enc_sysfs_path set from userspace:

--- a/lib/libzutil/os/linux/zutil_import_os.c
+++ b/lib/libzutil/os/linux/zutil_import_os.c
@@ -853,6 +853,13 @@ update_vdev_config_dev_strs(nvlist_t *nv)
                /* Add enclosure sysfs path (if disk is in an enclosure). */
                upath = zfs_get_underlying_path(path);
                spath = zfs_get_enclosure_sysfs_path(upath);
+
+               char *new_spath = NULL;
+               new_spath = getenv("ZPOOL_CONFIG_VDEV_ENCPATH_OVERRIDE");
+               if (new_spath) {
+                       spath = strdup(new_spath);
+               }
+
                if (spath)
                        nvlist_add_string(nv, ZPOOL_CONFIG_VDEV_ENC_SYSFS_PATH,
                            spath);

I then verified with zdb -C that the new fake config got set. Then I exported/imported the pool without the fake path and saw the pool correctly set the real vdev_enc_sysfs_path.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)


nscfreny commented May 21, 2021

Tested d352a62 on a pool with wrong vdev_enc_sysfs_path

vdev_enc_sysfs_path corrected?

  • reboot/import-cache: no
  • manual export/import: yes

ZNikke commented May 21, 2021

@tonyhutter This patch does indeed improve the situation: an exported pool with an incorrect enclosure path gets the right one after import. Also, creating a new pool gets the enclosure path set now.

I haven't been able to test with the enclosure path changing between reboots, but the results from @nscfreny seem to show there's still work to be done... I'll be on site Monday and try anyway; making our enclosure device move is as simple as removing a few disks from the enclosure and rebooting.

ZNikke commented May 21, 2021

IMO, the ZPOOL_CONFIG_VDEV_ENCPATH_OVERRIDE could be made part of the patch together with some tests to avoid regressions in this area. At least some basics could be covered without requiring actual enclosure hardware.

Something along the lines of:

zpool create tank disk1

enc2=$(mktemp -d)

env ZPOOL_CONFIG_VDEV_ENCPATH_OVERRIDE=$enc2 zpool attach tank disk1 disk2

Run zdb -C tank and verify that disk2 has its enclosure path set to $enc2.

And build on scenarios from there. A reboot or equivalent should clear/change the enclosure path of disk2 depending on hardware. export followed by import with ZPOOL_CONFIG_VDEV_ENCPATH_OVERRIDE set should have the enclosure path set to the override, etc.

This could then be improved on to encompass zed related tests by doing:

echo 0 > $enc2/fault
echo 0 > $enc2/locate

to get the LED entries in place. We can then trigger various states for disk2 and verify that zed-stuff flips the led entries as appropriate.
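The steps above could be stitched into a single sketch. Only the fake-enclosure setup is runnable anywhere; the zpool/zdb steps need a real test rig, and ZPOOL_CONFIG_VDEV_ENCPATH_OVERRIDE is the debug-only variable from the patch earlier in this thread, not a shipped feature:

```shell
#!/bin/sh
# Stand up a fake enclosure directory with the LED entries zed would flip.
enc2=$(mktemp -d)
echo 0 > "$enc2/fault"
echo 0 > "$enc2/locate"

# The rest needs real disks and a ZFS install (sketch only):
#   zpool create tank disk1
#   env ZPOOL_CONFIG_VDEV_ENCPATH_OVERRIDE="$enc2" zpool attach tank disk1 disk2
#   zdb -C tank | grep -q "$enc2"   # disk2 should carry the override path
echo "fake enclosure at $enc2"
```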

Your thoughts?

nscfreny

We have a spare HPE Apollo 4510 with a few disks in it, and since it has a smartpqi controller, vdev_enc_sysfs_path will be offset by one if I create a logical volume on one of the drives and reboot.

Unfortunately I'll be in zoom meetings rest of the day but I can probably do some testing during the weekend.

(review comment on module/zfs/vdev.c: outdated, resolved)
nscfreny commented May 23, 2021

I think I may have a lead on why reboot/import-cache does not correct a wrong vdev_enc_sysfs_path in vdev_copy_path_impl().

  • zpool import -c /etc/zfs/zpool.cache -aN: svd->vdev_enc_sysfs_path is read from zpool.cache, which may be wrong and also has same value as dvd->vdev_enc_sysfs_path.
  • zpool import without cache file: svd->vdev_enc_sysfs_path is correct.

Edit: A temporary workaround for import-cache could be to have zfs-import-cache.service run a script that extracts the pool names from zpool.cache, removes zpool.cache, and imports the pools by name.
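That workaround could be sketched as below. One assumption (worth verifying) is that zdb -C -U <cachefile> prints each cached pool's config with the pool name at column 0 followed by a colon, so name extraction is a one-line awk filter:

```shell
#!/bin/sh
# Pull pool names out of zdb's text dump of the cache file.
# Assumes each cached pool config starts with "poolname:" at column 0.
extract_pool_names() {
    awk -F: '/^[^[:space:]]/ { print $1 }'
}

# Intended use from a zfs-import-cache.service wrapper (needs root; sketch):
#   zdb -C -U /etc/zfs/zpool.cache | extract_pool_names > /tmp/pools
#   rm /etc/zfs/zpool.cache
#   while read -r p; do zpool import -N "$p"; done < /tmp/pools
```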

@ahrens added the "Status: Code Review Needed (Ready for review and testing)" label on May 24, 2021
tonyhutter (Contributor, Author)

I think I may have a lead on why reboot/import-cache does not correct a wrong vdev_enc_sysfs_path in vdev_copy_path_impl().

Thanks, I forgot to test with a cache. I'll see if I can find a permanent fix for that case and add it to this PR.

ZNikke commented May 24, 2021

The only case I can think of where using an encpath from the cache could be useful would be for MISSING/UNAVAILABLE devices, if it's valid, that is.

nscfreny commented May 25, 2021

Discovered that if vdev_path from zpool.cache does not match reality after a reboot, vdev_enc_sysfs_path will also be corrected on the import retry. This will of course not work with WWN device names, which do not change, but vdev_enc_sysfs_path can still be messed up by disks going missing after a reboot.

Here I cheated by editing zpool.cache to look like it would have if a disk with a lower slot number had "ceased to exist" after reboot (a scary error message, considering the retry was successful).

zpool import -c /etc/zfs/zpool.cache -a
cannot import 'testpool5': no such pool or dataset
        Destroy and re-create the pool from
        a backup source.
cachefile import failed, retrying

grep vdev_copy_path /proc/spl/kstat/zfs/dbgmsg | tail -1
1621965366   vdev.c:2397:vdev_copy_path_impl(): vdev_copy_path: vdev 17314167603364500466: vdev_enc_sysfs_path changed from '/sys/class/enclosure/3:0:8:0/1' to '/sys/class/enclosure/3:0:7:0/1'

ZPOOL_SCRIPTS_AS_ROOT=1 zpool status -c ses
  pool: testpool5
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM      enc  encdev  slot  fault_led  locate_led
        testpool5   ONLINE       0     0     0
          sdk       ONLINE       0     0     0  3:0:7:0    sg22     2          0           0

errors: No known data errors

tonyhutter (Contributor, Author)

So it looks like a cache import is a different codepath that doesn't call get_configs()->fix_paths(). If we were to update the sysfs path on a cache import, it would be at:

zpool_find_import_cached(libpc_handle_t *hdl, importargs_t *iarg)
	< fixup sysfs path here >
        if ((dst = zutil_refresh_config(hdl, src)) == NULL) 

I'm looking into the cleanest way of doing that.

nscfreny

I believe zpool_find_import_cached(libpc_handle_t *hdl, importargs_t *iarg) does this when iarg->scan is set.

See near idata.scan = B_TRUE; in cmd/zpool/zpool_main.c.

tonyhutter (Contributor, Author)

@nscfreny that kind of defeats the purpose of doing a cache import, though. You also can't pass both -c and -s to zpool import.

nscfreny

@tonyhutter a cache import that fails because a device path has changed will automatically retry with idata.scan = B_TRUE; and still only import pools that are in the cache. But there might, of course, be negative consequences to triggering this fallback by default?

tonyhutter (Contributor, Author)

a cache import that fail because a device path has changed will automatically retry with idata.scan = B_TRUE

I think that's true for the common case. But if you create your pool with /dev/disk/by-id/ devices, the disk can move in the enclosure while the id stays the same. In that case, I'm not sure it would do the auto rescan.

@tonyhutter force-pushed the rescan_sysfs_path branch 2 times, most recently from 28819b4 to 493807c on June 9, 2021 00:16
(review comment on lib/libzutil/zutil_pool.c: outdated, resolved)
tonyhutter (Contributor, Author)

This is hitting build errors on Ubuntu/Debian. I need to look into those...

@tonyhutter force-pushed the rescan_sysfs_path branch 2 times, most recently from 93e077a to eb6f849 on June 11, 2021 17:38
tonyhutter (Contributor, Author)

Looks like there's some ABI stuff I need to update:

$ make checkabi
...
4 Added functions:

  'function int for_each_vdev(zpool_handle_t*, pool_vdev_iter_f, void*)'    {for_each_vdev}
  'function int for_each_vdev_in_nvlist(nvlist_t*, pool_vdev_iter_f, void*)'    {for_each_vdev_in_nvlist}
  'function void update_vdevs_config_dev_sysfs_path(nvlist_t*)'    {update_vdevs_config_dev_sysfs_path}
  'function nvlist_t* zpool_get_config(zpool_handle_t*, nvlist_t**)'    {zpool_get_config}

Makefile:1085: recipe for target 'checkabi' failed

@tonyhutter force-pushed the rescan_sysfs_path branch 2 times, most recently from 4902c11 to b6f685d on June 17, 2021 00:05
tonyhutter (Contributor, Author)

The ZTS failures are from a NULL pointer dereference kernel panic. I'm looking into that...

tonyhutter (Contributor, Author)

The checkstyle failure is a known issue (#12459)

@tonyhutter force-pushed the rescan_sysfs_path branch 2 times, most recently from 32070a3 to 61154dc on August 11, 2021 22:19
tonyhutter (Contributor, Author)

I think these failures are unrelated to this PR. Reviewers - please take another look.

tonyhutter (Contributor, Author)

My latest push gets rid of the ABI changes and is correctly building again on FreeBSD. Reviewers, please take another look when you get a chance.

tonyhutter (Contributor, Author)

My latest push reworks this code again to simplify it even further.

behlendorf (Contributor) left a comment
Thanks for further rework, this looks good to me and turned out nicely.

When you create a pool, zfs writes vd->vdev_enc_sysfs_path with the
enclosure sysfs path to the fault LEDs, like:

    vdev_enc_sysfs_path = /sys/class/enclosure/0:0:1:0/SLOT8

However, this enclosure path doesn't get updated on successive imports
even if enclosure path to the disk changes.  This patch fixes the issue.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
@behlendorf added the "Status: Accepted (Ready to integrate: reviewed, tested)" label and removed the "Status: Code Review Needed (Ready for review and testing)" label on Oct 4, 2021
@behlendorf behlendorf merged commit 2a8430a into openzfs:master Oct 4, 2021
tonyhutter added a commit to tonyhutter/zfs that referenced this pull request Nov 2, 2021
tonyhutter added a commit to tonyhutter/zfs that referenced this pull request Nov 2, 2021
ofaaland pushed a commit to LLNL/zfs that referenced this pull request Nov 4, 2021
tonyhutter added a commit to tonyhutter/zfs that referenced this pull request Nov 13, 2021
tonyhutter added a commit to tonyhutter/zfs that referenced this pull request Nov 13, 2021
Labels: Status: Accepted (Ready to integrate: reviewed, tested)

Successfully merging this pull request may close these issues:

  • enclosure path not updated when it changes between reboots

6 participants