Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix dynamic vdev expansion (zpool online -e) #822

Closed
wants to merge 5 commits into from

Conversation

dechamps
Copy link
Contributor

This series of patches fixes a bunch of issues related to dynamic vdev expansion (zpool online -e), especially on whole disks.

There were numerous bugs in the whole code path: zpool_relabel_disk wasn't displaying errors correctly (5534b70), zpool_vdev_online wasn't using the right device path when relabelling (5d49b12), efi_write wasn't able to rescan partitions due to EBUSY (2f30140), and efi_use_whole_disk was broken due to a regression in e5dc681 ( 8d45c2d). These patches fix all of these problems. See the individual commits for details.

After applying these patches, zpool online -e should work gracefully in all circumstances.

Fixes #808.

@behlendorf: commit 2f30140 uses a bunch of kernel-provided functions (get_gendisk, bdget, blkdev_get, ioctl_by_bdev). Expect trouble with old kernels.

dechamps added 4 commits July 6, 2012 15:44
The error handling code around zpool_relabel_disk() is either inexistent
or wrong. The function call itself is not checked, and
zpool_relabel_disk() is generating error messages from an unitialized
buffer.

Before:

    # zpool online -e homez sdb; echo $?
    `: cannot relabel 'sdb1': unable to open device: 2
    0

After:

    # zpool online -e homez sdb; echo $?
    cannot expand sdb: cannot relabel 'sdb1': unable to open device: 2
    1
Currently, zpool_vdev_online() calls zpool_relabel_disk() with a short
partition device name, which is obviously wrong because (1)
zpool_relabel_disk() expects a full, absolute path to use with open()
and (2) efi_write() must be called on an opened disk device, not a
partition device.

With this patch, zpool_relabel_disk() gets called with a full disk
device path. The path is determined using the same algorithm as
zpool_find_vdev().
Currently, zpool online -e (dynamic vdev expansion) doesn't work on
whole disks because we're invoking ioctl(BLKRRPART) from userspace
while ZFS still has a partition open on the disk, which results in
EBUSY.

This patch moves the BLKRRPART invocation from the zpool utility to the
module. Specifically, this is done just before opening the device in
vdev_disk_open() which is called inside vdev_reopen(). This requires
jumping through some hoops to get to the disk device from the partition
device, and to make sure we can still open the partition after the
BLKRRPART call.

Note that this new code path is triggered on dynamic vdev expansion
only; other actions, like creating a new pool, are unchanged and still
call BLKRRPART from userspace.
Commit e5dc681 changed EFI_NUMPAR from 9 to 128. This means that the
on-disk EFI label has efi_nparts = 128 instead of 9. The index of the
reserved partition, however, is still 8. This breaks
efi_use_whole_disk(), which uses efi_nparts-1 as the index of the
reserved partition.

This commit fixes efi_use_whole_disk() when the index of the reserved
partition is not efi_nparts-1. It rewrites the algorithm and makes it
more robust by using the order of the partitions instead of their
numbering. It assumes that the last non-empty partition is the reserved
partition, and that the non-empty partition before that is the data
partition.
This reverts parts of 2f30140 which added non-exclusive versions of
vdev_bdev_open and vdev_bdev_close. These calls were needed in the first
versions of the 2f30140 patch, but it turns out they're not needed
anymore and I just forgot to remove them.
@behlendorf
Copy link
Contributor

After making the few minor changes suggested above I've merged the following 3 changes in to master.

cee43a7 Fix efi_use_whole_disk() when efi_nparts == 128.
7608bd0 Use the right device path when relabeling.
8adf486 Fix error handling for "zpool online -e".

That just leaves the partition scanning change which still needs a little clean up. I'm going to close this pull request and ask you to open a new one based on master with the needed changes.

@behlendorf behlendorf closed this Jul 12, 2012
@dechamps
Copy link
Contributor Author

You mean a new pull request with 2f30140 and be99500 squashed together? You also mentioned blkdev_get() being non-portable but I don't have the necessary test systems to produce the needed wrapper, so I assumed you were going to do it.

@behlendorf
Copy link
Contributor

Yeah, basically I was asking just to have them squashed together and to have the blkdev_git() issue fixed. I'm fairly well set up to do that so let me just do the leg work on this first and I'll open the pull request for you to review.

pcd1193182 pushed a commit to pcd1193182/zfs that referenced this pull request Sep 26, 2023
…object_agent/rustix-0.37.13

Bump rustix from 0.37.11 to 0.37.13 in /cmd/zfs_object_agent
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

vdev expansion doesn't work
2 participants