Space checking of special device removal is too strict #9142
Labels: Type: Defect (incorrect behavior, e.g. crash, hang)
Comments
cc @don-brady
I've found one more issue in the mentioned code: the mc->mc_groups > 1 condition quoted above does not allow removing a special vdev when there are two of them and the capacity of the normal vdev(s) alone is insufficient. For example, if md0, md1 and md2 have the same size, then this test fails, while it should not:
ghost pushed a commit to zfsonfreebsd/ZoF that referenced this issue on Mar 10, 2020:
Issue openzfs#9142 describes an error in the checks for device removal that can prevent removal of special allocation class vdevs in some situations. Enhance alloc_class/alloc_class_012_pos to check situations where this bug occurs. Update zts-report with knowledge of issue openzfs#9142.
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
ghost mentioned this issue on Mar 10, 2020
ghost pushed a commit to zfsonfreebsd/ZoF that referenced this issue on Mar 10, 2020:
Update zts-report with knowledge of issue openzfs#9142
behlendorf pushed a commit that referenced this issue on Mar 12, 2020:
Issue #9142 describes an error in the checks for device removal that can prevent removal of special allocation class vdevs in some situations. Enhance alloc_class/alloc_class_012_pos to check situations where this bug occurs. Update zts-report with knowledge of issue #9142.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #10116
Issue #9142
ahrens added a commit to ahrens/zfs that referenced this issue on Dec 11, 2020:
The space in special devices is not included in spa_dspace (or dsl_pool_adjustedsize(), or the zfs `available` property). Therefore there is always at least as much free space in the normal class, as there is allocated in the special class(es). And therefore, there is always enough free space to remove a special device.

However, the checks for free space when removing special devices did not take this into account. This commit corrects that.

Additionally, when removing the 2nd-to-last special device, its space would not be reallocated to the last remaining special device, because mc_groups has already been decremented. That is also fixed.

Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes openzfs#9142
jsai20 pushed a commit to jsai20/zfs that referenced this issue on Mar 30, 2021:
Issue openzfs#9142 describes an error in the checks for device removal that can prevent removal of special allocation class vdevs in some situations. Enhance alloc_class/alloc_class_012_pos to check situations where this bug occurs. Update zts-report with knowledge of issue openzfs#9142.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes openzfs#10116
Issue openzfs#9142
System information
Describe the problem you're observing
When removing a special device, sometimes we get an error

    cannot remove <devname>: out of space

when there's actually sufficient free space. We're especially likely to hit this when the device to remove is mostly empty.

The problem is that `spa_vdev_remove_top_check()` does not take into account the free space in the device to be removed. For example, imagine a pool with 1TB available space (in the normal class, per `zfs list`), and a special device of 1.1TB with 0.1TB allocated. Logically, there is plenty of free space to move the 0.1TB of allocated space. However, the check being performed is

    special_device_size (1.1TB) < normal_available_space (1TB)

which fails.

The solution is probably to always add the free space of the special class to `available` (not just when there will be some devices left in the class after removal). In the above example, we should be checking

    special_device_size (1.1TB) < normal_available_space (1TB) + special_available_space (1TB)

which will succeed. This is similar to the logic when removing a normal device, where we check

    removing_device_size < normal_available_space

where `normal_available_space` includes the free space on the removing device.

Describe how to reproduce the problem
The `alloc_class_013_pos` test operates very close to this limit, so making the dedup device slightly larger, or the zvol slightly larger, or the normal devices slightly smaller, will trigger this.

Include any warning/errors/backtraces from the system logs
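The arithmetic in the problem description can be sketched as a toy model. This is only an illustration of the two comparisons quoted in the report, using the 1TB / 1.1TB / 0.1TB figures from the example (expressed in GB so the values are integers). It is not the code in `spa_vdev_remove_top_check()`; the class and function names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SpecialVdev:
    """Hypothetical stand-in for a special device; sizes are in GB."""
    size: int        # total capacity of the device
    allocated: int   # space currently in use on the device

    @property
    def free(self) -> int:
        return self.size - self.allocated


def strict_check(vdev: SpecialVdev, normal_available: int) -> bool:
    """The comparison the report says is performed today:
    special_device_size < normal_available_space."""
    return vdev.size < normal_available


def proposed_check(vdev: SpecialVdev, normal_available: int,
                   special_available: int) -> bool:
    """The comparison the report proposes:
    special_device_size < normal_available_space + special_available_space."""
    return vdev.size < normal_available + special_available


# Figures from the example: 1.1TB special device with 0.1TB allocated,
# and 1TB available in the normal class.
vdev = SpecialVdev(size=1100, allocated=100)
normal_available = 1000

print("strict check passes:  ", strict_check(vdev, normal_available))
print("proposed check passes:", proposed_check(vdev, normal_available, vdev.free))
```

With these figures the strict comparison rejects the removal (1100 is not less than 1000) even though only 100 GB actually has to move, while the proposed comparison accepts it (1100 < 2000).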