Rework error handling in zpool_trim() #10372

Merged
merged 1 commit into from
May 28, 2020

jgallag88
Contributor

Motivation and Context

When a manual trim is run against an entire pool, errors about particular devices which don't support trim are suppressed. However, a non-zero status is returned. Moreover, if a wait was requested it is skipped, as if there had been a fatal error.

Description

This changes zpool_trim() in libzfs so that it doesn't return an error when the only errors are suppressed ones. An exception is made when none of the devices support trim, in which case an error is reported and a non-zero status is returned.

This also fixes how the --wait flag works in the presence of suppressed errors. In particular, suppressed errors no longer cause zpool_trim() to skip the wait.

I did some refactoring that I felt made the updated error-handling logic simpler. The refactoring also fixes a small memory leak and an assertion failure that can happen in certain error paths, both related to the errlist nvlist constructed by lzc_trim.

How Has This Been Tested?

Ran zpool trim against a pool with devices which all support trim, against a pool with a mixture of devices that support trim and devices that don't, and against a pool with only devices that don't support trim. I checked that an error was reported in the third scenario, but not the first two.

I repeated the same tests, except specifying the vdevs explicitly instead of just trimming the whole pool. In this case, an error was reported in both of the last two scenarios.

In all cases, I checked that a non-zero exit status was returned if and only if an error was reported.

I also checked that in all of the scenarios above, if the wait flag was specified, the wait was performed unless an error was reported.
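
The behaviour checked above can be summarised in a small standalone model. This is only a sketch of the decision rules as described in this PR, not the code in libzfs: the names `model_trim_outcome`, `trim_outcome_t`, and the `DEV_*` codes are hypothetical, and the assumption that any error other than "trim not supported" is always reported is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device results; these are NOT libzfs error codes. */
enum {
	DEV_OK = 0,		/* trim was started on the device */
	DEV_TRIM_NOTSUP = 1,	/* device does not support trim */
	DEV_OTHER_ERR = 2	/* any other failure (assumed always reported) */
};

/* Hypothetical summary of what the command would do. */
typedef struct trim_outcome {
	int status;		/* 0 on success, -1 if an error is reported */
	bool should_wait;	/* whether a requested wait is performed */
} trim_outcome_t;

/*
 * Model of the rules described in this PR:
 *  - whole-pool trim: "not supported" errors are suppressed unless
 *    every device lacks trim support;
 *  - explicitly listed vdevs: nothing is suppressed;
 *  - a requested wait is performed unless an error is reported.
 */
static trim_outcome_t
model_trim_outcome(const int *vdev_errs, size_t n, bool explicit_vdevs,
    bool wait_requested)
{
	size_t unsupported = 0, other = 0;

	assert(vdev_errs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (vdev_errs[i] == DEV_TRIM_NOTSUP)
			unsupported++;
		else if (vdev_errs[i] != DEV_OK)
			other++;
	}

	/* Suppress only for whole-pool trims where some device can trim. */
	bool suppress = !explicit_vdevs && unsupported < n;
	bool reported = other > 0 || (unsupported > 0 && !suppress);

	trim_outcome_t out = {
		.status = reported ? -1 : 0,
		.should_wait = wait_requested && !reported
	};
	return (out);
}
```

With a two-device array of `{ DEV_OK, DEV_TRIM_NOTSUP }`, the model returns status 0 for a whole-pool trim and a non-zero status when the same devices are named explicitly, matching the second scenario in each round of testing.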

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the contributing document.
  • I have added tests to cover my changes.
  • I have run the ZFS Test Suite with this change applied.
  • All commit messages are properly formatted and contain Signed-off-by.

@behlendorf behlendorf left a comment

Thanks!

lib/libzfs/libzfs_pool.c (review thread: outdated, resolved)

codecov-commenter commented May 27, 2020

Codecov Report

Merging #10372 into master will increase coverage by 0.10%.
The diff coverage is 79.06%.

@@            Coverage Diff             @@
##           master   #10372      +/-   ##
==========================================
+ Coverage   79.52%   79.62%   +0.10%     
==========================================
  Files         391      391              
  Lines      123590   123608      +18     
==========================================
+ Hits        98288    98427     +139     
+ Misses      25302    25181     -121     
Flag                                  Coverage Δ
#kernel                               80.01% <ø> (+0.01%) ⬆️
#user                                 66.10% <79.06%> (+0.17%) ⬆️

Impacted Files                        Coverage Δ
lib/libzfs/libzfs_pool.c              73.24% <79.06%> (-0.24%) ⬇️
module/zfs/vdev_raidz.c               88.88% <0.00%> (-4.36%) ⬇️
module/zfs/vdev_raidz_math.c          76.57% <0.00%> (-2.26%) ⬇️
module/zfs/zrlock.c                   89.23% <0.00%> (-1.54%) ⬇️
module/os/linux/zfs/vdev_disk.c       83.65% <0.00%> (-0.78%) ⬇️
module/zfs/dsl_scan.c                 85.87% <0.00%> (-0.68%) ⬇️
cmd/zed/agents/zfs_mod.c              77.55% <0.00%> (-0.67%) ⬇️
module/zfs/vdev_removal.c             96.55% <0.00%> (-0.58%) ⬇️
module/zfs/vdev_indirect_mapping.c    98.55% <0.00%> (-0.49%) ⬇️
module/os/linux/zfs/zpl_xattr.c       83.26% <0.00%> (-0.43%) ⬇️
... and 49 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c946d5a...8142532.

When a manual trim is run against an entire pool, errors about
particular devices which don't support trim are suppressed. This changes
zpool_trim() in libzfs so that it doesn't return an error when the only
errors are suppressed ones. An exception is made when none of the
devices support trim, in which case an error is reported and a non-zero
status is returned.

This also fixes how the --wait flag works in the presence of suppressed
errors. In particular, suppressed errors no longer cause zpool_trim()
to skip the wait.

Signed-off-by: John Gallagher <john.gallagher@delphix.com>
behlendorf added the "Status: Accepted" label (ready to integrate: reviewed, tested) and removed the "Status: Code Review Needed" label (ready for review and testing) May 27, 2020
@behlendorf behlendorf merged commit 50ff632 into openzfs:master May 28, 2020
as-com pushed a commit to as-com/zfs that referenced this pull request Jun 20, 2020
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: John Gallagher <john.gallagher@delphix.com>
Closes openzfs#10263 
Closes openzfs#10372 
(cherry picked from commit 50ff632)
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021