Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add subcommand to wait for background zfs activity to complete #9162

Merged
merged 3 commits into from
Sep 14, 2019

Conversation

jgallag88
Copy link
Contributor

@jgallag88 jgallag88 commented Aug 14, 2019

Motivation

Currently the best way to wait for the completion of a long-running operation in a pool, like a scrub or device removal, is to poll zpool status and parse its output, which is neither efficient nor convenient.

Description

This change adds a wait subcommand to the zpool command. When invoked, zpool wait will block until a specified type of background activity completes. Currently, this subcommand can wait for any of the following:

  • Scrubs or resilvers to complete
  • Devices to initialized
  • Devices to be replaced
  • Devices to be removed
  • Checkpoints to be discarded
  • Background freeing to complete

For example, a scrub that is in progress could be waited for by running

zpool wait -t scrub <pool>

This also adds a -w flag to the attach, checkpoint, initialize, replace, remove, and scrub subcommands. When used, this flag makes the operations kicked off by these subcommands synchronous instead of asynchronous.

This functionality is implemented using a new ioctl. The type of activity to wait for is provided as input to the ioctl, and the ioctl blocks until all activity of that type has completed. An ioctl was used over other methods of kernel-userspace communiction primarily for the sake of portability.

Porting Notes

This is ported from Delphix OS change DLPX-44432. The following changes
were made while porting:

  • Added ZoL-style ioctl input declaration.
  • Reorganized error handling in zpool_initialize in libzfs to integrate
    better with changes made for TRIM support.
  • Fixed check for whether a checkpoint discard is in progress.
    Previously it also waited if the pool had a checkpoint, instead of
    just if a checkpoint was being discarded.
  • Exposed zfs_initialize_chunk_size as a ZoL-style tunable.
  • Updated more existing tests to make use of new zpool wait
    functionality, tests that don't exist in Delphix OS.
  • Used existing ZoL tunable zfs_scan_suspend_progress, together with
    zinject, in place of a new tunable zfs_scan_max_blks_per_txg.
  • Added support for a non-integral interval argument to zpool wait.
  • Added logic to periodically reprint headers when stdout is attached to a terminal.
  • Added a test to make sure zpool wait can be run as an unprivileged user.

Future work

ZoL has support for trimming devices, which Delphix OS does not. In the future, 'zpool wait' could be extended to add the ability to wait for trim operations to complete.

How Has This Been Tested?

New tests have been added to cover this new feature. These tests have been running at Delphix for about a year and a half.

Signed-off-by: John Gallagher john.gallagher@delphix.com

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

@jgallag88
Copy link
Contributor Author

Note that a couple of the new tests are flaky because of #9155. Depending on what the resolution there is, I can update the new tests to work around that issue if need be.

@codecov
Copy link

codecov bot commented Aug 14, 2019

Codecov Report

Merging #9162 into master will decrease coverage by 0.12%.
The diff coverage is 65.98%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9162      +/-   ##
==========================================
- Coverage   79.11%   78.98%   -0.13%     
==========================================
  Files         400      400              
  Lines      121790   122055     +265     
==========================================
+ Hits        96357    96409      +52     
- Misses      25433    25646     +213
Flag Coverage Δ
#kernel 79.75% <96.15%> (ø) ⬆️
#user 66.61% <44.04%> (+0.06%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c8242a9...f0ff6c4. Read the comment docs.

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Aug 16, 2019
@behlendorf behlendorf added the Type: Feature Feature request or new feature label Aug 26, 2019
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great to see the long wanted feature implemented! A few initial comments inline.

cmd/zpool/zpool_main.c Outdated Show resolved Hide resolved
module/zfs/zfs_ioctl.c Show resolved Hide resolved
cmd/zpool/zpool_main.c Outdated Show resolved Hide resolved
cmd/zpool/zpool_main.c Outdated Show resolved Hide resolved
cmd/zpool/zpool_main.c Outdated Show resolved Hide resolved
{
static char *col_subopts[] = { "discard", "free",
"initialize", "replace", "remove", "resilver",
"scrub", NULL };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While not necessarily a blocker, it would be highly desirable to include "trim" in the events which can be waited for. This would help head off compatibility issues with scripts that parse this output when we finally do add a "trim" column. Luckily, the logically place to put the new column is at the end. But it would still be nice to address now if at all possible, the TRIM logic should be very similar, if not identical, to the initialize logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I'll work on adding this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine in the future we'll add new background tasks that we may want to wait for. Hopefully we wouldn't worry about breaking scripts by adding that to the default output of zpool wait. Scripts that want to parse the columns of output need to either specify which columns they want, or not many any assumptions about what columns exist.

So, I don't think that changing the default output should be a consideration for including zpool wait -t trim now vs. later. That said, if you want trim-wait support in the initial integration, for completeness, I can understand that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No disagreement from me. It's why I specifically mentioned that I didn't consider the missing zpool wait -t trim functionality to be a blocker. Just that we could potentially prevent some minor issues by including it in the initial commit. From an end user perspective it's also a rather conspicuous omission.

cmd/zpool/zpool_main.c Show resolved Hide resolved
cmd/zpool/zpool_main.c Show resolved Hide resolved
configure.ac Show resolved Hide resolved
cmd/zpool/zpool_main.c Outdated Show resolved Hide resolved
@jgallag88 jgallag88 force-pushed the zpoolWait branch 2 times, most recently from 7632bfe to f9f1144 Compare September 11, 2019 18:41
@@ -46,7 +46,7 @@ unsigned long zfs_initialize_value = 0xdeadbeefdeadbeeeULL;
int zfs_initialize_limit = 1;

/* size of initializing writes; default 1MiB, see zfs_remove_max_segment */
uint64_t zfs_initialize_chunk_size = 1024 * 1024;
unsigned long zfs_initialize_chunk_size = 1024 * 1024;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to your change, but it seems like this could default to SPA_MAXBLOCKSIZE (on ZoL), like zfs_remove_max_segment.

Currently the best way to wait for the completion of a long-running
operation in a pool, like a scrub or device removal, is to poll 'zpool
status' and parse its output, which is neither efficient nor convenient.

This change adds a 'wait' subcommand to the zpool command. When invoked,
'zpool wait' will block until a specified type of background activity
completes. Currently, this subcommand can wait for any of the following:

 - Scrubs or resilvers to complete
 - Devices to initialized
 - Devices to be replaced
 - Devices to be removed
 - Checkpoints to be discarded
 - Background freeing to complete

For example, a scrub that is in progress could be waited for by running

    zpool wait -t scrub <pool>

This also adds a -w flag to the attach, checkpoint, initialize, replace,
remove, and scrub subcommands. When used, this flag makes the operations
kicked off by these subcommands synchronous instead of asynchronous.

This functionality is implemented using a new ioctl. The type of
activity to wait for is provided as input to the ioctl, and the ioctl
blocks until all activity of that type has completed. An ioctl was used
over other methods of kernel-userspace communiction primarily for the
sake of portability.

Porting Notes:
This is ported from Delphix OS change DLPX-44432. The following changes
were made while porting:

 - Added ZoL-style ioctl input declaration.
 - Reorganized error handling in zpool_initialize in libzfs to integrate
   better with changes made for TRIM support.
 - Fixed check for whether a checkpoint discard is in progress.
   Previously it also waited if the pool had a checkpoint, instead of
   just if a checkpoint was being discarded.
 - Exposed zfs_initialize_chunk_size as a ZoL-style tunable.
 - Updated more existing tests to make use of new 'zpool wait'
   functionality, tests that don't exist in Delphix OS.
 - Used existing ZoL tunable zfs_scan_suspend_progress, together with
   zinject, in place of a new tunable zfs_scan_max_blks_per_txg.
 - Added support for a non-integral interval argument to zpool wait.

Future work:
ZoL has support for trimming devices, which Delphix OS does not. In the
future, 'zpool wait' could be extended to add the ability to wait for
trim operations to complete.

Signed-off-by: John Gallagher <john.gallagher@delphix.com>
 - reprint headers when output scrolls past end of terminal
 - add test that zpool wait can be run by unprivileged users
 - various style and grammar nits
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my feedback, this looks good to me pending confirmation from the CI. @jgallag88 if it's going to take a while to add the trim functionality I don't object to merging this now to prevent it from getting stale. However, if it's close I'd prefer to wait.

@jgallag88
Copy link
Contributor Author

@behlendorf Unfortunately I think it's going to take me at least a couple weeks before I can add trim support, given the other things I'm working on at the moment. It's not a huge deal either way, but it would be nice to have this merged.

One thing that I think does need to be addressed before this can be merged, though, is flakiness in the new tests caused by #9155. It might not show up in the CI tests, but I've seen it occasionally when running the tests locally. I can work around it pretty easily by destroying and recreating the test pool in between each relevant test, but I was a little hesitant to do that if #9155 really is a bug and not just a misunderstanding on my part of how resilvering works.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Sep 11, 2019
@codecov
Copy link

codecov bot commented Sep 12, 2019

Codecov Report

Merging #9162 into master will increase coverage by 10.68%.
The diff coverage is 83.21%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master    #9162       +/-   ##
===========================================
+ Coverage   68.45%   79.13%   +10.68%     
===========================================
  Files         354      401       +47     
  Lines      115959   122457     +6498     
===========================================
+ Hits        79376    96906    +17530     
+ Misses      36583    25551    -11032
Flag Coverage Δ
#kernel 79.77% <97.69%> (+12.37%) ⬆️
#user 67.14% <62.34%> (+7.11%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5815f7a...c887ade. Read the comment docs.

@behlendorf behlendorf merged commit e60e158 into openzfs:master Sep 14, 2019
@jgallag88 jgallag88 deleted the zpoolWait branch February 27, 2020 21:52
@jgallag88 jgallag88 mentioned this pull request Feb 27, 2020
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested) Type: Feature Feature request or new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants