fix prefetching of indirect blocks while destroying #14603

Merged 2 commits on Mar 24, 2023

Commits on Mar 11, 2023

  1. fix prefetching of indirect blocks while destroying

    When traversing a tree of block pointers (e.g. for `zfs destroy <fs>` or
    `zfs send`), we prefetch the indirect blocks that will be needed, in
    `traverse_prefetch_metadata()`.  In the case of `zfs destroy <fs>`, we
    do a little traversing each txg, and resume the traversal the next txg.
    So the indirect blocks that will be needed, and thus are candidates for
    prefetching, do not include blocks that are before the resume point.
    
    The problem is that the logic for determining if the indirect blocks are
    before the resume point is incorrect, causing the (up to 1024) L1
    indirect blocks that are inside the first L2 to not be prefetched.  In
    practice, if we are able to read many more than 1024 blocks per txg,
    then this will be inconsequential.  But if I/O latency is more than a
    few milliseconds, almost no L1s will be prefetched, so they will be
    read serially, and thus the destroy will be very slow.  This can be
    observed as `zpool get freeing` decreasing very slowly.
    
    Specifically: When we first examine the L2 that contains the block we'll
    be resuming from, we have not yet resumed, so `td_resume` is nonzero.
    At this point, all calls to `traverse_prefetch_metadata()` will fail,
    even if the L1 in question is after the resume point.  It isn't until
    the callback is issued for the resume point that we zero out
    `td_resume`, but by this point we've already attempted and failed to
    prefetch everything under this L2 indirect block.
    
    This commit addresses the issue by reusing the existing
    `resume_skip_check()` to determine if the L1's bookmark is before or
    after the resume point.  To do so, this function is made non-mutating
    (the caller now zeros `td_resume`).
    
    Note, this bug likely predates (was not introduced by) openzfs#11803.
    
    Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
    ahrens committed Mar 11, 2023
    Commit 9b23dd5
  2. fix prefetching - no zero

    ahrens committed Mar 11, 2023
    Commit 8e2b28c
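
The difference between the old and new prefetch decisions described in the first commit message can be illustrated with a small toy model. This is only a sketch and not the OpenZFS code: `struct model_traverse`, `l1_before_resume()`, `old_should_prefetch()`, `new_should_prefetch()` and `count_prefetched()` are hypothetical names, the model tracks a single object, and it assumes 128 KiB indirect blocks holding 128-byte block pointers (hence 1024 pointers per indirect block).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 128 KiB indirect block / 128-byte block pointer = 1024. */
#define PTRS_PER_INDIRECT 1024ULL

/* Hypothetical stand-in for the traversal state (not traverse_data_t). */
struct model_traverse {
	bool resume_pending;      /* models "td_resume is nonzero" */
	uint64_t resume_l0_blkid; /* L0 block the traversal resumes from */
};

/* True if every L0 block under this L1 lies before the resume point. */
static bool
l1_before_resume(const struct model_traverse *td, uint64_t l1_blkid)
{
	uint64_t first_l0_after = (l1_blkid + 1) * PTRS_PER_INDIRECT;
	return (td->resume_pending && first_l0_after <= td->resume_l0_blkid);
}

/* Old behavior (modeled): refuse every prefetch until we have resumed. */
static bool
old_should_prefetch(const struct model_traverse *td, uint64_t l1_blkid)
{
	(void) l1_blkid;
	return (!td->resume_pending);
}

/* New behavior (modeled): skip only L1s entirely before the resume point. */
static bool
new_should_prefetch(const struct model_traverse *td, uint64_t l1_blkid)
{
	return (!l1_before_resume(td, l1_blkid));
}

typedef bool (*prefetch_policy)(const struct model_traverse *, uint64_t);

/* Count how many L1s under the first L2 a given policy would prefetch. */
static unsigned
count_prefetched(const struct model_traverse *td, prefetch_policy policy)
{
	assert(td != NULL);
	unsigned n = 0;
	for (uint64_t l1 = 0; l1 < PTRS_PER_INDIRECT; l1++) {
		if (policy(td, l1))
			n++;
	}
	return (n);
}
```

With a pending resume point at L0 block `5 * 1024 + 17`, `count_prefetched()` returns 0 for the old policy and 1019 for the new one in this model: only the five L1s that lie wholly before the resume point are skipped, while the L1 containing the resume point and everything after it are prefetched.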