Newly attached device is resilvered multiple times #9155

Closed · jgallag88 opened this issue Aug 12, 2019 · 6 comments
Labels: Type: Defect (incorrect behavior, e.g. crash, hang)

jgallag88 (Contributor)
System information

Type                  Version/Name
Distribution Name     Ubuntu
Distribution Version  18.04
Linux Kernel          4.15.0
Architecture          x86-64
ZFS Version           delphix@3eebcae
SPL Version           delphix@3eebcae

Describe the problem you're observing

When a device is attached to a pool, it sometimes ends up being resilvered twice. A resilver will be kicked off, and when it completes, it will start all over again in the next txg. This seems to happen about half the time.

Describe how to reproduce the problem

Create a pool with a bit of data in it:

$ sudo zpool create testpool xvdc
$ sudo dd if=/dev/urandom of=/testpool/file1 bs=1M count=4096

Then replace one of the devices in the pool:

$ sudo zpool replace testpool xvdc xvdb

and watch the output of zpool status testpool 1 (the trailing 1 refreshes the status every second). It will begin resilvering the new device:

  pool: testpool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Aug 12 19:48:32 2019
        4.00G scanned at 455M/s, 1.28G issued at 145M/s, 4.00G total
        1.27G resilvered, 31.90% done, 0 days 00:00:19 to go
config:

        NAME           STATE     READ WRITE CKSUM
        testpool       ONLINE       0     0     0
          replacing-0  ONLINE       0     0     0
            xvdc       ONLINE       0     0     0
            xvdb       ONLINE       0     0     0  (resilvering)

It will finish resilvering the device:

  pool: testpool
 state: ONLINE
  scan: resilvered 4.01G in 0 days 00:00:32 with 0 errors on Mon Aug 12 19:49:04 2019
config:

        NAME           STATE     READ WRITE CKSUM
        testpool       ONLINE       0     0     0
          replacing-0  ONLINE       0     0     0
            xvdc       ONLINE       0     0     0
            xvdb       ONLINE       0     0     0

and then begin again:

  pool: testpool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Aug 12 19:49:10 2019
        4.00G scanned at 4.00G/s, 112M issued at 112M/s, 4.00G total
        104M resilvered, 2.72% done, 0 days 00:00:35 to go
config:

        NAME           STATE     READ WRITE CKSUM
        testpool       ONLINE       0     0     0
          replacing-0  ONLINE       0     0     0
            xvdc       ONLINE       0     0     0
            xvdb       ONLINE       0     0     0  (resilvering)

If you are doing a replace, the old device is detached after the second resilver completes:

  pool: testpool
 state: ONLINE
  scan: resilvered 4.01G in 0 days 00:00:32 with 0 errors on Mon Aug 12 19:49:42 2019
config:

        NAME        STATE     READ WRITE CKSUM
        testpool    ONLINE       0     0     0
          xvdb      ONLINE       0     0     0

This doesn't happen every time, but on my system it rarely takes more than 2 or 3 attempts to reproduce the issue.

Include any warning/errors/backtraces from the system logs

jgallag88 (Contributor, Author)

What's happening is that when the new device is attached, zed receives an EC_DEV_STATUS.ESC_DEV_DLE event, which can cause it to reopen the pool. The reopen logic calls vdev_open(), which includes:

    /*
     * If a leaf vdev has a DTL, and seems healthy, then kick off a
     * resilver.  But don't do this if we are doing a reopen for a scrub,
     * since this would just restart the scrub we are already doing.
     */
    if (vd->vdev_ops->vdev_op_leaf && !spa->spa_scrub_reopen &&
        vdev_resilver_needed(vd, NULL, NULL)) {
        if (dsl_scan_resilvering(spa->spa_dsl_pool) &&
            spa_feature_is_enabled(spa, SPA_FEATURE_RESILVER_DEFER))
            vdev_set_deferred_resilver(spa, vd);
        else
            spa_async_request(spa, SPA_ASYNC_RESILVER);
    }

When this logic runs for the new device, vdev_resilver_needed() returns true, and because we just started a resilver when we attached the device, dsl_scan_resilvering() returns true as well. So we end up calling vdev_set_deferred_resilver(), which is what triggers the next resilver after the first one finishes.

This can be reproduced by manually running zpool reopen while a resilver is in progress.
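For example, using the testpool from the reproduction steps above (the reopen has to land while the resilver is still running for the restart to trigger):

$ sudo zpool replace testpool xvdc xvdb
$ sudo zpool reopen testpool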

hyegeek commented Oct 17, 2019

Is there a way to stop the resilver loop once it starts? My server is currently resilvering over and over again, and each pass is taking 24 or more hours. The server is very slow with all of the disk traffic, and it is causing a huge impact.

I'm currently running 0.8.2 on kernel 4.19.78.

jgallag88 (Contributor, Author)

@hyegeek For this particular issue, the resilver will only restart if something is reopening the pool. In my case it was zed, but it could also be something else.
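If zed does turn out to be the reopener, one possible stop-gap (assuming a systemd distribution where the ZED unit is named zfs-zed.service; the unit name can vary) is to stop ZED for the duration of the resilver and restart it afterwards:

$ sudo systemctl stop zfs-zed     # prevent event-driven reopens mid-resilver
$ sudo zpool status testpool      # wait for the resilver to complete
$ sudo systemctl start zfs-zed    # restore fault reporting

Note that ZED won't report faults or handle hotplug events while it is stopped.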

hyegeek commented Oct 21, 2019

My resilver finally finished once I killed zed and allowed it to resilver two more times: once because zed was running at the beginning, and a second time to clean things up. At over 24 hours per resilver, this was painful.

So this really seems like a bug. zed is needed (I think) to report when there are issues, yet if it is running, it keeps you from recovering.

Am I missing something?

jgallag88 (Contributor, Author)

Yes, this is a bug. I'm not too familiar with this code, and I haven't had a chance to track down how this behavior was introduced and what the correct fix would be.

hyegeek commented Oct 21, 2019

Thanks. Now that I know how to work around it, I can work on getting my blood pressure back down. 😀

Other things I've noticed about zed might (or might not) be helpful in hunting this down. On some of my systems, ZED seems to cause the kernel to constantly re-enumerate the disks. The server I had the resilver problem with is one such system: every few minutes my kernel log lists the disks as if it had just found them. The log entries stop if I kill zed.

tonyhutter pushed a commit that referenced this issue Jan 23, 2020
If a device is participating in an active resilver, then it will have a
non-empty DTL. Operations like vdev_{open,reopen,probe}() can cause the
resilver to be restarted (or deferred to be restarted later), which is
unnecessary if the DTL is still covered by the current scan range. This
is similar to the logic in vdev_dtl_should_excise() where the DTL can
only be excised if its max txg is in the resilvered range.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Gallagher <john.gallagher@delphix.com>
Reviewed-by: Kjeld Schouten <kjeld@schouten-lebbing.nl>
Signed-off-by: John Poduska <jpoduska@datto.com>
Issue #840
Closes #9155
Closes #9378
Closes #9551
Closes #9588
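The guard the commit message describes amounts to a coverage check before restarting. Here is a hedged C sketch of that idea in the style of module/zfs/vdev.c; dsl_scan_resilvering() and the vdev.c helper vdev_dtl_max() are existing symbols, but this is an illustration of the check, not the committed patch:

    /*
     * Sketch: decide whether an in-progress scan already covers this
     * vdev's DTL. Assumes the caller has already established that the
     * DTL is non-empty (e.g. vdev_resilver_needed() returned true).
     */
    static boolean_t
    vdev_dtl_covered_by_scan(vdev_t *vd)
    {
        dsl_pool_t *dp = vd->vdev_spa->spa_dsl_pool;
        dsl_scan_t *scn = dp->dp_scan;

        /* No resilver running, so nothing can cover the DTL yet. */
        if (scn == NULL || !dsl_scan_resilvering(dp))
            return (B_FALSE);

        /*
         * If the newest txg in the DTL is at or below the highest txg
         * the in-progress scan will visit, the current resilver already
         * repairs this device and a restart (or deferral) is redundant.
         */
        return (vdev_dtl_max(vd) <= scn->scn_phys.scn_max_txg);
    }

With a helper like this, the vdev_open() block quoted earlier could skip both the spa_async_request(spa, SPA_ASYNC_RESILVER) and the vdev_set_deferred_resilver() paths when a reopen happens mid-scan.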