Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zpool trim -w does not wait for trim to end #10263

Closed
dioni21 opened this issue Apr 29, 2020 · 13 comments
Closed

zpool trim -w does not wait for trim to end #10263

dioni21 opened this issue Apr 29, 2020 · 13 comments
Assignees
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@dioni21
Copy link
Contributor

dioni21 commented Apr 29, 2020

System information

Type Version/Name
Distribution Name Fedora
Distribution Version 31
Linux Kernel 5.5.17-200.fc31.x86_64
Architecture x86_64
ZFS Version 0.8.0-732_ga84c92f93 (master as of 21/4/2020)
SPL Version 0.8.0-732_ga84c92f93

Describe the problem you're observing

I disabled autotrim and intend to run it at specified times. While testing the feature, I found that the -w flag does not work.

Describe how to reproduce the problem

# zpool set autotrim off tank
# time zpool trim -w tank
real    0m0.684s
user    0m0.348s
sys     0m0.070s
# echo $?
1
# zpool status -Pt
  pool: tank
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
config: 

        NAME                               STATE     READ WRITE CKSUM
        tank                               ONLINE       0     0     0
          mirror-0                         ONLINE       0     0     0
            /dev/disk/by-id/ata-XXX-part1  ONLINE       0     0     0  (trim unsupported)
            /dev/disk/by-id/ata-YYY-part1  ONLINE       0     0     0  (trim unsupported)
        special
          mirror-1                         ONLINE       0     0     0
            /dev/ssd480/special            ONLINE       0     0     0  (0% trimmed, started at Tue 28 Apr 2020 09:09:24 PM -03)
            /dev/nvme/special              ONLINE       0     0     0  (0% trimmed, started at Tue 28 Apr 2020 09:09:24 PM -03)
        logs
          mirror-2                         ONLINE       0     0     0
            /dev/ssd480/zil                ONLINE       0     0     0  (46% trimmed, started at Tue 28 Apr 2020 09:09:24 PM -03)
            /dev/nvme/zil                  ONLINE       0     0     0  (46% trimmed, started at Tue 28 Apr 2020 09:09:24 PM -03)
        cache
          /dev/nvme/cache                  ONLINE       0     0     0  (untrimmed)
          /dev/ssd480/cache                ONLINE       0     0     0  (untrimmed)

errors: No known data errors

If I insist and do the command again:

# time zpool trim -w tank
cannot trim '/dev/ssd480/special': currently trimming
cannot trim '/dev/nvme/special': currently trimming
cannot trim '/dev/ssd480/zil': currently trimming
cannot trim '/dev/nvme/zil': currently trimming
real    0m1.349s
user    0m0.354s
sys     0m0.070s

Include any warning/errors/backtraces from the system logs

No error anywhere I could see

@dioni21
Copy link
Contributor Author

dioni21 commented Apr 29, 2020

Tagging @jgallag88, per request

@planarteapot
Copy link

Does zpool trim have a -w option?

zfs-0.8.3-1ubuntu12
zfs-kmod-0.8.3-1ubuntu12
$ zpool trim
missing pool name argument
usage:
        trim [-d] [-r <rate>] [-c | -s] <pool> [<device> ...]```

@gdevenyi
Copy link
Contributor

@dentad #10071

@jgallag88 jgallag88 self-assigned this Apr 29, 2020
@jgallag88
Copy link
Contributor

Thanks for filing this @dioni21. I suspect the issue that you are hitting is that the -w flag doesn't work when the pool has a mixture of devices that support trim and ones that don't:

delphix@ip-10-110-231-233:~$ sudo zpool create -f testpool xvdb /export/home/delphix/vdev_1
delphix@ip-10-110-231-233:~$ zpool status -t testpool
  pool: testpool
 state: ONLINE
  scan: none requested
config:

	NAME                           STATE     READ WRITE CKSUM
	testpool                       ONLINE       0     0     0
	  xvdb                         ONLINE       0     0     0  (trim unsupported)
	  /export/home/delphix/vdev_1  ONLINE       0     0     0  (untrimmed)

errors: No known data errors
delphix@ip-10-110-231-233:~$ time sudo zpool trim -w testpool

real	0m0.039s
user	0m0.014s
sys	0m0.005s
delphix@ip-10-110-231-233:~$ echo $?
255

The issue is here. After we do the ioctl to start the trim, we check the return value and if it is non-zero, we skip the wait. However, the return value can indicate errors for devices that don't support trim (EZFS_TRIM_NOTSUP). Such errors are later skipped over in the loop where the error messages are printed out, so they aren't really errors from the perspective of the overall zpool trim command (except that they sort of are, since they still cause a non-zero exit code).

The one difference between the example I've posted and the one you are hitting is the exit status of zpool wait (255 vs 1). I'm not sure what accounts for that difference; it looks like zpool_trim always returns -1 (reported as 255 by bash) in this scenario.

@jgallag88
Copy link
Contributor

@dioni21 Do you by any chance have any aliases or other things in your shell environment that would have modified the exit status of zpool trim, accounting for the difference between my testing and yours?

@jgallag88
Copy link
Contributor

Assuming that the problem I mentioned is the root cause of this issue, I think the best thing to do might be to ignore EZFS_TRIM_NOTSUP errors when deciding whether or not to wait.

However, I think that it's potentially confusing that zpool trim can return a non-zero status without printing anything to stderr when there are devices which don't support trimming (and the whole pool is being trimmed). @behlendorf, was this the intended behavior?

@behlendorf
Copy link
Contributor

The intended behavior was to return an error when attempting to TRIM a specific vdev and it doesn't support TRIM. When attempting to TRIM an entire pool I agree it would make the most sense to not return an error as long as at least one of the vdevs supports the TRIM.

@kneutron
Copy link

--Suggested workaround until this is fixed in code: bash wrapper script to call zfs trim, and [pseudocode] while $(zpool status -v |grep -c '% trimmed') -gt 0 ; sleep 10 until it disappears

sdate=$(date)
# do forever
while :; do
  echo "Pool: $1 - trim started: $sdate"
  zpool status -v $1 |grep  '% trimmed' || break 2
  sleep 10
  date
  
done

--Note this is untested, as I don't have any trim-supported ZFS devices atm

@jgallag88
Copy link
Contributor

You can also use zpool wait -t trim after starting the trim. Regardless, I'll try to open a PR for both these issues shortly.

@dioni21
Copy link
Contributor Author

dioni21 commented May 24, 2020

@dioni21 Do you by any chance have any aliases or other things in your shell environment that would have modified the exit status of zpool trim, accounting for the difference between my testing and yours?

No.

My setup is compiled from git, using --enable-asan --enable-debuginfo --enable-debug, but I think this does not affect exit behaviour.

@jgallag88
Copy link
Contributor

@dioni21 Do you by any chance have any aliases or other things in your shell environment that would have modified the exit status of zpool trim, accounting for the difference between my testing and yours?

No.

My setup is compiled from git, using --enable-asan --enable-debuginfo --enable-debug, but I think this does not affect exit behaviour.

Strange. The only places I can see the return value of 1 potentially coming from are

  • from the call to zpool_trim_wait() (but I don't think that's ever getting called, because we currently skip the wait if any devices don't support trim)
  • from time. It's possible that different shells have implementation of time that work differently, but I'd expect them to generally pass the return value of the program through unmodified.

In any case, I've opened #10372 to address the issues that we do understand.

@behlendorf behlendorf added the Type: Defect Incorrect behavior (e.g. crash, hang) label May 26, 2020
@dioni21
Copy link
Contributor Author

dioni21 commented May 27, 2020

@jgallag88

Strange. The only places I can see the return value of 1 potentially coming from are

  • from the call to zpool_trim_wait() (but I don't think that's ever getting called, because we currently skip the wait if any devices don't support

I browsed the code to see if I could find anything, and indeed most returns of 1 are in case of memory allocation errors. I think I saw something in vdev treatment, but could not be sure. Note that my setup has also special, logs and cache devices, while your test setup has not.

  • from time. It's possible that different shells have implementation of time that work differently, but I'd expect them to generally pass the return value of the program through unmodified.

Thought this too, but the same same happens without time.

@behlendorf
Copy link
Contributor

I've merged the fix for the original issue here. If there are other fixes needed lets tackle them in follow up PRs as needed.

as-com pushed a commit to as-com/zfs that referenced this issue Jun 20, 2020
When a manual trim is run against an entire pool, errors about
particular devices which don't support trim are suppressed. This changes
zpool_trim() in libzfs so that it doesn't return an error when the only
errors are suppressed ones. An exception is made when none of the
devices support trim, in which case an error is reported and a non-zero
status is returned.

This also fixes how the --wait flag works in the presence of suppressed
errors. In particular, suppressed errors no longer cause zpool_trim()
to skip the wait.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: John Gallagher <john.gallagher@delphix.com>
Closes openzfs#10263 
Closes openzfs#10372 
(cherry picked from commit 50ff632)
jsai20 pushed a commit to jsai20/zfs that referenced this issue Mar 30, 2021
When a manual trim is run against an entire pool, errors about
particular devices which don't support trim are suppressed. This changes
zpool_trim() in libzfs so that it doesn't return an error when the only
errors are suppressed ones. An exception is made when none of the
devices support trim, in which case an error is reported and a non-zero
status is returned.

This also fixes how the --wait flag works in the presence of suppressed
errors. In particular, suppressed errors no longer cause zpool_trim()
to skip the wait.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: John Gallagher <john.gallagher@delphix.com>
Closes openzfs#10263 
Closes openzfs#10372
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)
Projects
None yet
Development

No branches or pull requests

6 participants