Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zfs-2.1.14 patchset #15601

Closed

Conversation

tonyhutter
Copy link
Contributor

Motivation and Context

Patchset for 2.1.14. This release was put out to include the fix for dirty dbuf corruption: #15526

Description

Include fix for data corruption. Full details in: #15526.

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

Over its history this the dirty dnode test has been changed between
checking for a dnodes being on `os_dirty_dnodes` (`dn_dirty_link`) and
`dn_dirty_record`.

  de198f2 Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency
  2531ce3 Revert "Report holes when there are only metadata changes"
  ec4f9b8 Report holes when there are only metadata changes
  454365b Fix dirty check in dmu_offset_next()
  66aca24 SEEK_HOLE should not block on txg_wait_synced()

Also illumos/illumos-gate@c543ec060d illumos/illumos-gate@2bcf0248e9

It turns out both are actually required.

In the case of appending data to a newly created file, the dnode proper
is dirtied (at least to change the blocksize) and dirty records are
added.  Thus, a single logical operation is represented by separate
dirty indicators, and must not be separated.

The incorrect dirty check becomes a problem when the first block of a
file is being appended to while another process is calling lseek to skip
holes. There is a small window where the dnode part is undirtied while
there are still dirty records. In this case, `lseek(fd, 0, SEEK_DATA)`
would not know that the file is dirty, and would go to
`dnode_next_offset()`. Since the object has no data blocks yet, it
returns `ESRCH`, indicating no data found, which results in `ENXIO`
being returned to `lseek()`'s caller.

Since coreutils 9.2, `cp` performs sparse copies by default, that is, it
uses `SEEK_DATA` and `SEEK_HOLE` against the source file and attempts to
replicate the holes in the target. When it hits the bug, its initial
search for data fails, and it goes on to call `fallocate()` to create a
hole over the entire destination file.

This has come up more recently as users upgrade their systems, getting
OpenZFS 2.2 as well as a newer coreutils. However, this problem has been
reproduced against 2.1, as well as on FreeBSD 13 and 14.

This change simply updates the dirty check to check both types of dirty.
If there's anything dirty at all, we immediately go to the "wait for
sync" stage, It doesn't really matter after that; both changes are on
disk, so the dirty fields should be correct.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes openzfs#15571
Closes openzfs#15526
@tonyhutter tonyhutter changed the base branch from master to zfs-2.1.13-staging November 28, 2023 22:34
@tonyhutter tonyhutter changed the base branch from zfs-2.1.13-staging to zfs-2.1-release November 28, 2023 22:37
@tonyhutter tonyhutter changed the base branch from zfs-2.1-release to zfs-2.1.14-staging November 28, 2023 22:44
@robn
Copy link
Member

robn commented Nov 28, 2023

Thanks @tonyhutter.

@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Nov 28, 2023
@tonyhutter tonyhutter changed the title zfs-2.1.14 patchstack zfs-2.1.14 patchset Nov 28, 2023
@AllKind
Copy link
Contributor

AllKind commented Nov 29, 2023

@tonyhutter
Could we include these:
9ce567c
95b68eb

So the dkms installation mess (modules not being built, broken dkms status) is not carried on (2.1.x -> 2.1.y and 2.1.x -> 2.2.x upgrades).

@AllKind
Copy link
Contributor

AllKind commented Nov 29, 2023

@tonyhutter One other thing:
Would it be possible to add a note to the release notes, as proposed in #15575 ?

@IvanVolosyuk
Copy link

IvanVolosyuk commented Nov 29, 2023

It would be nice to have a fix for:
#15338
Kernels 6.5+ compiled with ZFS this way don't actually include ZFS filesystem due to a script failure to patch fs/Makefile.
It is mentioned in the TODO for the 2.1-release:
https://github.com/openzfs/zfs/projects/30

When doing a manual TRIM on a zpool, the metaslab being TRIMmed is
potentially re-enabled before all queued TRIM zios for that metaslab
have completed. Since TRIM zios have the lowest priority, it is 
possible to get into a situation where allocations occur from the 
just re-enabled metaslab and cut ahead of queued TRIMs to the same 
metaslab.  If the ranges overlap, this will cause corruption.

We were able to trigger this pretty consistently with a small single 
top-level vdev zpool (i.e. small number of metaslabs) with heavy 
parallel write activity while performing a manual TRIM against a 
somewhat 'slow' device (so TRIMs took a bit of time to complete). 
With the patch, we've not been able to recreate it since. It was on 
illumos, but inspection of the OpenZFS trim code looks like the 
relevant pieces are largely unchanged and so it appears it would be 
vulnerable to the same issue.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jason King <jking@racktopsystems.com>
Illumos-issue: https://www.illumos.org/issues/15939
Closes openzfs#15395
nabijaczleweli and others added 2 commits November 30, 2023 11:09
The order in fs/Makefile doesn't matter,
the order in fs/Kconfig is preserved (ext2 is included as the first
thing in the first if BUILD block, and only once), but I don't think it
matters much either, realistically

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes openzfs#13316
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
@behlendorf
Copy link
Contributor

Closing. 2.1.14 has been tagged.

@behlendorf behlendorf closed this Dec 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants