Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix null ptr deref when renaming a zvol with snaps and snapdev=visible #16316

Merged
merged 1 commit into from
Aug 15, 2024

Conversation

jgottula
Copy link
Contributor

@jgottula jgottula commented Jul 1, 2024

Motivation and Context

Fixes #16274 (a regression introduced by #15486). The issue affects stable releases beginning with 2.2.3.

Description

Adds a simple NULL check to dataset_kstats_rename so that when the function is called for a snapshot (which do not have kstats allocated for them, at least at present), nothing is done.

This fix should have the additional bonus of "just doing the right thing" in the event that things ever change such that kstats are allocated for snapshots: in that case, the NULL check would no longer bail out of the function, and it would instead update the dataset_name kstat value for the zvol snapshot in question.

How Has This Been Tested?

As mentioned over in #16274, I applied the patch here to my own local build of zfs pretty early on, and it was effective at preventing the bug from occurring: I was then able to rename snapshotted zvols with snapdev=visible as much as I wanted, without any kthread fatalities or zvol-related stuff hanging indefinitely.

The problem is consistent and very easy to reproduce. I provided some slightly more detailed steps over in the issue thread, but it's really just as simple as:

  1. Create pool
  2. Create zvol
  3. Ensure the zvol has snapdev=visible
  4. Ensure the zvol has one or more snapshots
  5. Do zfs rename on the zvol

@asomers helpfully verified that the bug affects FreeBSD too, and that this patch fixes things appropriately there as well.

I suppose that I could also test a build of zfs where I disable the snapshot ignore logic at the beginning of dataset_kstats_create, to double-check the assertion I made earlier that this patch would "just work", and cause kstats-for-snapshots-of-zvols to reflect renames properly, in the event that such kstats should ever exist. I haven't actually gone ahead and done this though. Let me know if that's something you'd like me to do.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the OpenZFS code style requirements.
  • (N/A) I have updated the documentation accordingly.
  • I have read the contributing document.
  • (See comment below) I have added tests to cover my changes.
  • (GitHub CI only) I have run the ZFS Test Suite with this change applied.
  • All commit messages are properly formatted and contain Signed-off-by.

I haven't run the test suite myself, but I've let GitHub Actions run it.

@jgottula
Copy link
Contributor Author

jgottula commented Jul 1, 2024

It was suggested to me here #16274 (comment), that I add a case to the test suite; and so I began looking into doing that. The goal, as I understand it, would basically be to ensure that the "renaming a zvol which has snaps and snapdev=visible" set of circumstances is actually covered, so that the original bug would have been caught and so that future bugs of a similar nature would also be noticed.

But seeing as this was a "kernel doing crashy things" sort of problem, it wasn't entirely clear to me how I should proceed, as I noted here #16274 (comment). What precisely should constitute a pass or a failure in this circumstance? In the context of this particular issue, failure manifested as the zfs rename command working, and the zvol having been successfully renamed after that point as far as e.g. the zfs userspace utils were concerned; but dmesg reflecting a null pointer dereference, and the z_zvol kernel thread no longer existing, and udev workers—attempting to update the /dev/zvol/... symlinks—blocking indefinitely and eventually being killed off, and so forth. So how would I even check for a kernel screwup in a test case; how would I do that in a platform-neutral manner; what if there wasn't a way to check that didn't involve hanging indefinitely; etc? And does it even actually make sense for a test case to "fail" in such a way that it screws up the kernel and requires a reboot in order for subsequent operations to not potentially hang forever?

And then, assuming that the answer to all of that is presumably just "yeah, no, you don't try to intentionally check for this sort of kernel problem in the test suite", then what sort of test case (and I suppose that by that I mainly mean, what sort of pass/fail criteria) would still make sense to do for the "renaming a zvol which has snaps and snapdev=visible" circumstance? Presumably it's still valuable to have a test which covers that situation; because even if, say, in the event of the original bug here, such a test might not have "failed" in the ordinary sense, it would still have triggered the kernel bug and caused that to be noticed, one way or another.

So yeah, some guidance on what makes the most sense to do here would be appreciated.

Copy link
Member

@robn robn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is obviously correct, given the number of similar checks, and that we don't set dk_stats for whole classes of datasets, and that as a general pattern we allow for kstats to fail to be creation.

With regard to the test case, I agree that this sort of "programmer is holding it wrong" is rarely worth testing for in a functional test suite. There is value in test cases that that test the feature, or combination of features, but you came here to fix a crash, and you have, and you've been very diligent with working to understand the problem and ensure that this doesn't just paper over some deeper problem.

In other words: great method, great outcome. Thank you :)

@jgottula jgottula requested a review from robn July 6, 2024 22:56
@jgottula
Copy link
Contributor Author

jgottula commented Jul 6, 2024

@robn Oops, I think I accidentally re-requested a review by you just now. Should probably do this on a desktop computer, where I can actually see tooltips for buttons before I press them... 😄

If a zvol is renamed, and it has one or more snapshots, and
snapdev=visible is true for the zvol, then the rename causes a kernel
null pointer dereference error. This has the effect (on Linux, anyway)
of killing the z_zvol taskq kthread, with locks still held; which in
turn causes a variety of zvol-related operations afterward to hang
indefinitely (such as udev workers, among other things).

The problem occurs because of an oversight in openzfs#15486
(e36ff84). As documented in
dataset_kstats_create, some datasets may not actually have kstats
allocated for them; and at least at the present time, this is true for
snapshots. In practical terms, this means that for snapshots,
dk->dk_kstats will be NULL. The dataset_kstats_rename function
introduced in the patch above does not first check whether dk->dk_kstats
is NULL before proceeding, unlike e.g. the nearby
dataset_kstats_update_* functions.

In the very particular circumstance in which a zvol is renamed, AND that
zvol has one or more snapshots, AND that zvol also has snapdev=visible,
zvol_rename_minors_impl will loop over not just the zvol dataset itself,
but each of the zvol's snapshots as well, so that their device nodes
will be renamed as well. This results in dataset_kstats_create being
called for snapshots, where, as we've established, dk->dk_kstats is
NULL.

Fix this by simply adding a NULL check before doing anything in
dataset_kstats_rename.

This still allows the dataset_name kstat value for the zvol to be
updated (as was the intent of the original patch), and merely blocks
attempts by the code to act upon the zvol's non-kstat-having snapshots.
If at some future time, kstats are added for snapshots, then things
should work as intended in that case as well.

Signed-off-by: Justin Gottula <justin@jgottula.com>
@tonyhutter tonyhutter merged commit 5807de9 into openzfs:master Aug 15, 2024
21 of 25 checks passed
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Sep 4, 2024
openzfs#16316)

If a zvol is renamed, and it has one or more snapshots, and
snapdev=visible is true for the zvol, then the rename causes a kernel
null pointer dereference error. This has the effect (on Linux, anyway)
of killing the z_zvol taskq kthread, with locks still held; which in
turn causes a variety of zvol-related operations afterward to hang
indefinitely (such as udev workers, among other things).

The problem occurs because of an oversight in openzfs#15486
(e36ff84). As documented in
dataset_kstats_create, some datasets may not actually have kstats
allocated for them; and at least at the present time, this is true for
snapshots. In practical terms, this means that for snapshots,
dk->dk_kstats will be NULL. The dataset_kstats_rename function
introduced in the patch above does not first check whether dk->dk_kstats
is NULL before proceeding, unlike e.g. the nearby
dataset_kstats_update_* functions.

In the very particular circumstance in which a zvol is renamed, AND that
zvol has one or more snapshots, AND that zvol also has snapdev=visible,
zvol_rename_minors_impl will loop over not just the zvol dataset itself,
but each of the zvol's snapshots as well, so that their device nodes
will be renamed as well. This results in dataset_kstats_create being
called for snapshots, where, as we've established, dk->dk_kstats is
NULL.

Fix this by simply adding a NULL check before doing anything in
dataset_kstats_rename.

This still allows the dataset_name kstat value for the zvol to be
updated (as was the intent of the original patch), and merely blocks
attempts by the code to act upon the zvol's non-kstat-having snapshots.
If at some future time, kstats are added for snapshots, then things
should work as intended in that case as well.

Signed-off-by: Justin Gottula <justin@jgottula.com>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Alan Somers <asomers@gmail.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants