Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid some crashes when importing a pool with corrupt metadata #9022

Merged
merged 1 commit into from
Dec 26, 2019
Merged

Avoid some crashes when importing a pool with corrupt metadata #9022

merged 1 commit into from
Dec 26, 2019

Conversation

smokris
Copy link
Contributor

@smokris smokris commented Jul 11, 2019

Motivation and Context

  • Skip invalid DVAs when importing pools in readonly mode (in addition to when the config is untrusted).
  • Upon encountering a DVA with a null VDEV, fail gracefully instead of panicking with a NULL pointer dereference.

How Has This Been Tested?

Start with a pool with corrupt metadata (in my case, a spacemap was corrupt).

# echo 0 > /sys/module/zfs/parameters/spa_load_verify_metadata
# bin/zpool import -o readonly=on -f zones

Before: kernel panic due to NULL pointer dereference.

After: no panic, and now I'm able to access (at least some of) the data on this pool.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
    • This being a risky recovery feature, I'm not sure it should be part of the help text / man page (similar to -X)
  • I have read the contributing document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • All commit messages are properly formatted and contain Signed-off-by.

@richardelling
Copy link
Contributor

Interesting idea, this can be useful in some of these edge cases.

  • missing man page updates in zpool man page
  • missing help message in zpool_main
  • zdb could also benefit from this

I'm also concerned about what happens if someone does this in non-readonly while rolling back. I'm fairly sure this could lead to permanent pool damage, so it will need to be appropriately tested and if very destructive, will need to be documented with warnings.

@smokris
Copy link
Contributor Author

smokris commented Jul 12, 2019

Thanks for the feedback, @richardelling. I just pushed a new commit that fixes the code-style issues, and adds the new option to the zpool man page and usage text.

zdb could also benefit from this

zdb currently uses getopt (not getopt_long), so there isn't a straightforward way to add an equivalent --load-meta-thresh option there. Should I rework zdb's option parsing to use getopt_long, or use a single-letter option (and if so, which letter)?

I'm also concerned about what happens if someone does this in non-readonly while rolling back. I'm fairly sure this could lead to permanent pool damage, so it will need to be appropriately tested and if very destructive, will need to be documented with warnings.

Same here. I don't know enough about the ZFS internals to speak to that or to design a test plan. I just know that, on the one pool I've tested it on in readwrite mode, I've been able to successfully import/export many times, copy some files off of it, and completed 1.5 scrubs so far.

@ahrens ahrens added the Status: Design Review Needed Architecture or design is under discussion label Jul 12, 2019
@ahrens
Copy link
Member

ahrens commented Jul 12, 2019

This seems like a very low-level and potentially dangerous concept to be putting on the sysadmin. I'd prefer to find a solution that requires less admin intervention, or perhaps to make this a tunable rather than a first class option.

@richardelling
Copy link
Contributor

perhaps make it a tunable that applies only when readonly import attempted?

@ahrens
Copy link
Member

ahrens commented Jul 12, 2019

If the main use case is for readonly import, could we allow an unlimited number of errors by default for readonly import?

@smokris
Copy link
Contributor Author

smokris commented Jul 12, 2019

If the main use case is for readonly import, could we allow an unlimited number of errors by default for readonly import?

Sounds good to me, since AFAIK there isn't yet a good way to recover a pool from a situation like this.

I just pushed a new commit that changes the policy default when performing a readonly import.

module/zfs/spa.c Outdated Show resolved Hide resolved
module/zfs/spa.c Outdated Show resolved Hide resolved
module/zfs/vdev_mirror.c Show resolved Hide resolved
@behlendorf behlendorf added the Status: Revision Needed Changes are required for the PR to be accepted label Sep 25, 2019
@behlendorf
Copy link
Contributor

@smokris there appears to have general agreement on the updated approach. There's just some review feedback which needed to be addressed. How would you like to proceed with this?

@behlendorf behlendorf added Status: Code Review Needed Ready for review and testing and removed Status: Design Review Needed Architecture or design is under discussion labels Oct 31, 2019
@ahrens ahrens changed the title Add 'zpool import --load-meta-thresh [n]' option Allow unlimited metadata errors during readonly import Oct 31, 2019
@ahrens ahrens changed the title Allow unlimited metadata errors during readonly import Allow metadata errors during readonly import Oct 31, 2019
@ahrens ahrens added Status: Inactive Not being actively updated and removed Status: Code Review Needed Ready for review and testing labels Nov 15, 2019
@smokris
Copy link
Contributor Author

smokris commented Dec 2, 2019

@behlendorf, thanks for the feedback, and my apologies for the delay. I just pushed a revised commit that rebases against latest master and incorporates your changes.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for refreshing this. The updated version makes good sense to me. Let's just run it by a few more reviewers.

@behlendorf behlendorf added Status: Code Review Needed Ready for review and testing and removed Status: Inactive Not being actively updated Status: Revision Needed Changes are required for the PR to be accepted labels Dec 2, 2019
@pzakha
Copy link
Contributor

pzakha commented Dec 2, 2019

One issue I see with this is that this instead of automatically rewinding to a safe TXG (one of the 3 latest txgs) that doesn't have metadata errors, it will proceed with the txg that has errors. We could also end up having a critical section of metadata being corrupted, although given that the pool is read-only it should generally not go into a VERIFY() code-path and cause a panic.

The example that is presented in this bug seems to be specific to a particular on-disk corruption recovery scenario. We already have spa_load_verify_dryrun and spa_load_verify_metadata tunables that can be set temporarily while importing the broken pool, so why not just rely on those?

@ahrens
Copy link
Member

ahrens commented Dec 3, 2019

@smokris Could you update the first comment of the PR to reflect the design changes made? i.e. there's no new --load-meta-thresh option.

@smokris
Copy link
Contributor Author

smokris commented Dec 3, 2019

@pzakha, my hope with this PR is to make it more straightforward to recover data from a pool corrupted in this way, by simply being able to import it readonly without having to find and use tunables.

@ahrens, sure, I just updated it.

@pzakha
Copy link
Contributor

pzakha commented Dec 3, 2019

@smokris Yeah, I understand. My main worry is the situation I've outlined:

One issue I see with this is that this instead of automatically rewinding to a safe TXG (one of the 3 latest txgs) that doesn't have metadata errors, it will proceed with the txg that has errors.

This could cause a non-corrupted pool to actually report errors when imported in read-only mode if there were issues with the very last txg. I don't know how much this happens in practice, but I feel like I did see it happen before.

Looking at spa_load_best(), we have this safe_rewind_txg logic which would not apply anymore in the case where we have metadata errors:

	safe_rewind_txg = spa->spa_last_ubsync_txg - TXG_DEFER_SIZE;
	min_txg = (rewind_flags & ZPOOL_EXTREME_REWIND) ?
	    TXG_INITIAL : safe_rewind_txg;

	/*
	 * Continue as long as we're finding errors, we're still within
	 * the acceptable rewind range, and we're still finding uberblocks
	 */
	while (rewind_error && spa->spa_uberblock.ub_txg >= min_txg &&
	    spa->spa_uberblock.ub_txg <= spa->spa_load_max_txg) {
		if (spa->spa_load_max_txg < safe_rewind_txg)
			spa->spa_extreme_rewind = B_TRUE;
		rewind_error = spa_load_retry(spa, state);
	}

@smokris
Copy link
Contributor Author

smokris commented Dec 11, 2019

[pzakha] This could cause a non-corrupted pool to actually report errors when imported in read-only mode if there were issues with the very last txg.

Ah, I see — it's a question of which is less of two evils: rewinding TXGs (potentially losing the latest changes), or ignoring metadata errors (potentially making the pool unusable). Since the latter is dangeous, I think it might be better to keep the current default behavior (rewinding TXGs), and just use spa_load_verify_metadata to override it if needed.

So the primary change in this pull request (ignore metadata errors in readonly mode in spa.c) is unneeded. The secondary change in vdev_mirror.c (avoiding null dereference) may still be beneficial though. I've modified the PR to include only the secondary change.

@smokris smokris changed the title Allow metadata errors during readonly import Avoid some crashes when importing a pool with corrupt metadata Dec 11, 2019
@codecov
Copy link

codecov bot commented Dec 11, 2019

Codecov Report

Merging #9022 into master will decrease coverage by <1%.
The diff coverage is 33%.

Impacted file tree graph

@@           Coverage Diff            @@
##           master    #9022    +/-   ##
========================================
- Coverage      80%      80%   -<1%     
========================================
  Files         385      385            
  Lines      121470   121475     +5     
========================================
- Hits        96756    96617   -139     
- Misses      24714    24858   +144
Flag Coverage Δ
#kernel 80% <33%> (ø) ⬇️
#user 67% <33%> (ø) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c587b2c...5489c80. Read the comment docs.

module/zfs/vdev_mirror.c Outdated Show resolved Hide resolved
module/zfs/vdev_mirror.c Outdated Show resolved Hide resolved
- Skip invalid DVAs when importing pools in readonly mode
  (in addition to when the config is untrusted).

- Upon encountering a DVA with a null VDEV, fail gracefully
  instead of panicking with a NULL pointer dereference.

Signed-off-by: Steve Mokris <smokris@softpixel.com>
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Dec 26, 2019
@behlendorf behlendorf merged commit d5c97f3 into openzfs:master Dec 26, 2019
allanjude pushed a commit to allanjude/zfs that referenced this pull request Dec 28, 2019
- Skip invalid DVAs when importing pools in readonly mode
  (in addition to when the config is untrusted).

- Upon encountering a DVA with a null VDEV, fail gracefully
  instead of panicking with a NULL pointer dereference.

Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Steve Mokris <smokris@softpixel.com>
Closes openzfs#9022
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Jan 2, 2020
- Skip invalid DVAs when importing pools in readonly mode
  (in addition to when the config is untrusted).

- Upon encountering a DVA with a null VDEV, fail gracefully
  instead of panicking with a NULL pointer dereference.

Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Steve Mokris <smokris@softpixel.com>
Closes openzfs#9022
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Jan 7, 2020
- Skip invalid DVAs when importing pools in readonly mode
  (in addition to when the config is untrusted).

- Upon encountering a DVA with a null VDEV, fail gracefully
  instead of panicking with a NULL pointer dereference.

Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Steve Mokris <smokris@softpixel.com>
Closes openzfs#9022
tonyhutter pushed a commit that referenced this pull request Jan 23, 2020
- Skip invalid DVAs when importing pools in readonly mode
  (in addition to when the config is untrusted).

- Upon encountering a DVA with a null VDEV, fail gracefully
  instead of panicking with a NULL pointer dereference.

Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Steve Mokris <smokris@softpixel.com>
Closes #9022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants