-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implement corruption correcting recv #9372
Conversation
d17126b
to
4312e3e
Compare
This feature is indeed really nice to have. However, I am curious about whether it would - even theoretically - be possible to heal metadata using a corrective receive. Support for that would definitely be a killer feature. |
I agree that it would be great to be able to heal metadata but as far as I know, there isn't enough information in the send file to do that. |
60e6d08
to
07a01dc
Compare
9cb189c
to
94deaf5
Compare
I've fixed the re-encryption code so this is ready for review now. |
Codecov Report
@@ Coverage Diff @@
## master #9372 +/- ##
========================================
- Coverage 80% 79% -<1%
========================================
Files 384 384
Lines 121788 122069 +281
========================================
- Hits 96900 96897 -3
- Misses 24888 25172 +284
Continue to review full report at Codecov.
|
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <pcd@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Zuchowski <pzuchowski@datto.com> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes openzfs#9372
Would this fix make it to release soon? |
|
This feature will make it in to the OpenZFS 2.2 release. |
This is a huge improvement. The last missing piece for my use case with offsite backup, no RaidZ and having low throughput link which makes it difficult to recreate datasets with ease in case of errors. Quite certain i am not the only one on this boat. |
@pepsinio you're not the only one on that boat. I follow this topic since before the PR was opened. no WAN use case but the generally understanding that this is a major resilience feature that stabilizes many related bits and bytes. |
New features: - Fully adaptive ARC eviction (openzfs#14359) - Block cloning (openzfs#13392) - Scrub error log (openzfs#12812, openzfs#12355) - Linux container support (openzfs#14070, openzfs#14097, openzfs#12263) - BLAKE3 Checksums (openzfs#12918) - Corrective "zfs receive" (openzfs#9372) Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
This patch implements a new type of zfs receive: corrective receive (-c). This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a sendfile for example).
Metadata can not be healed using a corrective receive.
This patch enables us to receive a send stream into an existing snapshot for the purpose of correcting data corruption.
This is the updated version of the patch in #9323
Motivation and Context
In the past in the rare cases where ZFS has experienced permanent data corruption, full recovery of the dataset(s) has not always been possible even if replicas existed.
This patch makes recovery from permanent data corruption possible.
Description
For every write record in the send stream, we read the corresponding block from disk and if that read fails with a checksum error we overwrite that block with data from the send stream.
After the data is healed we reread the block to make sure it's healed and remove the healed blocks form the corruption lists seen in zpool status.
To makes sure will have correctly matched the data in the send stream to the right dataset to heal there is a restriction that the GUID for the snapshot being received into must match the GUID in the send stream. There are likely several snapshots referring to the same potentially corrupted data so there may be many snapshots with the above condition holding that are able to heal a single block.
The other thing to point out is that we can only correct data. Specifically, we are only able to heal records of type DRR_WRITE.
To help with the review you can see my OpenZFS dev summit 2019 talk for more context on this work:
video: https://www.youtube.com/watch?v=JldbtDATrOo
slides: https://drive.google.com/file/d/1Ysc_3bJWmsJCETFNTRCzyvpseDpzjjf2/view
How Has This Been Tested?
I've been running unit testing very similar to the test that I've added to the zfs-tests
Future Work
Since DRR_SPILL record also (like DRR_WRITE) contains all of the data needed to recreate the damaged block - a future project could add support for healing of DRR_SPILL records.
The next logical extension for part two of this work is to provide a way for a corrupted pool to tell a backup system to generate a minimal send stream in such a way as to enable the corrupted pool to be healed with this generated send stream.
The interface could be something like the following, but maybe there are better suggestions?
Types of changes
Checklist:
Signed-off-by
.