zfs send --dedup corrupts some files #7703
Comments
Oh boy. I don't suppose this is on a system where you can try running older ZoL versions and do a git bisect to figure out if there was a point in the past where this didn't (metaphorically) catch fire? (I wouldn't suggest doing anything else with the pools while doing such testing, and if you ended up going back to 0.6.5.X releases to test, you might end up either with a read-only import or reaching a point in the past where the pool won't import because of newer feature flags.)

Do you know if using the "good" command to send foo@mysnap | recv bar/baz and then using the "bad" command to send bar/baz@mysnap | recv bar/try2 would also produce a bad stream? (That is, I'm curious whether the problem is still reproducible after a "good" send/recv, which might mean there's something strange about the way the data is written on the old pool, or whether it reproduces itself on recv, which suggests a different kind of problem.)

What's the output of zpool get all on oldpool and newpool, and zfs get all on oldpool/dataset and wherever on newpool you're receiving it into?

I'm kind of curious whether the patch in #7701 might be useful, though on second glance I don't think so, since the problem described there would probably be a race, not something 100% reproducible. If it's always mangling the last block of a file, that's...very strange.

Hm. I don't suppose you would be able to share the respective good/bad send streams?
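For concreteness, the suggested experiment would look something like the following (names as given above; treating -D/--dedup as the "bad" flag is an assumption based on the issue title):

```sh
# "good" command: plain send/recv of the original dataset
zfs send foo@mysnap | zfs recv bar/baz
# "bad" command: deduplicated send of the freshly received copy;
# if bar/try2 also comes out mangled, the problem follows the recv
zfs send -D bar/baz@mysnap | zfs recv bar/try2
```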
Too many questions, let me try to answer some...

First, the source pool contains my "life" in data. That's why I ran md5sum on every file after transferring. So, I do not want to play with it before everything is fully transferred to the new drive and that drive becomes the main system drive. After that, I may try something with the old data.

I already know (from zdb -ccc) that there are some lost free blocks in the old volume. I could not finish testing the new volume: zdb always dies with SIGSEGV, and the higher the number of inflight I/Os, the sooner it dies. Also, this was just a first transfer. While doing a second, incremental, transfer, I got a SIGSEGV right at the end of
Doing a full dataset transfer takes some time, especially because I use
Hold that idea. I'll try it as soon as the latest md5sum run (after zfs send -I) finishes.
I could only find one unexpected difference: in fact, I used to run
I also think #7701 would not fix this, since it is meant to fix a crash, not data corruption. But it COULD be a race. Note that two copies of the same dataset sent with --dedup came out corrupted, but not in the same files...
Also: it's not always the same files, and not every file with the same deduped content is affected... Typical of a race or a lost pointer somewhere.
All datasets contain very private data, so I prefer not to share them.
@dioni21 Okay, so you can't share the datasets. That's fine. Can you share the output of zdb -vvvvv oldpool/dataset [file id] versus zdb -vvvvv newpool/dataset [corrupted file id] versus zdb -vvvvv newpool/dataset [deduped file that is not corrupted id]? (Specifically, I'm looking for the output for a file that's mangled on the receiver side, for a file that shares blocks with the first one but isn't mangled on the receiver side, and the same information for those two files on oldpool.)

What do you mean, "lost free blocks in the old volume"?

Also, it seems likely that zfs send without -D would save you a bunch of time, particularly if you're receiving it into a dataset with dedup+verify enabled anyway, though you obviously wouldn't have run across this problem if you weren't using -D, so it's a bit of a mixed bag. :)
@rincebrain As you may have noticed in another issue (#7723), I've probably been hit by a recent kernel bug that generates panics during heavy I/O. AFAIK, my destination pool has been corrupted into a state where merely loading the zfs driver into memory is enough to panic the host. I had to wipe one of the new disks to start copying again, but I still have the other untouched for a post-mortem. Right now I am spending my (sleepless) nights trying to fix this mess.
Not right now. If I can make my system stop panicking and restart the copy, I'll try to redo the bug analysis.
I did not understand what you want. If they were text files, I would send a diff. But how can I compare two zfs send streams? Especially considering that the contents would be very different: deduped data will be sent many times without --dedup, if I understood correctly.
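One way to compare streams without sharing their contents (an aside, not something suggested in the thread) is to dump each stream's record headers to text with zstreamdump and diff those; the snapshot names below are placeholders:

```sh
# record-level metadata (object numbers, offsets, record types) is
# comparable even when the raw stream bytes differ
zfs send oldpool/dataset@snap | zstreamdump -v > plain-stream.txt
zfs send -D oldpool/dataset@snap | zstreamdump -v > dedup-stream.txt
diff plain-stream.txt dedup-stream.txt | less
```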
That's the output of zdb -ccc. I'm not sure if that is the exact message, but I think it is reported by the leak tracing and space map checks described in the zdb(8) manual.
As an end user, my expectation was that the receiving side should not redo dedup/compress/etc. verification if the source already sends the processed data. Indeed, maybe I should just have added the new disks to the pool and done the auto-expand procedure, but I wanted to try a big zfs send operation in practice.
@dioni21 What I wanted was the text output of zdb -vvvv [dataset] [object id] for two files, one that had deduplicated blocks and was mangled on the new pool, and one that was not, on each of the old and new pool. So if you have oldpool/dataset@snap1, and did zfs send -D [other flags] oldpool/dataset@snap1 | zfs recv newpool/dataset, and had two files, IntactFile which had deduplicated blocks but did not get mangled on newpool, and MangledFile which had deduplicated blocks but did get mangled in the send, I want:
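The list itself appears to have been lost from the page; presumably it was the four zdb invocations spelled out above, roughly:

```sh
# one dump per file per pool; a file's object id is its inode number,
# obtainable with e.g. `ls -i IntactFile`
zdb -vvvv oldpool/dataset <IntactFile object id>
zdb -vvvv oldpool/dataset <MangledFile object id>
zdb -vvvv newpool/dataset <IntactFile object id>
zdb -vvvv newpool/dataset <MangledFile object id>
```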
(It would be even more convenient if IntactFile and MangledFile both shared the same blocks on oldpool, but that is not a requirement.)

I believe zfs recv can't generally reuse the dedup table info from a send stream - for example, if you're using one of the salted checksums (or even just a different checksum type), the checksums would vary between src and dst but still refer to the same data block, to say nothing of all the blocks that are going to vary above the actual "data" blocks.

I think it'll be faster when bandwidth is the bottleneck with zfs send -D because it'll only be sending one copy of the relevant data block, and I believe send -D will go faster if the source is using e.g. sha256 because it can then just use the existing calculated checksum rather than redoing it, but other than that, no. (It's worth noting that you can zfs send -D and recv into a pool with dedup=off without ending up with a dedup table, which is useful for some people when bandwidth is much more limited than CPU time.)
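A sketch of that last point, with placeholder pool and dataset names (not commands from the thread):

```sh
# destination inherits dedup=off; the -D stream still receives fine,
# each deduplicated block is simply written out in full and no DDT is built
zfs set dedup=off backuppool
zfs send -D tank/data@snap | zfs recv backuppool/data
```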
System information
Describe the problem you're observing
While transferring data to a new pool, I found some corrupted files. These files were only found because I ran md5sum on every file on both the source and destination pools. No error was generated during the send/recv procedure.
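That verification would look something like the following (a sketch with hypothetical mountpoints, not the reporter's actual commands):

```sh
# hash every file on both sides, sort by path, and compare the lists
(cd /oldpool/dataset && find . -type f -print0 | xargs -0 md5sum | sort -k 2) > old.md5
(cd /newpool/dataset && find . -type f -print0 | xargs -0 md5sum | sort -k 2) > new.md5
diff old.md5 new.md5
```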
Apparently, only the final part (block?) of each corrupted file is junk: content from somewhere else. All corrupted files had deduped counterparts, but not all deduped files were affected. Indeed, I found an example file which had 4 copies, but only one was corrupted.
Maybe related to #2210 and #3066, but not sure...
Describe how to reproduce the problem
This operation generates a bad dataset:
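(The command block did not survive extraction; per the issue title, it was plausibly of this form, with placeholder names:)

```sh
# reconstructed, not verbatim: the defining flag is --dedup (-D)
zfs send -D oldpool/dataset@snap | zfs recv newpool/dataset
```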
This operation generates a good dataset:
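(Likewise reconstructed: the same send without --dedup:)

```sh
# reconstructed, not verbatim: identical pipeline minus -D
zfs send oldpool/dataset@snap | zfs recv newpool/dataset
```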
Include any warning/errors/backtraces from the system logs
No errors found in the system logs.