Wrong b2sum stored in xattr #439
Comments
Does this happen reliably, i.e. do the same files always get the wrong xattr? Also, does it fail if you copy those files into a common directory and run rmlint on them there?
I tried to make a minimal reproduction but haven't managed to so far. It was definitely the case for a lot of .ARW (raw camera image) files, but I didn't try running rmlint with xattr on just a pair of files where it had happened before (only without --xattr).
I am seeing the same thing as reported here and in the other mentioned issue, #436. In my case nothing else is involved, like ntfs, wsl, Raspberry Pi, etc. It's a plain x86 box (albeit with only 8GB RAM and quite a few files) with one single drive (non-root and not much used, just archival for pics and videos): a 6TB btrfs volume. All native with Ubuntu 20.10. Small runs store the expected b2sum in user.rmlint.blake2b.cksum, but the large/complete run on the volume (which I've done only once) created, as far as I can tell, only wrong checksums. The files haven't changed in any way, most of them for more than 10 years (I have md5sums for lots of them, and indeed they didn't change). I don't have ECC RAM, but the box is really stable and has some other btrfs volumes, and I do quite a bit of file shuffling; if there were some hardware issue that could corrupt checksums, btrfs (and not only btrfs) would cry bloody murder. As I can't reproduce it with a few files, I'll have to wipe the xattrs and try again for the whole volume (this will take quite a while).
Ok, I've found the problem - it's coming from -C and more specifically from --write-unfinished.
Indeed, half-finished files write checksums and I agree that this is pretty confusing. Since the next I will decide later what to do with Can you guys check if this fixes it for you?
Wow that's a really obvious reason for this to be the case, nice :) I thought it was some complicated bug. I did not think about
This works: no unfinished checksum is written, at least for the simple cases I can easily control and where I was able to reproduce it previously (just tried with the latest from the develop branch). Thanks a lot! I think some users who already have the bad checksums will need some cleanup. I ended up with a veeeery crude and probably easy to break:
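Independently of the one-liner mentioned above, here is a minimal sketch of such a cleanup for anyone stuck with stale checksums. It is not the commenter's script. It assumes a Linux system, and that everything rmlint stores lives under the `user.rmlint.` prefix (only `user.rmlint.blake2b.cksum` is confirmed in this thread; the prefix rule is an assumption). The function and parameter names are made up for illustration.

```python
import os

# Assumption: all attributes written by rmlint start with this prefix.
RMLINT_PREFIX = "user.rmlint."


def rmlint_attrs(names):
    """Return only the attribute names that look like rmlint's."""
    return [n for n in names if n.startswith(RMLINT_PREFIX)]


def clear_rmlint_xattrs(root, listxattr=None, removexattr=None, dry_run=True):
    """Walk `root` and remove rmlint xattrs from regular directory entries.

    With dry_run=True (the default) nothing is removed; the function only
    reports what it would remove. Returns a list of (path, attribute) pairs.
    """
    # os.listxattr / os.removexattr exist on Linux only; they can be
    # swapped out for testing.
    listxattr = listxattr or os.listxattr
    removexattr = removexattr or os.removexattr
    cleared = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                attrs = rmlint_attrs(listxattr(path, follow_symlinks=False))
            except OSError:
                # Unreadable file or filesystem without xattr support: skip.
                continue
            for attr in attrs:
                if not dry_run:
                    removexattr(path, attr, follow_symlinks=False)
                cleared.append((path, attr))
    return cleared


if __name__ == "__main__":
    # Dry run over the current directory; pass dry_run=False to really delete.
    for path, attr in clear_rmlint_xattrs(".", dry_run=True):
        print(f"would remove {attr} from {path}")
```

Running it as a dry run first and spot-checking a few paths with getfattr seems prudent before removing anything on a large archive.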
In https://github.com/sahib/rmlint/tree/develop branch, So now And Both options are slower than the previous Closing issue. Feel free to re-open if you want to continue the conversation.
Credit to Chris (@sahib) for the original fix. This version only skips writing checksums when -q/-Q are used, and sets clamp_is_used for both absolute and relative clamping. There are still false-positives and false-negatives lurking in this logic, but I'd rather fix features later than disable them now. --write-unfinished can provide a significant speedup. Related to #439
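To illustrate the idea behind that commit message, here is a toy sketch in Python. It is not rmlint's actual C code: `clamped_digest` and `digest_to_store` are hypothetical names, and the in-memory byte string stands in for a file. The point is only that a digest over a clamped range differs from the whole-file digest, so it should not be stored under the same attribute name.

```python
import hashlib


def clamped_digest(data: bytes, start: int = 0, end=None) -> str:
    """blake2b over data[start:end], a stand-in for hashing a clamped file."""
    end = len(data) if end is None else end
    return hashlib.blake2b(data[start:end]).hexdigest()


def digest_to_store(data: bytes, start: int = 0, end=None):
    """Return the digest that is safe to cache, or None if clamping is active.

    Hypothetical guard: a checksum is only worth caching when it covers the
    whole file, because a later unclamped run cannot tell the two apart.
    """
    end = len(data) if end is None else end
    clamp_is_used = start > 0 or end < len(data)
    if clamp_is_used:
        return None
    return clamped_digest(data, start, end)


if __name__ == "__main__":
    data = b"raw image bytes " * 64
    print(digest_to_store(data) is not None)             # whole file: cacheable
    print(digest_to_store(data, start=128))              # clamped: None
    print(clamped_digest(data, 128) == clamped_digest(data))  # digests differ
```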
I recently noticed that rmlint wasn't finding some duplicate files when using --xattr, but it was finding them without it. I used getfattr to look at the values, and one of each pair of files had a wrong user.rmlint.blake2b.cksum= stored in the file.
One reason I can think of is that when using -Q/-q in conjunction with --xattr, the wrong blake2b sum is stored in the xattrs with the same attribute name, which cannot be detected later, so you get false negatives.
I couldn't find any command in my history that actually ran rmlint -q on these files, but it's the only thing I could think of, and I verified that it can happen. Are there any other ways wrong hashes in xattr could happen? These are image files so they are basically immutable, I never modify them, and they have the same mtime.
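For checking which files carry a bad value, a rough sketch like the following may help. It assumes Linux, and that the attribute holds the hex digest that b2sum would print (the thread says small runs store "the expected b2sum" in `user.rmlint.blake2b.cksum`). The exact on-disk encoding, e.g. a trailing NUL byte, is an assumption, so the stored value is normalized before comparing. All function names are illustrative.

```python
import hashlib
import os

CKSUM_ATTR = "user.rmlint.blake2b.cksum"  # attribute name taken from the thread


def file_b2sum(path, chunk_size=1 << 20):
    """Whole-file BLAKE2b-512 hex digest (the b2sum default), read in chunks."""
    h = hashlib.blake2b()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def stored_cksum(path, getxattr=None):
    """Return the stored checksum as a lowercase hex string, or None if absent."""
    getxattr = getxattr or os.getxattr  # Linux only; injectable for testing
    try:
        raw = getxattr(path, CKSUM_ATTR)
    except OSError:
        return None
    # Assumption: value is ASCII hex, possibly NUL-terminated.
    return raw.rstrip(b"\0").decode("ascii", "replace").strip().lower()


def xattr_matches(path, getxattr=None):
    """True/False if a stored checksum exists and matches/mismatches, else None."""
    stored = stored_cksum(path, getxattr)
    if stored is None:
        return None
    return stored == file_b2sum(path)
```

A mismatch on a file whose mtime and content have not changed would point at a checksum written from a partial read, which is the case discussed in this issue.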