Differing files with common prefix detected as duplicates #31

Open
TWAC opened this issue Jun 22, 2017 · 6 comments
@TWAC

TWAC commented Jun 22, 2017

Tested on Ubuntu 16.04.

#!/bin/sh
git clone https://github.com/ssokolow/fastdupes
cd fastdupes
mkdir files
seq 100000 > files/file1; echo "1" >> files/file1
seq 100000 > files/file2; echo "2" >> files/file2
cmp files/file1 files/file2
python fastdupes.py files

Cloning into 'fastdupes'...
remote: Counting objects: 279, done.
remote: Total 279 (delta 0), reused 0 (delta 0), pack-reused 279
Receiving objects: 100% (279/279), 93.39 KiB | 0 bytes/s, done.
Resolving deltas: 100% (116/116), done.
Checking connectivity... done.
files/file1 files/file2 differ: byte 588896, line 100001
Found 2 files to be compared for duplication.        
Found 1 sets of files with identical sizes. (2 files examined)             
Found 1 sets of files with identical header hashes. (2 files examined)             
Found 1 sets of files with identical hashes. (2 files examined)             
/tmp/fastdupes/files/file2
/tmp/fastdupes/files/file1
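
For anyone who wants to check the test case without `seq` and `cmp`, here is a minimal Python sketch of the same setup. The helper names `make_test_files` and `first_difference` are made up for illustration and are not part of fastdupes. It writes two files that share a long common prefix and differ only in the last line, then reports the first differing byte.

```python
import os


def make_test_files(directory, count=100000):
    """Write file1 and file2: the numbers 1..count, one per line,
    followed by a final line that differs ("1" vs "2")."""
    body = "".join(f"{i}\n" for i in range(1, count + 1))
    paths = []
    for suffix in ("1", "2"):
        path = os.path.join(directory, f"file{suffix}")
        # newline="\n" keeps the byte count identical on every platform.
        with open(path, "w", newline="\n") as handle:
            handle.write(body + suffix + "\n")
        paths.append(path)
    return paths


def first_difference(path_a, path_b, chunk_size=65536):
    """Return the 1-based offset of the first differing byte,
    or None if the two files are byte-for-byte identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a, chunk_b = a.read(chunk_size), b.read(chunk_size)
            if chunk_a != chunk_b:
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i + 1
                # One file is a strict prefix of the other.
                return offset + min(len(chunk_a), len(chunk_b)) + 1
            if not chunk_a:
                return None
            offset += len(chunk_a)
```

With the default `count`, both files come out at 588897 bytes and `first_difference` returns 588896, which matches the `cmp` output above. The files have the same size and the same leading bytes, so the size and header-hash stages are expected to group them; only a hash over the whole file can tell them apart.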
@ssokolow
Owner

Ugh. I hate these kinds of bugs that indicate I somehow managed to fail to provide the kind of safety guarantee I thought I had.

The last few days have been busy, but I'll try to track this down as soon as possible.

@ssokolow
Owner

I'm currently fighting off a summer cold, so it'll be a little while before I get to this. Sorry for the delay.

@TWAC TWAC mentioned this issue Jun 26, 2017
@TWAC
Author

TWAC commented Jun 26, 2017

I understand. Anyway, I looked into it, and the normal (full-file) hashing is behaving like the header hashing; see the pull request.
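
To illustrate the difference being described, here is a rough sketch of a header hash, a full hash, and one hypothetical way a full hash could end up behaving like a header hash. This is not the fastdupes code. The function names, the SHA-1 choice, and both size constants are assumptions made for the example, and the actual cause is whatever the pull request identifies.

```python
import hashlib

CHUNK_SIZE = 65536  # assumed read size, not taken from fastdupes
HEAD_SIZE = 16384   # assumed header length, not taken from fastdupes


def header_hash(path, head_size=HEAD_SIZE):
    """Hash only the first head_size bytes: a cheap pre-filter."""
    with open(path, "rb") as handle:
        return hashlib.sha1(handle.read(head_size)).hexdigest()


def full_hash(path, chunk_size=CHUNK_SIZE):
    """Hash the entire file by reading chunks until EOF."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def truncated_hash(path, chunk_size=CHUNK_SIZE):
    """Hypothetical buggy variant: reads a single chunk and stops,
    so files that differ only after that chunk hash the same."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        digest.update(handle.read(chunk_size))
    return digest.hexdigest()
```

For two files that agree on their first `CHUNK_SIZE` bytes and differ later, `header_hash` and `truncated_hash` both report a match while `full_hash` does not. That is the pattern in the report: the "identical hashes" stage still grouped two files that `cmp` shows to differ.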

@ssokolow
Owner

Thanks.

I woke up today with no more traditional symptoms, but no mental capacity either, so I'll review it once that clears up.

@ssokolow ssokolow added the bug label Jul 7, 2017
@ssokolow
Owner

ssokolow commented Jul 7, 2017

OK, I'm back on my feet, but still catching up on things that slipped. Hopefully, I'll have this fixed within the next few days.

@ssokolow
Owner

Ok, I'm back. Sorry for the silence.

Please continue discussion under PR #32.
