We Ain't Same

Find duplicate images using perceptual hashing algorithms.

  • Calculate perceptual hashes using the ImageHash library.
  • Store and restore precalculated hashes.
  • Search directories recursively for image files.
  • Detect duplicates.
  • Organize images into groups of duplicates.
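
The detection and grouping steps listed above can be sketched roughly as follows. This is a minimal Python illustration, not the project's actual code: it assumes 64-bit hashes (as the JSON sample in the Example section suggests), measures similarity as the percentage of matching bits, and uses a hypothetical 90% threshold. The names `similarity`, `group_duplicates`, and `threshold` are made up for this sketch.

```python
from __future__ import annotations

from itertools import combinations

HASH_BITS = 64  # assumption: hashes are 64-bit unsigned integers


def similarity(a: int, b: int) -> float:
    """Percentage of matching bits between two 64-bit hashes."""
    differing = bin(a ^ b).count("1")  # Hamming distance
    return 100.0 * (HASH_BITS - differing) / HASH_BITS


def group_duplicates(hashes: dict[str, int], threshold: float = 90.0) -> list[list[str]]:
    """Group paths whose hashes are at least `threshold` percent similar.

    Union-find merges chains of similar images (A~B, B~C) into one group.
    The 90% default is a hypothetical value, not taken from the project.
    """
    parent = {path: path for path in hashes}

    def find(path: str) -> str:
        while parent[path] != path:
            parent[path] = parent[parent[path]]  # path halving
            path = parent[path]
        return path

    # Brute-force comparison of every pair of images.
    for p, q in combinations(hashes, 2):
        if similarity(hashes[p], hashes[q]) >= threshold:
            parent[find(p)] = find(q)

    groups: dict[str, list[str]] = {}
    for path in hashes:
        groups.setdefault(find(path), []).append(path)
    # Only groups with more than one member count as duplicates.
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

Under this definition, two hashes that differ in 36 of 64 bits are 43.75% similar, which matches the minimum similarity shown in the output log of the Example section.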

Performance

Measured on 1186 JPEG files in total, Release configuration.

  • 17.2 sec to pre-compute hashes.
  • 0.11 sec to search for duplicates among pre-hashed images.
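
To put the search time in context, the short Python sketch below counts how many hash comparisons a brute-force search would need. It assumes every pair of images is compared once. The README does not say which search strategy is used, so this is only an estimate, and `pair_count` is a name invented for this sketch.

```python
from math import comb


def pair_count(n_images: int) -> int:
    """Number of unordered image pairs, i.e. n * (n - 1) / 2."""
    return comb(n_images, 2)


# 1186 images give 702705 pairwise comparisons.
print(pair_count(1186))
```

Each comparison is one XOR and one bit count on 64-bit integers, which is consistent with the search being far cheaper than hashing the image files.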

Example

samples002

Output log:

+++ Computing hashes...
+ C:\Users\Pictures\Samples2\bing20221129.jpg
+ C:\Users\Pictures\Samples2\fireworks.jpg
+ C:\Users\Pictures\Samples2\mars.jpg
+ C:\Users\Pictures\Samples2\mount-copy.jpg
+ C:\Users\Pictures\Samples2\mount-rotated-2degree.jpg
+ C:\Users\Pictures\Samples2\mount-small.jpg
+ C:\Users\Pictures\Samples2\mountains.jpg

+++ Chasing duplicates...

+++ Similarity: max= 100% / min= 43,75%

+++ Duplicate Groups (1):
Group #0
        mount-copy.jpg
        mount-rotated-2degree.jpg
        mount-small.jpg
        mountains.jpg

+++ TOTAL: 00:00:00.8170144

Precalculated hashes are stored as a JSON file:

[
    {
        "Path": "C:\\Users\\Pictures\\Samples2\\bing20221129.jpg",
        "Hash": 11695141823225099355
    },
    {
        "Path": "C:\\Users\\Pictures\\Samples2\\fireworks.jpg",
        "Hash": 10721035060630703339
    },
    // ...
]
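
A file in this layout can be written and read back with a few lines of Python, shown here as a hedged sketch, not the project's own serialization code. The `Path` and `Hash` field names come from the sample above. The function names are hypothetical. The `// ...` line in the sample is an elision, so a real file is assumed to be plain JSON without comments.

```python
import json
from pathlib import Path


def save_hashes(hashes: dict, file: Path) -> None:
    """Write {path: hash} pairs as a list of {"Path", "Hash"} records."""
    records = [{"Path": path, "Hash": value} for path, value in hashes.items()]
    file.write_text(json.dumps(records, indent=4), encoding="utf-8")


def load_hashes(file: Path) -> dict:
    """Read the records back into a {path: hash} dictionary."""
    records = json.loads(file.read_text(encoding="utf-8"))
    return {record["Path"]: record["Hash"] for record in records}
```

The sample values exceed the signed 64-bit range, so they are unsigned 64-bit integers. Python integers hold them exactly, but a JSON parser that maps numbers to double-precision floats would lose the low bits.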

Links