jasoncox576/Pikachu-Detection

Pikachu-Detection

A use of two dimensional cross-correlation in order to find a point of maximum similarity and detect one or multiple pikachu amongst several different pokemon

Another demonstration of the handiness of cross-correlation in finding similarities between images. This seemed like an easy enough project, provided that one could properly handle the image processing and figure out how to detect multiple instances of the template (Pikachu) within the image. The actual cross-correlation algorithm is deceptively simple to implement given the spooky-looking equation (https://en.wikipedia.org/wiki/Cross-correlation). Using the SciPy function correlate2d (scipy.signal.correlate2d) proved rather unfruitful, probably because the output size ended up being different, in all cases, from the size of the image (its default mode, 'full', returns an output larger than the input).

Another setback in implementing a correlation algorithm independently was the unusual occurrence that, when the array values were multiplied together, the parts of the image containing the most white space erroneously ended up with the highest correlation. The visual aid of 2D heatmaps made the cause clear: multiplying two numbers such as 150 * 255 will of course return a higher value than multiplying 150 * 150, so bright regions outscore true matches. Thus, a z-normalization was used to center the pixel values on zero, with most values landing roughly within [-1, 1]. This displayed immediate success, the reason being that when a positive value is multiplied by a (not so similar) pixel with a negative value, the product detracts from the correlation.
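The white-space problem and the effect of z-normalization can be reproduced on a toy example. The following is only a minimal sketch, not the repository's code: the function names `z_normalize` and `correlation_map`, the choice to normalize the image and the template each as a whole, and the output size (one score per valid top-left offset) are all assumptions made for illustration.

```python
import numpy as np

def z_normalize(a):
    """Rescale an array to zero mean and unit variance (assumed global, not per-window)."""
    a = np.asarray(a, dtype=np.float64)
    std = a.std()
    if std == 0:
        return np.zeros_like(a)  # a flat array carries no pattern to correlate
    return (a - a.mean()) / std

def correlation_map(image, template, normalize=True):
    """Slide the template over the image; return one score per top-left offset."""
    image = np.asarray(image, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    if normalize:
        image = z_normalize(image)
        template = z_normalize(template)
    th, tw = template.shape
    ih, iw = image.shape
    out = np.empty((ih - th + 1, iw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            # Element-wise product of the window and the template, summed
            out[y, x] = np.sum(image[y:y + th, x:x + tw] * template)
    return out

# Hypothetical toy data: a small gray/black pattern on a white background
template = np.array([[0, 150, 0],
                     [150, 150, 150],
                     [0, 150, 0]])
image = np.full((8, 8), 255)
image[4:7, 4:7] = template  # pattern placed with its top-left corner at (4, 4)

raw = correlation_map(image, template, normalize=False)
norm = correlation_map(image, template, normalize=True)
raw_peak = tuple(int(i) for i in np.unravel_index(np.argmax(raw), raw.shape))
norm_peak = tuple(int(i) for i in np.unravel_index(np.argmax(norm), norm.shape))
print("raw peak:", raw_peak)
print("normalized peak:", norm_peak)
```

In this toy case the raw products peak on an all-white window, while the normalized scores peak where the pattern was actually placed. Any all-white window scores about zero after normalization, because the z-normalized template sums to zero.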

To start, it made sense to try this out with a simple image that shows a pikachu in the middle of a white background and nothing else. Here are the 2D and 3D plots of the correlations returned (figures: simple2d, simple3d). In my humble opinion, it looks like something out of Mordor.

Next, it made sense to try something like moving pikachu to the top-left corner of the image, just as a sanity check. Here are the results (figures: leftcorner2d, leftcorner3d).

And here is the mapping of the correlation values of pikachu.bmp over image.bmp, which has three pikachus scattered amongst a bunch of other pokemon (figure: image2d). Of course, the 3D topographical map was a sight to behold (figure: image3d). One can clearly see three distinct maxima, representing the three pikachus shown in the image, as well as the bounding boxes, which wrap around the correct areas in the output image (figure: out).
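One way to turn several maxima into bounding boxes is greedy peak picking with a relative threshold. The sketch below is an assumption about how that could be done, not the repository's implementation: `find_peaks`, `to_boxes`, the suppression window, and the toy correlation values are all hypothetical. It assumes the correlation map is indexed by the template's top-left offset and that its maximum is positive.

```python
import numpy as np

def find_peaks(corr, template_shape, threshold=0.99):
    """Return top-left (y, x) offsets scoring at least threshold * global max.

    After each pick, every offset whose box would overlap the picked box is
    suppressed, so one object does not produce several detections.
    """
    work = np.array(corr, dtype=np.float64)  # copy, so the input is untouched
    th, tw = template_shape
    cutoff = threshold * work.max()
    peaks = []
    while True:
        y, x = np.unravel_index(np.argmax(work), work.shape)
        if work[y, x] < cutoff:
            break
        peaks.append((int(y), int(x)))
        y0, y1 = max(0, y - th + 1), min(work.shape[0], y + th)
        x0, x1 = max(0, x - tw + 1), min(work.shape[1], x + tw)
        work[y0:y1, x0:x1] = -np.inf  # suppress overlapping offsets
    return peaks

def to_boxes(peaks, template_shape):
    """Convert top-left offsets to (left, top, right, bottom) boxes."""
    th, tw = template_shape
    return [(x, y, x + tw, y + th) for y, x in peaks]

# Made-up correlation map with three peaks of different strength
corr = np.zeros((10, 10))
corr[1, 1] = 100.0
corr[1, 7] = 99.5
corr[7, 4] = 92.0

strict = find_peaks(corr, (3, 3), threshold=0.99)
loose = find_peaks(corr, (3, 3), threshold=0.91)
print(strict, to_boxes(strict, (3, 3)))
print(loose, to_boxes(loose, (3, 3)))
```

With these made-up values, the stricter threshold keeps two peaks and the looser one keeps all three, which shows how sensitive the detection count is to that single parameter.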




Finally, the results of correlation over nopikachu.png, which contained a charmander, squirtle and bulbasaur but no pikachu (figures: nopikachu2d, nopikachu3d, out). Unsurprisingly, no major peaks can be seen, and the point of maximum correlation is somewhere in between charmander and bulbasaur.


Out of curiosity, it seemed a good idea to test the efficacy of the same algorithm with a blurred image. A mean blur was used, in which the value of every pixel is replaced by the average of the pixel values in the X-by-X neighborhood around it, X being the kernel size. For this particular test, a blur kernel size of 3x3 was used.
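For reference, a mean blur of this kind can be sketched in a few lines of NumPy. This is an illustrative version only: the function name `mean_blur` and the choice to replicate edge pixels at the border are assumptions, since the write-up does not say how the image borders were handled.

```python
import numpy as np

def mean_blur(image, k=3):
    """Replace each pixel by the mean of its k-by-k neighborhood (k odd)."""
    if k % 2 == 0:
        raise ValueError("kernel size must be odd")
    image = np.asarray(image, dtype=np.float64)
    pad = k // 2
    # Assumption: borders are handled by repeating the nearest edge pixel
    padded = np.pad(image, pad, mode="edge")
    h, w = image.shape
    out = np.zeros_like(image)
    # Sum the k*k shifted copies of the image, then divide by the kernel area
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

# A single bright pixel spreads evenly over its 3x3 neighborhood
spot = np.zeros((5, 5))
spot[2, 2] = 9.0
blurred = mean_blur(spot, k=3)
print(blurred)
```

A larger `k` averages over a wider neighborhood, which flattens the fine detail that the template match depends on.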

(Figures: imageblur2d, imageblur3d.) It looks like the blur did a fair bit of damage. In the output file (figure: out), it is apparent that neither of the other two pikachus passed the correlation threshold required to be considered another max (99%). Upon turning the threshold down to ~91%, all three are caught. Not a major problem, but one big enough to warrant taking note of.

As the size of the blur kernel increases, the blur becomes more intense and cross-correlation begins to fall apart altogether. At that point, it is likely that the problem has shifted from the domain of cross-correlation, a simple yet effective technique applicable only in certain domains, to something like convolutional neural networks, which prove more effective for extreme image distortions.


I hope you enjoyed reading this writeup. Now you can clone this repository and find pikachu for yourself!
