
2. Model and Engine


Storymap

(Story map images: Story Map, Story Map (1), Story Map (2))

Engine Architecture

Makeskis Engine Architecture

User Front-End and Handling Camera Access

Users are assigned a unique UserID based on their device for DB querying. Users will be able to choose saved puzzles from their previous history or create a new puzzle. To take a picture of a new puzzle, users must grant device camera access on first use. Afterward, users can use their camera to submit images of the specific puzzle piece whose position they are trying to find. Users can also take an image of what they have completed so far for the puzzle.

Users will be returned their puzzle image with the region containing the piece circled. The size of the region is based on the chosen difficulty level (smaller region for easy, larger region for hard).

Main Handler

The main handler will receive information from the front-end and back-end and move information between the two interfaces. The main handler is also responsible for running the client-side processes of our application. It is where the main thread of our program will execute.

Back-End

On our backend, we handle and process the information received from the frontend and communicate with the relevant nodes. First, we have to preprocess the images before passing them to OpenCV. Since OpenCV's template matching algorithm usually locates a subimage within a larger image, it will not recognize the puzzle piece as a subimage of the whole puzzle unless we scale it down to size and crop out irrelevant background pixels. We then match the colors of the puzzle piece with the colors on the image of the completed puzzle. Specifics of how we manipulated the piece images for preprocessing and how we used OpenCV to find the puzzle piece on the puzzle image are described below.

OpenCV Technologies

We use a couple of different OpenCV methods and custom image manipulation to help us match our puzzle pieces.

Edge Detection

Warning: not integrated into our final solution due to subjectivity in what can be considered a "straight edge" on a puzzle piece

We use OpenCV's edge detection feature, which implements the Canny edge detection algorithm. We could use this when preprocessing the image of the puzzle piece. OpenCV's Canny function produces the edges that outline a subimage within an image (e.g., the border of an apple on a white background). This would allow us to pick out the image of the puzzle piece or the image of the completed puzzle and ignore any background in the photo taken by the user.

We could also use edge detection to determine whether a piece is on the edge of the puzzle. Depending on the difficulty a user has chosen, a hint on where to look for a piece may include whether or not the piece is on the edge of the puzzle. We determine this by checking whether any of the edges of the piece found through Canny edge detection are straight; if at least one is, it is an edge piece. In our implementation of the function, we set a threshold for how straight and how long an edge of a piece must be for the piece to be considered an edge piece. It can differentiate edge pieces for very clear test cases.

Edge Detection function
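The edge detection function itself is shown as a screenshot in the wiki. A minimal sketch of the idea, assuming Canny thresholds of 100/200 and a straightness check based on OpenCV's probabilistic Hough transform (the threshold and length parameters here are placeholders, not the exact values used), could look like this:

```python
import cv2
import numpy as np

def is_edge_piece(piece_bgr, min_length_ratio=0.5):
    """Rough check for whether a piece image contains a long straight edge."""
    gray = cv2.cvtColor(piece_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)  # binary edge map of the piece
    # The straight segment must span at least this fraction of the shorter side
    min_len = int(min(piece_bgr.shape[:2]) * min_length_ratio)
    # HoughLinesP finds straight line segments in the edge map
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=50,
                            minLineLength=min_len, maxLineGap=5)
    return lines is not None and len(lines) > 0
```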

Crop Image

Then, we crop the puzzle piece image to the piece's outer bounding box so that the surrounding background is removed, leaving the picture to be mostly the puzzle piece itself without much noise. This makes our further preprocessing below more efficient, and the crop itself is cheap since it relies only on Canny edge detection, which is fast.

Crop function
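The crop function is likewise only a screenshot; a sketch of an outer bounding-box crop driven by the Canny edge map (the thresholds are assumptions) might be:

```python
import cv2
import numpy as np

def crop_to_outer_bounding_box(piece_bgr):
    """Crop the piece image to the smallest box containing all detected edges."""
    gray = cv2.cvtColor(piece_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)
    ys, xs = np.nonzero(edges)            # coordinates of edge pixels
    if len(xs) == 0:
        return piece_bgr                  # nothing detected; leave image as-is
    top, bottom = ys.min(), ys.max()
    left, right = xs.min(), xs.max()
    return piece_bgr[top:bottom + 1, left:right + 1]
```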

Inner Bounding Box/Rectangle

The avg_background_color function takes in an image, traces the top and bottom of the image, and calculates the average red, green, and blue values (out of 255) for the background of the image. The is_background function takes in the average background color (a 2D array, 1x3) and a pixel from the piece image, and returns whether or not the image pixel is within a 10% root-mean-square deviation (to account for the fact that some differences may be negative) of the average background color. If it is, the function returns true; otherwise, it returns false.

With these two functions, we crop the puzzle piece image further to the largest rectangle inside the puzzle piece image that does not contain any background pixels. To do this, we use a greedy approach that starts at the outside of the image, after cropping it to the outer bounding box of the piece, and crops the edge of the image that has the most background pixels until the edges of the image contain no background pixels.

Average background color

Checking if a pixel is background
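The two helpers appear only as screenshots; a sketch of what they might look like, following the top/bottom-row averaging and the 10% RMS criterion described above, is:

```python
import numpy as np

def avg_background_color(image_bgr):
    """Average B, G, R of the top and bottom rows, assumed to be background."""
    border = np.vstack([image_bgr[0], image_bgr[-1]]).astype(np.float64)
    return border.mean(axis=0)          # mean B, G, R values out of 255

def is_background(background_bgr, pixel_bgr, tolerance=0.10):
    """True if the pixel is within a 10% RMS deviation of the background color."""
    diff = np.asarray(pixel_bgr, dtype=np.float64) - np.asarray(background_bgr, dtype=np.float64).ravel()
    rms = np.sqrt(np.mean(diff ** 2))   # root-mean-square over B, G, R
    return rms <= tolerance * 255.0
```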

Cropping an image to the inner bounding box
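The inner-bounding-box crop is also shown only as an image; a sketch of the greedy approach described above, reusing avg_background_color and is_background from the previous snippet, is:

```python
def crop_to_inner_bounding_box(piece_bgr):
    """Greedily trim whichever border row/column has the most background pixels
    until no border pixel matches the background color.
    Uses avg_background_color / is_background from the sketch above."""
    bg = avg_background_color(piece_bgr)
    top, bottom = 0, piece_bgr.shape[0] - 1
    left, right = 0, piece_bgr.shape[1] - 1

    def count_bg(pixels):
        return sum(is_background(bg, p) for p in pixels)

    while top < bottom and left < right:
        counts = {
            "top": count_bg(piece_bgr[top, left:right + 1]),
            "bottom": count_bg(piece_bgr[bottom, left:right + 1]),
            "left": count_bg(piece_bgr[top:bottom + 1, left]),
            "right": count_bg(piece_bgr[top:bottom + 1, right]),
        }
        if max(counts.values()) == 0:        # no background left on any border
            break
        worst = max(counts, key=counts.get)  # trim the side with the most background
        if worst == "top":
            top += 1
        elif worst == "bottom":
            bottom -= 1
        elif worst == "left":
            left += 1
        else:
            right -= 1
    return piece_bgr[top:bottom + 1, left:right + 1]
```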

Template Matching

Our main computer vision functionality comes from OpenCV template matching. The matchTemplate function slides a template image over another image and finds the patch of the larger image that matches it best. It does this by calculating the error between the pixels and finding the location on the numpy BGR image that has the least error. Several comparison methods are implemented in OpenCV; we run our template matching directly on the color (BGR) images, trying to match the colors on the piece with the colors on the puzzle image.

There were a few things that we accounted for when doing our template matching that OpenCV does not handle. Since the photographed puzzle piece is usually much larger than its footprint on the puzzle image from the box, we had to resize the piece image to match the image on the box. Automatic scaling would be imprecise, since even with the dimensions of the puzzle we couldn't be sure of the exact dimensions of the rectangular portion of the piece. As a result, we repeatedly resize the piece image, run the template matching function on each resized version, and keep the location with the highest confidence, i.e., the least error.

Another thing we accounted for is the non-rectangular shape of the puzzle piece. The base OpenCV matchTemplate function does not allow for us to ignore certain colors when matching. For example, we might want to ignore transparent pixels and remove the background behind a piece to match the image. It also does not allow us to match irregular shapes, since JPEGs must be rectangular. To combat this, we crop the piece image down to an inner bounding box so we only match with colors on the piece itself.

Template matching with rescaling and our preprocessed piece image
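The matching loop is shown as a screenshot; a sketch of multi-scale matching with cv2.matchTemplate (the scale range, step, and TM_CCOEFF_NORMED comparison method are placeholder choices) might look like:

```python
import cv2
import numpy as np

def find_best_match(puzzle_bgr, piece_bgr, scales=np.linspace(0.2, 1.0, 17)):
    """Try the piece template at several scales and keep the highest-confidence hit."""
    best = None
    for scale in scales:
        template = cv2.resize(piece_bgr, None, fx=scale, fy=scale)
        th, tw = template.shape[:2]
        if th >= puzzle_bgr.shape[0] or tw >= puzzle_bgr.shape[1]:
            continue                                    # template must fit inside the puzzle
        result = cv2.matchTemplate(puzzle_bgr, template, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(result)  # best score and its top-left corner
        if best is None or max_val > best[0]:
            center = (max_loc[0] + tw // 2, max_loc[1] + th // 2)
            best = (max_val, center)
    return best                                         # (confidence, (x, y) center) or None
```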

Draw Circle

With the location ascertained from template matching, we can draw a circle around the center of the best match location we found. We draw a bigger circle if the user chose a hard difficulty, giving them a larger location to look in, and a smaller one if they chose easy. These will be 160 and 80 pixels in diameter, respectively.

This circle-drawing method also comes from the OpenCV library; it takes in the center coordinates, the radius, the color, and the thickness of the circle in pixels, and changes the appropriate pixels to the chosen color.

Drawing a yellow circle on the image at the best fit location
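A sketch of the drawing step, using cv2.circle with the radii implied by the diameters above (yellow in BGR order is (0, 255, 255); the 3-pixel thickness is a placeholder):

```python
import cv2

def draw_hint_circle(puzzle_bgr, center, difficulty="easy"):
    """Circle the match location: 80 px diameter for easy, 160 px for hard."""
    radius = 80 if difficulty == "hard" else 40
    # Yellow outline in BGR order, 3 pixels thick, drawn on a copy of the puzzle image
    return cv2.circle(puzzle_bgr.copy(), center, radius, (0, 255, 255), 3)
```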

Image Quality Check

Finally, we use OpenCV to check whether an image is blurry (using the Laplacian) or too homogeneous (i.e., a puzzle piece cannot be distinguished from the background of the image). These checks catch cases where the user takes an insufficient picture of their puzzle piece that cannot be processed. The functions are called from our views.py file to return HTTP responses that signal the front-end to trigger popups asking the user to retake their photo.

Blurry and homogeneous function
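The checks are shown only as a screenshot; a sketch of a Laplacian-variance blur test and a simple standard-deviation homogeneity test (both thresholds are placeholders) is:

```python
import cv2
import numpy as np

def is_blurry(image_bgr, threshold=100.0):
    """Low variance of the Laplacian indicates a lack of sharp detail."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var() < threshold

def is_homogeneous(image_bgr, threshold=10.0):
    """If pixel intensities barely vary, no piece can be distinguished from the background."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    return float(np.std(gray)) < threshold
```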

Database

The database will store the puzzles that the user has saved on their phone/account. Each puzzle will also contain entries for the puzzle pieces that the user takes pictures of along with the chosen difficulty setting.

Table Creation
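The table-creation screenshot is not reproduced here. A rough Django-style sketch of the schema described above (the model and field names are hypothetical, inferred only from the mention of views.py elsewhere on this page) might look like:

```python
from django.db import models

class Puzzle(models.Model):
    # UserID derived from the device, used to look up a user's saved puzzles
    user_id = models.CharField(max_length=64)
    puzzle_image = models.ImageField(upload_to="puzzles/")
    created_at = models.DateTimeField(auto_now_add=True)

class PieceQuery(models.Model):
    # Each piece photo a user submits is stored under its parent puzzle
    puzzle = models.ForeignKey(Puzzle, on_delete=models.CASCADE, related_name="pieces")
    piece_image = models.ImageField(upload_to="pieces/")
    difficulty = models.CharField(max_length=10, choices=[("easy", "Easy"), ("hard", "Hard")])
```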