Skip to content

Latest commit

 

History

History
297 lines (232 loc) · 8.65 KB

File metadata and controls

297 lines (232 loc) · 8.65 KB

Convolutions and Filtering

Table of Contents

What are convolutions?

A convolution is a fundamental mathematical operation commonly used in various image processing tasks.
Convolutions provide a way of 'multiplying' two arrays numbers together, typically of different sizes. While convolutions are also used to process 1D signals such as sound data, we will focus on the application of convolutions to 2D signals, i.e. images.
They form the basis for implementing image processing operations where the output pixel values are some linear combination of the corresponding input pixel value and those that happen to surround it.

Mathematically, if $f$ and $g$ are discrete functions, then $f \ast g$ is the product obtained by convolving them together, and is defined as follows: $$(f \ast g)(x) = \sum_{u = -\infty}^\infty{f(u)g(x - u)}$$
and if $f$ and $g$ are continuous functions, then $f \ast g$ is the product obtained by convolving them together, and is defined as follows: $$(f \ast g)(x) = \int_{-\infty}^\infty{f(u)g(x - u)} \cdot \mathrm{d}u$$
In this notation, $f$ is known as the kernel, while $g$ is the actual input. (Convolution is commutative, so the names can be swapped too)

Note: In our use case, where we convolve an image having a finite number of discrete rows and columns, we just need to use the discrete form of the convolution operation.

The convolution operation is better shown using an animation:

Convolution animation

What are kernels?

Kernels, also known as filters, are arrays of numbers (can be n dimensional i.e. 1D, 2D, 3D ...), which we often refer to as matrices, with which we convolve input images to bring out certain desired effects. Some such effects are blurring, sharpening, outlining, embossing, edge detection, et cetera.

So, here $$f(x) = \text{InputImage}, \ g(x) = \text{Kernel}$$

Now, we can directly apply the kernels on the input image using convolution.

Different kernels and their effects

Here are some examples of specific kernels and their effects on some example input images:

Simple box blur

Example:

Simple box blur example

Matrix:

$$ \begin{bmatrix} 1 / 9 & 1 / 9 & 1 / 9 \\\ 1 / 9 & 1 / 9 & 1 / 9 \\\ 1 / 9 & 1 / 9 & 1 / 9 \end{bmatrix} $$

Blur operations like this have the effect of averaging nearby pixels.
This simple blur corresponds to a simple average, whereas others (like the Gaussian blur we will see later) are a weighted average.
A common theme with blurs is that their entries sum to 1, such that the total luminance of the image is preserved.

Edge detection operator

Example:

Edge detection example

Matrix:

$$ \begin{bmatrix} -1 & -1 & -1 \\\ -1 & +8 & -1 \\\ -1 & -1 & -1 \end{bmatrix} $$

This also works as a high pass filter.

Sobel edge operators

Example:

Sobel example

Matrices (horizontal and vertical):

$$ \begin{bmatrix} -1 & -2 & -1 \\\ 0 & 0 & 0 \\\ +1 & +2 & +1 \end{bmatrix} \hskip{10px} \begin{bmatrix} -1 & 0 & +1 \\\ -2 & 0 & +2 \\\ -1 & 0 & +1 \end{bmatrix} $$

Sharpening operator

Example:

Sharpening example

Matrix:

$$ \begin{bmatrix} 0 & -0.5 & 0 \\\ -0.5 & +3 & -0.5 \\\ 0 & -0.5 & 0 \end{bmatrix} $$

Separable convolutions

To compute the value of one output pixel after a 2D convolution using a $m \times n$ kernel, we need to do $m \cdot n$ multiplications. This can make 2D convolution a very expensive operation! Thankfully, in certain cases, the computation can be reduced to $m + n$ multiplications. Kernels for which this is possible are called 'separable'. Here is how it works: The $m \times n$ matrix is decomposed by writing it as a product of a $m \times 1$ matrix and a $1 \times n$ matrix. Convolving the input with each of these takes $m$ and $n$ multiplications per pixel respectively, reducing the total to $m + n$.

$$ \begin{bmatrix} A \cdot a & A \cdot b & A \cdot c \\\ B \cdot a & B \cdot b & B \cdot c \\\ C \cdot a & C \cdot b & C \cdot c \end{bmatrix} = \begin{bmatrix}A \\ B \\ C\end{bmatrix} \begin{bmatrix}a & b & c\end{bmatrix} $$

The Gaussian kernel is a well-known example of a separable kernel:

$$ \begin{bmatrix} 1 / 16 & 2 / 16 & 1 / 16 \\\ 2 / 16 & 4 / 16 & 2 / 16 \\\ 1 / 16 & 2 / 16 & 1 / 16 \end{bmatrix} = \begin{bmatrix}1 / 4 \\ 1 / 2 \\ 1 / 4\end{bmatrix} \begin{bmatrix}1 / 4 & 1 / 2 & 1 / 4\end{bmatrix} $$

Separable convolution

It is worth noting that the convolution operation is associative, so either of the simpler convolutions can be applied first.

$$ \text{input} \ast \begin{bmatrix} A \cdot a & A \cdot b & A \cdot c \\\ B \cdot a & B \cdot b & B \cdot c \\\ C \cdot a & C \cdot b & C \cdot c \end{bmatrix} = \text{input} \ast \Bigg(\begin{bmatrix}A \\ B \\ C\end{bmatrix} \begin{bmatrix}a & b & c\end{bmatrix}\Bigg) = \Bigg(\text{input} \ast \begin{bmatrix}A \\ B \\ C\end{bmatrix}\Bigg) \ast \begin{bmatrix}a & b & c\end{bmatrix} $$

A suboptimal implementation

We have spun up a suboptimal implementation of the convolution operation for demonstration purposes.
To build and run it, execute the following commands from the root of this repository.

cd 4_cv_basics/4_convolutions_filtering      # Enter the project directory
make SRC=main.cpp link=src/convolution.cpp   # Use make to build the project
./Convolution_Filtering                      # Execute the built project

The demo code is purposefully kept in an unoptimized state, your task is to understand it and then improve it based on your understanding.
Feel free to make any changes you deem useful.

Some points to be considered while thinking about optimizations are:

  • Will it work for kernels having different sizes?
  • Can it handle images having different number of channels?
  • Can you improve space complexity of above implementation?

Above points are given only for the purpose of giving a rough idea about possible optimizations and are not necessarily sufficient.

Resources you can visit

Running Instruction

Without Benchmarks

  1. Move into the cv_basics directory from the root pixels directory by running following command :
    cd 4_cv_basics 
  1. Move into the Convolutions directory from the cv_basics directory by running following command :
    cd 4_convolutions_filtering 
  1. For building the file:-
  • If you want to build naive convolution then run
    make SRC=naive.cpp link=convolution.cpp
  • If you want to build convolutionUsingOpenCV then run
    make SRC=convolutionUsingOpenCV.cpp link=convolution.cpp
  • If you want to build SeparableConvolution then run
    make SRC=separableConvolutions.cpp link=convolution.cpp
  1. Finally execute the binary file created by running the following command :
    ./convolution_filtering

With Benchmarks

  1. Move into the cv_basics directory from the root pixels directory by running following command :
    cd 4_cv_basics 
  1. Move into the Convolutions directory directory from the cv_basics directory by running following command :
    cd 4_convolutions_filtering 
  1. For building the file:-
  • If you want to build naive convolution then run
    make SRC=./benchmarks/naive.cpp link=./src/convolution.cpp
  • If you want to build convolutionUsingOpenCV then run
    make SRC=./benchmarks/convolutionUsingOpenCV.cpp link=./src/convolution.cpp
  • If you want to build SeparableConvolution then run
    make SRC=./benchmarks/separableConvolutions.cpp link=./src/convolution.cpp
  1. Finally execute the binary file created by running the following command :
    ./convolution_filtering

Results

Naive Convolution:-

Naive

Convolutions Using OpenCV:-

ConvolutionUsingOpenCV

Separable Convolutions Using OpenCV:-

SeparableConvolutions