matrn/C-digit-recognition

C-digit-recognition

Handwritten digit recognition written in C, using a neural network trained on the MNIST database.

Installation

Linux

Binary package

An .AppImage file is available in the Releases section.

Compilation from source

  • sudo apt install libopenblas-dev - installs the OpenBLAS library
  • sudo apt install libgtk-3-dev - installs the GTK 3 development files
  • if you don't have a trained data file at ./lib/ceural/data.ceural, copy the sample one with cp ./data/data.ceural ./lib/ceural/data.ceural
  • compilation:
    • run cd src && make clean && make main && ./main for a normal build
    • run make clean && make release to generate the AppImage binary (install linuxdeploy and the other dependencies with make install_tools first)

GUI

Control

  • left mouse button & drag to draw
  • right mouse button to clear the draw space
  • middle mouse button or Recognise button in the GUI to run recognition process

Processing of the drawn image

Preprocessing used in the MNIST database: The original black and white (bilevel) images from NIST were size normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were centered in a 28x28 image by computing the center of mass of the pixels, and translating the image so as to position this point at the center of the 28x28 field.
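
The centering step from the quote above can be sketched in C as follows. This is a minimal illustration, not the code from this repository: the names center_of_mass and center_by_mass, the flat row-major 28x28 buffer, and the convention that 0 is background and 255 is full ink are all assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IMG_SIZE 28 /* the MNIST field is 28x28 */

/* Round to the nearest integer without needing libm. */
static int round_to_int(double v)
{
    return (int)(v >= 0.0 ? v + 0.5 : v - 0.5);
}

/* Intensity-weighted center of mass of a row-major IMG_SIZE x IMG_SIZE image.
 * Assumption: 0 = background, 255 = full ink.
 * Returns 0 if the image contains no ink, 1 otherwise. */
int center_of_mass(const uint8_t *img, double *cx, double *cy)
{
    double sum = 0.0, sum_x = 0.0, sum_y = 0.0;
    for (int y = 0; y < IMG_SIZE; y++) {
        for (int x = 0; x < IMG_SIZE; x++) {
            double v = img[y * IMG_SIZE + x];
            sum += v;
            sum_x += x * v;
            sum_y += y * v;
        }
    }
    if (sum == 0.0)
        return 0;
    *cx = sum_x / sum;
    *cy = sum_y / sum;
    return 1;
}

/* Copy src into dst, translated so that the center of mass lands on the
 * center of the field. Pixels shifted outside the field are dropped. */
void center_by_mass(const uint8_t *src, uint8_t *dst)
{
    double cx, cy;
    assert(src != dst); /* in-place shifting would overwrite unread pixels */
    memset(dst, 0, IMG_SIZE * IMG_SIZE);
    if (!center_of_mass(src, &cx, &cy))
        return; /* empty image: nothing to move */

    /* Center of the field in pixel coordinates is (IMG_SIZE - 1) / 2 = 13.5 */
    int dx = round_to_int((IMG_SIZE - 1) / 2.0 - cx);
    int dy = round_to_int((IMG_SIZE - 1) / 2.0 - cy);

    for (int y = 0; y < IMG_SIZE; y++) {
        for (int x = 0; x < IMG_SIZE; x++) {
            int nx = x + dx, ny = y + dy;
            if (nx >= 0 && nx < IMG_SIZE && ny >= 0 && ny < IMG_SIZE)
                dst[ny * IMG_SIZE + nx] = src[y * IMG_SIZE + x];
        }
    }
}
```

The shift is rounded to whole pixels, so the result is centered to within half a pixel. A sub-pixel shift would need interpolation, which this sketch leaves out.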

  1. crop calculation - the draw space is cropped from each side until a non-white pixel is reached. The larger of the cropped width and height is then taken, and crop_x, crop_y, crop_w and crop_h are recalculated so that the crop is square and the aspect ratio of the digit is preserved.
  2. sub-image generation - a sub-image is created from the drawn image stored in the pixbuf, using the crop values from the previous step
  3. scaling - the cropped image is scaled to 20x20
  4. conversion to grayscale - the pixbuf is converted into a uint8_t grayscale image
  5. adding a frame - a 4-pixel frame is added on every side of the 20x20 image, resulting in a 28x28 image
  6. computation of the center of mass of the pixels - done using the mean values across X and Y
  7. moving the submatrix - the submatrix (the drawn digit) is moved within the framed image according to the computed center of mass
  8. neural network forward propagation - the preprocessed image is fed to the neural network
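
Steps 1 and 5 can be sketched in C as follows. This is an illustration under stated assumptions rather than the repository's implementation: the crop_t type and the functions square_crop and add_frame are hypothetical, the image is assumed to be already converted to a flat row-major uint8_t grayscale buffer, white (255) is assumed to be the draw-space background, and the frame is filled with 0 on the assumption that the 20x20 image has already been inverted to the MNIST convention.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WHITE 255                 /* assumed background of the draw space */
#define INNER 20                  /* size of the scaled digit */
#define FRAME 4                   /* frame width on every side */
#define OUTER (INNER + 2 * FRAME) /* 28 */

/* Hypothetical helper type holding the crop rectangle. */
typedef struct { int x, y, w, h; } crop_t;

/* Find the bounding box of all non-white pixels, then grow the shorter side
 * so the crop is square. Returns 0 if the image is blank, 1 otherwise.
 * Note: crop->x or crop->y may be negative, and the square may extend past
 * the right or bottom edge; the caller has to pad with background there. */
int square_crop(const uint8_t *gray, int width, int height, crop_t *crop)
{
    int min_x = width, min_y = height, max_x = -1, max_y = -1;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            if (gray[y * width + x] != WHITE) {
                if (x < min_x) min_x = x;
                if (x > max_x) max_x = x;
                if (y < min_y) min_y = y;
                if (y > max_y) max_y = y;
            }
        }
    }
    if (max_x < 0)
        return 0; /* nothing drawn */

    int w = max_x - min_x + 1;
    int h = max_y - min_y + 1;
    int side = w > h ? w : h;

    /* Center the original bounding box inside the square. */
    crop->x = min_x - (side - w) / 2;
    crop->y = min_y - (side - h) / 2;
    crop->w = side;
    crop->h = side;
    return 1;
}

/* Place an INNER x INNER image in the middle of an OUTER x OUTER image,
 * leaving a FRAME-pixel border of zeros on every side. */
void add_frame(const uint8_t *inner, uint8_t *outer)
{
    assert(inner != outer);
    memset(outer, 0, OUTER * OUTER);
    for (int y = 0; y < INNER; y++)
        memcpy(&outer[(y + FRAME) * OUTER + FRAME], &inner[y * INNER], INNER);
}
```

Growing the shorter side instead of stretching the digit is what keeps the aspect ratio intact when the crop is later scaled to 20x20.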

Libraries

The libraries depend on each other in this order: GUI -> ceural -> lag. For documentation, see the source code or use an IDE (for example, VS Code).

Lag

The library supports many operations, but more development is needed: it currently uses OpenBLAS only for matrix multiplication and matrix transposition.

Naming

  • mat - stands for matrix
  • ew - stands for element wise

Notes

  • The matrix part of the library automatically checks whether the destination and a source are the same matrix where they must not be, and reports it using assert().
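
The kind of check described in the note above can be sketched as follows. The matrix_t type and both function names are hypothetical and do not reflect lag's actual API. The sketch only shows why a multiplication has to reject an aliased destination while an element-wise operation does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical row-major matrix type, for illustration only. */
typedef struct {
    size_t rows, cols;
    float *data;
} matrix_t;

/* dst = a * b using a naive triple loop.
 * dst must not share storage with a or b: each dst element is written
 * while other elements of a and b still have to be read. */
void mat_multiply_sketch(matrix_t *dst, const matrix_t *a, const matrix_t *b)
{
    assert(dst->data != a->data && dst->data != b->data);
    assert(a->cols == b->rows);
    assert(dst->rows == a->rows && dst->cols == b->cols);

    for (size_t i = 0; i < a->rows; i++) {
        for (size_t j = 0; j < b->cols; j++) {
            float sum = 0.0f;
            for (size_t k = 0; k < a->cols; k++)
                sum += a->data[i * a->cols + k] * b->data[k * b->cols + j];
            dst->data[i * dst->cols + j] = sum;
        }
    }
}

/* dst = a + b, element wise. Each output depends only on the inputs at the
 * same index, so dst may safely be the same matrix as a or b. */
void mat_ew_add_sketch(matrix_t *dst, const matrix_t *a, const matrix_t *b)
{
    assert(a->rows == b->rows && a->cols == b->cols);
    assert(dst->rows == a->rows && dst->cols == a->cols);

    for (size_t i = 0; i < a->rows * a->cols; i++)
        dst->data[i] = a->data[i] + b->data[i];
}
```

Because assert() is compiled out when NDEBUG is defined, checks like these cost nothing in a release build but also stop catching mistakes there.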

Ceural

The Ceural library was created for multi-layer networks trained on the MNIST dataset, but with small modifications it can be used for other datasets too. See Accuracy for more info.
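
For readers unfamiliar with forward propagation, one fully connected layer and the final digit selection can be sketched as follows. This is not Ceural's API: dense_forward and argmax are hypothetical names, the weight layout is an assumption, and ReLU stands in for whatever activation function the library actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* One fully connected layer:
 *   out[j] = bias[j] + sum over i of in[i] * weights[i * n_out + j]
 * Assumed layout: weights is row-major with n_in rows and n_out columns.
 * If relu is non-zero, negative results are clamped to 0. */
void dense_forward(const float *in, size_t n_in,
                   const float *weights, const float *bias,
                   float *out, size_t n_out, int relu)
{
    assert(in != out);
    for (size_t j = 0; j < n_out; j++) {
        float sum = bias[j];
        for (size_t i = 0; i < n_in; i++)
            sum += in[i] * weights[i * n_out + j];
        out[j] = (relu && sum < 0.0f) ? 0.0f : sum;
    }
}

/* Index of the largest value; for a 10-output network this is the digit. */
size_t argmax(const float *v, size_t n)
{
    assert(n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (v[i] > v[best])
            best = i;
    return best;
}
```

For MNIST the input would be the 784 pixel values of the 28x28 image and the last layer would have 10 outputs. Chaining two or more dense_forward calls and taking argmax of the last output gives the recognized digit.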

Accuracy

After 10 epochs of training with batch size 32, the test set accuracy is 97.47 %, which is not bad compared with the test error rates listed for 2-layer NNs on the MNIST database website. Sadly, accuracy in practice is not as good as on the test data set 🥺.

Accuracy is calculated using the formula accuracy = (TP+TN)/(TP+TN+FP+FN), which reduces to accuracy = correct/total.
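
As a small illustration of the correct/total form, the function below counts matching labels. The name accuracy and the label arrays are hypothetical, not taken from the library.

```c
#include <assert.h>
#include <stddef.h>

/* Fraction of predictions that match the expected labels. */
double accuracy(const unsigned char *predicted,
                const unsigned char *expected, size_t total)
{
    assert(total > 0); /* avoid division by zero */
    size_t correct = 0;
    for (size_t i = 0; i < total; i++)
        if (predicted[i] == expected[i])
            correct++;
    return (double)correct / (double)total;
}
```

With predictions {7, 2, 1, 0} and expected labels {7, 2, 1, 4}, three of four match, so the function returns 0.75.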

Performance

Even though Python is much slower than C, Python-digit-recognition is faster. The reason is that the Python version uses the NumPy library, which is heavily optimized.

ToDo

  • Add lag tests
  • Add ceural tests
  • Use a BLAS library (for example OpenBLAS) for linear algebra in more functions to improve speed
  • Add icons to the GUI
  • Add command line options to train & test & save & load NN
  • Create lag & ceural docs
  • Choose license
  • Create Windows compilation script & test it on Windows
  • Center digit by center of mass of the pixels before feeding it to the neural network from GUI input
  • Look into possible accuracy improvements

Resources