
San279/train-object-detect-FOMO-esp32


Guide to training FOMO object detection model in Edge Impulse

Thai version (ภาษาไทย)

FOMO is an object detection model designed for constrained devices. Thanks to its small footprint and low memory requirements, it is well suited to the AIOT box or the Esp32-S3. This repository provides simple tips for building a FOMO model in Edge Impulse, covering data collection, training, and deployment.

What you'll need

  • AIOT board, Esp32-S3, or any other Esp32-series board.
  • OV2640 camera module.
  • Webcam (optional).
  • Edge Impulse account (free).

Before we begin

Register a free account in Edge Impulse, and create a new project.

alt text

Data collection

1. Collecting data from an Esp32 can be tedious. Luckily, you can download and run the script I've created, camera-webserver-for-esp32S3, or use the webcam interface in Edge Impulse.

  • This network gives its best results with at least 70 images per class, plus about 10% background (other) images. To put this in perspective, training a model to count two fingers requires 70 images of one finger, another 70 images of two fingers, and at least 20-30 images of other finger counts or look-alike objects.
  • Images should have equal width and height; otherwise the sides will be cropped off when uploading to Edge Impulse. Here is a snapshot of the web server used for data collection. Each image is 96 x 96 pixels.


    alt text


    2. On the left tab, go to Data acquisition and upload the images to Edge Impulse.

    alt text

    alt text


    3. Click Yes when asked whether this is an object detection project.

    alt text
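Before uploading, it can help to check the dataset against the guideline from step 1. The sketch below is only an illustration: the label names and the `check_dataset` helper are hypothetical, and it assumes that "10% background" means a share of the whole dataset.

```python
from collections import Counter

MIN_IMAGES_PER_CLASS = 70    # figure suggested in this guide
MIN_BACKGROUND_RATIO = 0.10  # assumption: share of the whole dataset


def check_dataset(labels, background_label="other"):
    """Return a list of warnings for a list of per-image labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    warnings = []
    # Every real class should reach the minimum image count.
    for name, n in sorted(counts.items()):
        if name != background_label and n < MIN_IMAGES_PER_CLASS:
            warnings.append(
                f"class '{name}' has {n} images, fewer than {MIN_IMAGES_PER_CLASS}"
            )
    # The background class should make up a minimum share of the dataset.
    background = counts.get(background_label, 0)
    if total and background / total < MIN_BACKGROUND_RATIO:
        warnings.append(
            f"background is {background / total:.0%} of the dataset, "
            f"below {MIN_BACKGROUND_RATIO:.0%}"
        )
    return warnings


# Hypothetical finger-counting dataset: 70 + 70 images plus 25 background images.
labels = ["one"] * 70 + ["two"] * 70 + ["other"] * 25
print(check_dataset(labels))  # prints []
```

An empty list means the dataset meets both guideline numbers; any warning names the class that needs more images.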



Training

1. At the top of the page, navigate to Labeling queue and add a label to each image. Keep in mind that images with unequal width and height will be cropped during this process, which is why I've used square images.

Image with unequal dimensions (320 x 240). Notice the dark shading on each side of the image, which marks the parts that will be cropped off.

alt text

Image with equal dimensions (96 x 96).

alt text
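To see how much of a non-square image is lost, the crop can be worked out by hand. This is a minimal sketch that assumes a centered square crop, which matches the shaded sides in the screenshot above but is an assumption about Edge Impulse's behavior; `center_square_crop_box` is a hypothetical helper.

```python
def center_square_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


print(center_square_crop_box(320, 240))  # prints (40, 0, 280, 240)
print(center_square_crop_box(96, 96))    # prints (0, 0, 96, 96)
```

For a 320 x 240 frame, 40 pixels are removed from each side, so a quarter of the image width never reaches the model. A 96 x 96 frame is kept whole.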


2. After labeling all images, navigate to Impulse design on the left and click Create impulse. This takes you to a page where you can choose the model's input size and the resize mode.

- Edge Impulse recommends an input size that is a multiple of 8. The larger the input size, the slower inference becomes. However, a larger input makes it easier to detect multiple objects when several are present in the frame.

alt text

Click Add a processing block and select the only option.

alt text

Click Add a learning block and select the first option, then save the impulse.

alt text
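The multiple-of-8 recommendation is easier to picture with a small calculation. The sketch below assumes FOMO's default behavior of producing an output grid at 1/8 of the input resolution; the `output_grid` helper is hypothetical and only illustrates the arithmetic.

```python
FOMO_DOWNSAMPLE = 8  # assumption: default FOMO output is 1/8 of the input size


def output_grid(input_size):
    """Return (rows, cols) of the detection grid for a square input."""
    if input_size % FOMO_DOWNSAMPLE != 0:
        raise ValueError(f"{input_size} is not a multiple of {FOMO_DOWNSAMPLE}")
    cells = input_size // FOMO_DOWNSAMPLE
    return cells, cells


for size in (96, 160):
    rows, cols = output_grid(size)
    print(size, rows, cols, rows * cols)
# prints:
# 96 12 12 144
# 160 20 20 400
```

Under this assumption a 96 x 96 input gives 144 grid cells and a 160 x 160 input gives 400, which is one way to read the trade-off above: more cells leave more room for separate objects, at the cost of more computation per frame.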


4. After saving the impulse, you will be directed to a new section. Here you can choose whether the images are trained as grayscale or RGB features. I've left it as RGB for this project. Click Save parameters to proceed.

alt text

After selecting the features, the page directs you to the Generate features tab. Click Generate features, and a graph will appear on the right side of the page.
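The choice between grayscale and RGB changes how many raw values each image contributes. As a rough sketch (the `raw_feature_count` helper is hypothetical and assumes one value per pixel per channel):

```python
def raw_feature_count(width, height, color_depth):
    """Number of raw pixel values fed to the model for one image."""
    channels = {"grayscale": 1, "rgb": 3}[color_depth.lower()]
    return width * height * channels


print(raw_feature_count(96, 96, "RGB"))        # prints 27648
print(raw_feature_count(96, 96, "Grayscale"))  # prints 9216
```

RGB carries three times as many values per image as grayscale, so grayscale is worth considering when color does not help tell the classes apart.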

5. This graph, the feature explorer, places similar images close together; Edge Impulse builds it by projecting each image's features down to a 2D plot. Notice that the red dots represent finger no. 1 and the pink dots represent finger no. 2. If two classes sit too close to each other, like the ones I've circled, the object detection model will have trouble distinguishing them, which greatly reduces accuracy. Images that overlap therefore have to be deleted.

alt text

  • After deleting the overlapping images and adding more, the two classes should be separated like this.

    alt text
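The idea of "overlapping" images can be made concrete with a nearest-neighbor check. This is only an illustrative sketch, not how Edge Impulse computes anything: the coordinates are made up, and `find_overlapping` is a hypothetical helper that flags any point whose closest neighbor belongs to a different class.

```python
import math


def find_overlapping(points):
    """points: list of (x, y, label). Return indices of points whose
    nearest neighbor carries a different label."""
    flagged = []
    for i, (xi, yi, label) in enumerate(points):
        # Find the closest other point by Euclidean distance.
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist((xi, yi), points[j][:2]),
        )
        if points[nearest][2] != label:
            flagged.append(i)
    return flagged


# Hypothetical 2D positions read off a feature plot.
points = [
    (0.0, 0.0, "one"), (0.1, 0.0, "one"), (0.0, 0.1, "one"),
    (5.0, 5.0, "two"), (5.1, 5.0, "two"),
    (0.5, 0.5, "two"),  # a "two" image sitting next to the "one" cluster
]
print(find_overlapping(points))  # prints [5]
```

Flagged images are candidates for review: they may be mislabeled, or simply too similar to the other class to be useful for training.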


    6. On the left panel, select Object detection. These are the settings that can be customized.
    • Training cycles sets the number of epochs the model goes through. I've found little benefit in setting it above 80. I will be using 25 cycles for this project.
    • Data augmentation effectively multiplies the size of your dataset by training on randomly altered copies of your images. Leave this on as default.
    • Learning rate determines how quickly the model learns the features. This is best left as it is.
    • Validation set size is also best left at the default.
    • Batch size determines how many samples are propagated through the network at a time. For example, if it's set to 8, the model trains on images 1-8, then on images 9-16 in the next step, and so forth. Batch size should be a power of 2, e.g. 4, 8, 16, 32, 64, etc. I've found that on small datasets 8 and 16 yield the best results. A batch size of 8 will be used for this project.

      alt text

    • Choose the model. As of now, only two FOMO models are available. I will be using FOMO 0.35 for this project.

      alt text

    • Start training the model. This process might take up to 20 minutes.

      alt text

      Tips to improve the model's accuracy
    • Check whether any classes have overlapping features; if so, go back to step 5.
    • Increase the size of the dataset.
    • Decrease the batch size.
    • Keep the number of epochs at or below 80 for smaller datasets.
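The batch size and training cycle settings can be turned into a quick estimate of how many update steps training performs. The sketch below is a simplification under stated assumptions: the dataset size of 165 images is hypothetical, the validation split is ignored, and real training typically shuffles the images each epoch instead of walking through them in order.

```python
import math


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... using a bit trick."""
    return n > 0 and n & (n - 1) == 0


def batches(num_images, batch_size):
    """Yield (first, last) 1-based image numbers for each batch in one epoch."""
    for start in range(0, num_images, batch_size):
        yield start + 1, min(start + batch_size, num_images)


num_images, batch_size, epochs = 165, 8, 25  # 165 is a made-up dataset size
steps_per_epoch = math.ceil(num_images / batch_size)

print(is_power_of_two(batch_size))                # prints True
print(list(batches(num_images, batch_size))[:3])  # prints [(1, 8), (9, 16), (17, 24)]
print(steps_per_epoch)                            # prints 21
print(steps_per_epoch * epochs)                   # prints 525
```

Halving the batch size doubles the number of update steps per epoch, which is one reason a smaller batch can help on a small dataset.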



Deployment

1. On the left tab, navigate to Deployment and change the deployment option to Arduino library.

alt text


2. Change the target option to Esp32.

alt text


3. Click Build to start downloading the library, and you're done.

I've created two libraries for testing the model in real time. Please visit FOMO-object-detect-stream-Esp32 for a web server platform, or FOMO-object-detect-TFT for display on TFT screens.

Credit

Thanks to WIRELESS SOLUTION ASIA CO.,LTD for providing the AIOT board to support this project. Thanks also to Bodmer / TFT_eSPI for the TFT libraries. The scripts used for Esp32 FOMO object detection inferencing were provided by Edge Impulse.

About

guide to training object detection for Arduino Esp32
