- Brno University of Technology
- Faculty of Information Technology
- Academic year: 2018/2019
- Bachelor thesis: Monitoring Pedestrian by Drone
- Author: Vladimir Dusek
This thesis focuses on monitoring people in video footage captured by a drone. People are detected by a trained RetinaNet model. A feature vector is extracted for each detected person using color histograms. People are identified by comparing their feature vectors while taking into account their distance in the frame. Finally, the trajectories of all people are visualized in a panorama image. The average precision of the trained RetinaNet detector on hard validation data is 58.6%. The design of the trajectory visualization algorithm partially compensates for the error rate: a person does not have to be detected in every frame for their trajectory to be visualized correctly. At the same time, static objects that are detected as people but do not move are not considered people and are not visualized at all. There are many algorithms for people detection, yet only a few of them focus on detecting people in aerial footage.
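The identification step described above can be illustrated with a minimal sketch. It uses only the Python standard library to stay self-contained and is not the application's actual code. The function names (`color_histogram`, `histogram_intersection`, `match_person`), the histogram intersection metric, and the default values of `max_distance` and `min_similarity` are all assumptions chosen for illustration.

```python
import math


def color_histogram(pixels, bins=8):
    """Build a normalized per-channel histogram from (r, g, b) pixels."""
    hist = [0.0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Map an 8-bit value (0..255) to one of `bins` buckets.
            hist[channel * bins + value * bins // 256] += 1.0
    total = float(len(pixels)) or 1.0  # avoid division by zero
    return [count / total for count in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical normalized histograms."""
    # Each of the 3 channels sums to 1, so divide by 3 to normalize.
    return sum(min(a, b) for a, b in zip(h1, h2)) / 3.0


def match_person(detection, tracks, max_distance=50.0, min_similarity=0.7):
    """Return the index of the best matching track, or None.

    The thresholds are hypothetical values, not the application's defaults.
    """
    best_index, best_similarity = None, min_similarity
    for index, track in enumerate(tracks):
        dx = detection["center"][0] - track["center"][0]
        dy = detection["center"][1] - track["center"][1]
        if math.hypot(dx, dy) > max_distance:
            continue  # too far away in the frame to be the same person
        similarity = histogram_intersection(detection["hist"], track["hist"])
        if similarity >= best_similarity:
            best_index, best_similarity = index, similarity
    return best_index
```

A detection is assigned to a track only if it is both close enough in the frame and similar enough in color, which keeps two similarly dressed people in different parts of the frame apart.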
Assignment
— Description of the thesis assignment in Czech.
Text
— The text of the thesis in Czech.
Presentation
— The thesis defense presentation in Czech.
Application
— Application People Detector.
Examples
— Video and images for the experiments with the application.
- Fizyr Keras RetinaNet was used as the detector.
- A new model was trained on the Stanford Drone Dataset.
- The GUI was implemented in PyQt4.
- Processing one image on a CPU can take about 5 seconds, so if you plan to process video, I strongly recommend GPU computing. See TensorFlow GPU.
You can control the application through the GUI: select the input image or video, specify the output type, and run recognition.
- Clone this repository.
- Download my pretrained models from MEGA Drive. You can use your own model as well.
- Specify the path to the model on this line.
- Optionally, you can change the following parameters: DETECTION_TRESHOLD, EUCLIDEAN_DISTANCE_TRESHOLD, HIST_SIMILARITY_TRESHOLD, FIRST_AND_LAST_POINT_TRESHOLD and P.
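As an illustration of how a parameter such as FIRST_AND_LAST_POINT_TRESHOLD might act, the sketch below drops trajectories whose first and last points are too close together, which is one way to discard static objects detected as people. The exact meaning of the parameter is an assumption inferred from its name; `is_moving`, `filter_static`, and the default `min_displacement` are hypothetical and not taken from the application.

```python
import math


def is_moving(trajectory, min_displacement=20.0):
    """Return True if the first and last points are far enough apart.

    `trajectory` is a list of (x, y) points; `min_displacement` is a
    hypothetical threshold in pixels.
    """
    if len(trajectory) < 2:
        return False  # a single point cannot show any movement
    (x0, y0), (x1, y1) = trajectory[0], trajectory[-1]
    return math.hypot(x1 - x0, y1 - y0) >= min_displacement


def filter_static(trajectories, min_displacement=20.0):
    """Keep only the trajectories that show real movement."""
    return [t for t in trajectories if is_moving(t, min_displacement)]
```

Raising the threshold removes more false detections such as poles or benches, at the cost of also hiding people who barely move during the recording.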
- You need Python 3.7+ and its package manager, pip.
- Install all Python requirements with the following command.
$ pip install --user -r app/requirements.txt
- Since PyQt isn't just a Python package, it has to be installed with your system package manager.
# RPM
$ sudo dnf install python3-PyQt4
# Debian
$ sudo apt install python3-pyqt4
- Run the People Detector.
$ cd app/src/
$ python main.py
Average precision on validation data as a function of the number of training epochs. The best value, 58.6%, was measured after 40 epochs. The precision is fairly low because some of the data are really hard: some people are difficult to recognize even for a human, and some annotations are not 100% correct. You can experiment with the models; the model after 40 epochs is not necessarily the best for every recording.
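For readers unfamiliar with the metric, here is a minimal sketch of one common definition of average precision: the area under the interpolated precision-recall curve. The source does not state which variant was used for the numbers above, so treat this as an illustration only. The function name `average_precision` and its input format are assumptions, and each detection is assumed to have already been matched against the ground truth (for example by bounding-box overlap).

```python
def average_precision(scored_hits, num_ground_truth):
    """Compute AP from (score, is_true_positive) pairs.

    `scored_hits` holds one pair per detection; `num_ground_truth` is the
    number of annotated people. Both are hypothetical input formats.
    """
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from the most to the least confident.
    ranked = sorted(scored_hits, key=lambda item: item[0], reverse=True)
    recalls, precisions = [], []
    true_positives = 0
    for rank, (_, is_hit) in enumerate(ranked, start=1):
        true_positives += int(is_hit)
        recalls.append(true_positives / num_ground_truth)
        precisions.append(true_positives / rank)
    # Interpolate: precision at each rank becomes the best precision
    # achieved at that rank or any later one.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangle areas wherever recall increases.
    ap, previous_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap
```

Missed people lower the result because recall never reaches 1, and false detections lower it because they reduce precision at every later rank.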
- The intuition behind RetinaNet
- A Guide to Utilizing Color Histograms for Computer Vision and Image Search Engines
- How to install TensorFlow with GPU support
- Focal Loss for Dense Object Detection
- RetinaNet's loss function.
- Feature Pyramid Networks for Object Detection
- RetinaNet uses the Feature Pyramid Network structure.
- Deep Residual Learning for Image Recognition
- RetinaNet uses a residual neural network.
Many other resources I used while working on this project are listed in the 'Literatura' section of the thesis text.