Realtime vehicle detection for driver assistance using two approaches:
- Traditional feature engineering approach: Histogram of Oriented Gradients (HOG) features with a support vector machine (SVM) classifier. This approach proved to be very slow per image, so we discontinued it in favor of the deep learning based approach.
- Deep learning based approach: the state-of-the-art convolutional neural network architecture YOLO (You Only Look Once). The system was implemented in darknet, the reference framework by the original authors of the paper, and mimics their Tiny YOLOv2 network.
The system was developed using the following:
- Hardware:
  - A PC running Ubuntu 16.04 LTS with an Intel Core i5-4670K, an Nvidia GTX960 and 8 GB RAM. We later ported the system to an Nvidia Jetson TK1 to test its performance on a relatively modest embedded system.
- Software:
  - Python 2 for various scripts, installed via Anaconda2.
  - OpenCV 3.1.
  - CUDA 8.
  - cuDNN 5.1.
- Our aim was to create a realtime vehicle detection system that detects the vehicles surrounding the driver using only an incoming video stream.
- We used a CNN-based approach built on YOLO, an extremely fast object detection and classification network. The framework we used (darknet) is implemented in C and CUDA, so it provided the best possible support for embedded systems with a GPU (hence the choice of the Jetson TK1).
- Installing darknet is simple; training it wasn't. Thankfully, the internet is full of resources for that. We recommend this repository.
- The network was trained using our custom CFG file and darknet's pretrained vanilla weights. We used the Udacity dataset publicly available here. We wrote our own scripts to convert the annotations to the format darknet expects. They're in the scripts folder along with several other useful scripts.
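The annotation conversion mentioned above can be sketched as follows. This is a minimal illustration, not the script from the scripts folder. It assumes the source annotations give each box as pixel corners (xmin, ymin, xmax, ymax), and it emits the label line darknet reads: a class index followed by the box center, width and height, all normalized by the image size. The function name `to_darknet_label` and the 1920x1200 frame size in the example are assumptions made for illustration.

```python
def to_darknet_label(xmin, ymin, xmax, ymax, img_w, img_h, class_id=0):
    """Convert a pixel-space corner box into one darknet label line.

    darknet expects: <class> <x_center> <y_center> <width> <height>,
    with the four box values expressed as fractions of the image size.
    """
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / float(img_w)
    height = (ymax - ymin) / float(img_h)
    return "%d %.6f %.6f %.6f %.6f" % (class_id, x_center, y_center, width, height)


if __name__ == "__main__":
    # Hypothetical box on an assumed 1920x1200 frame; class 0 stands for Car.
    print(to_darknet_label(480, 300, 1440, 900, 1920, 1200))
```

Each image then gets a text file with one such line per annotated vehicle.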
- Clone the repository, install the dependencies, then navigate to the repository root folder and build using
make
- Download and extract the Udacity dataset into a folder, then split it to your liking (we used 10,000 for training, 3,000 for validation, and a separate unlabeled dataset for testing).
- Run the necessary scripts to generate the files and annotations needed for training.
- Run the anchors script to generate the anchors needed for your CFG file.
Run both scripts from the darknet root folder.
Generate anchors:
python gen_anchors.py -filelist path-to-training-file-list -output_dir generated_anchors -num_clusters 5
Visualize anchors:
python visualize_anchors.py -anchor_dir generated_anchors/
- Create your CFG and data files as mentioned in the repository referenced here.
- Start training!
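To give a feel for what the anchor generation step does, here is a minimal sketch of k-means clustering over box sizes using IoU as the similarity measure, which is the general technique behind YOLOv2-style anchors. It is not the code of `gen_anchors.py`. The function names, the deterministic initialization and the toy boxes are assumptions made for illustration.

```python
def wh_iou(box, centroid):
    """IoU of two (width, height) boxes placed at a shared corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / float(union)


def kmeans_anchors(boxes, k, max_iter=100):
    """Cluster (width, height) pairs into k anchors, smallest area first."""
    # Deterministic start: take k boxes spread evenly over the area-sorted list.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / float(k)
    centroids = [ordered[int(i * step + step / 2)] for i in range(k)]

    assignment = None
    for _ in range(max_iter):
        # Assign every box to the centroid it overlaps most (highest IoU).
        new_assignment = [
            max(range(k), key=lambda c: wh_iou(b, centroids[c])) for b in boxes
        ]
        if new_assignment == assignment:
            break  # converged: no box changed cluster
        assignment = new_assignment
        # Move each centroid to the mean width and height of its members.
        for c in range(k):
            members = [b for b, a in zip(boxes, assignment) if a == c]
            if members:
                n = float(len(members))
                centroids[c] = (
                    sum(m[0] for m in members) / n,
                    sum(m[1] for m in members) / n,
                )
    return sorted(centroids, key=lambda c: c[0] * c[1])


if __name__ == "__main__":
    # Toy (width, height) pairs: three small boxes and three large ones.
    toy = [(1.0, 1.0), (1.2, 1.0), (0.8, 1.0), (4.0, 4.0), (4.4, 4.0), (3.6, 4.0)]
    print(kmeans_anchors(toy, k=2))
```

IoU is used instead of Euclidean distance so that large and small boxes are treated evenly. Whatever units the input sizes are in, the resulting anchors still have to be expressed in the units your CFG file expects.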
- We obtained an IOU of 61% at iteration 5,000. Recall peaked at 71%. On a GTX960 the network ran at 40 FPS, and on the Jetson TK1 it ran at 8 FPS after various optimizations to both the code and the board itself. To put things into perspective, the GTX960 has 1024 CUDA cores while the Jetson TK1 has only 192.
- We trained for only one class, Car. We trained for 8,000 iterations and found that the best IOU was obtained at iteration 5,000. You can download our weights file here.
- The config file for our network is in the cfg folder, along with the data and names files. However, you'll need to modify the training, validation and names paths in your data file to match those in your environment.
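As a rough guide to the data file edits described above, the sketch below renders a darknet data file from a set of paths. The keys (`classes`, `train`, `valid`, `names`, `backup`) are the ones darknet reads. The helper name `render_data_file` and every path in the example are hypothetical placeholders, so substitute the locations from your own environment.

```python
def render_data_file(train_list, valid_list, names_file, backup_dir, classes=1):
    """Return the text of a darknet .data file for the given paths."""
    lines = [
        "classes = %d" % classes,    # number of object classes (1: Car)
        "train = %s" % train_list,   # text file listing training image paths
        "valid = %s" % valid_list,   # text file listing validation image paths
        "names = %s" % names_file,   # one class name per line
        "backup = %s" % backup_dir,  # where darknet saves weights while training
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All four paths below are placeholders, not the repository's real layout.
    print(render_data_file("data/train.txt", "data/valid.txt",
                           "data/rtcd.names", "backup/"))
```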
- Once you're done training (or if you're using our weights file), you can invoke one of the following three commands to validate the network, test it on a single image, or run it on a video:
./darknet detector valid data/rtcd.data cfg/rtcd.cfg path-to-weights -thresh 0.4
./darknet detector test data/rtcd.data cfg/rtcd.cfg path-to-weights path-to-image -thresh 0.4
./darknet detector demo data/rtcd.data cfg/rtcd.cfg path-to-weights path-to-video -thresh 0.4
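If you want to compare your own detections against ground truth in terms of the IOU and recall figures quoted earlier, the sketch below shows one common way to compute them: for every ground-truth box, take the best-overlapping detection, average those overlaps, and count a box as recalled when its best overlap exceeds a threshold. This is an illustration under stated assumptions, not necessarily the exact procedure behind our numbers. The function names, the (xmin, ymin, xmax, ymax) box layout and the 0.5 IOU threshold are assumptions.

```python
def box_iou(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not overlap
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / float(union)


def average_iou_and_recall(images, iou_thresh=0.5):
    """images: list of (truth_boxes, detected_boxes) pairs, one per image.

    Returns (average best IoU per truth box, fraction of truth boxes
    whose best IoU exceeds iou_thresh).
    """
    total, recalled, iou_sum = 0, 0, 0.0
    for truths, detections in images:
        for truth in truths:
            # Best overlap any detection achieves with this truth box.
            best = max([box_iou(truth, d) for d in detections] or [0.0])
            iou_sum += best
            total += 1
            if best > iou_thresh:
                recalled += 1
    if total == 0:
        return 0.0, 0.0
    return iou_sum / total, recalled / float(total)


if __name__ == "__main__":
    # Toy data: two ground-truth cars, one detection covering most of the first.
    toy = [([(0, 0, 10, 10), (20, 20, 30, 30)], [(0, 0, 10, 8)])]
    print(average_iou_and_recall(toy))
```

Only detections that pass the confidence threshold (the `-thresh 0.4` flag in the commands above) should be fed into such a comparison.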