LicensePlateFinder implements YOLOv5 to build a high-precision License Plate Detection Model. The model was fine-tuned on a custom dataset and achieved impressive performance:
- mAP@0.5:0.95:
0.762
- mAP@0.5:
0.831
(VOC metric)
- About the Dataset
- Model and Preprocessing
- Training Configuration
- Evaluation Metrics
- Inference Results
- Real-Time Inference with OCR
The dataset comprises 5,694 images of license plates with bounding box annotations provided in [xmin, ymin, xmax, ymax]
format. It is organized as follows:
Dataset
├── train
│ └── Vehicle registration plate
│ └── Label
└── validation
└── Vehicle registration plate
└── Label
- Training Samples: 5,308
- Validation Samples: 386
Below are some images with ground truth
bounding boxes:
The YOLOv5 model was chosen for its balance between speed and accuracy. Key steps in preprocessing included:
- Annotation Conversion: Converted
[xmin, ymin, xmax, ymax]
Pascal VOC bounding box annotations into[x, y, width, height]
YOLO format. - Data Configuration File: Prepared a
data.yaml
file with the following details:
train: /content/License_Plate_Detection/train/images
val: /content/License_Plate_Detection/validation/images
nc: 1
names: ['Reg-plate']
The model was fine-tuned with the following hyperparameters:
- Batch Size:
32
- Epochs:
25
- Iterations:
4,146
- calculated as
int(epochs * train_img_count / batch_size)
- calculated as
- Initial Learning Rate:
1e-3
The final COCO primary metric (mAP@0.5:0.95)
was 0.762, while the mAP@0.5
(VOC metric) was 0.831. The overall metric statistics is:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.762
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.831
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.830
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.222
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.512
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.862
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.669
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.798
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.798
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.233
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.577
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.895
The WandB logs can be found here.
To further enhance the functionality of the license plate detection system, PaddleOCR was integrated to extract and display license plate text in real time. This allows for seamless detection and recognition of license plates.
- Detection: YOLOv5 detects license plates in images or video frames.
- Cropping: The detected bounding box regions are extracted.
- Text Recognition: PaddleOCR processes the cropped regions and extracts text.
- Display: The extracted text is overlaid on the original frames for visualization.
- Multi-Language Support: PaddleOCR supports various languages for text recognition.
- Real-Time Capability: The pipeline is optimized for real-time performance, making it suitable for applications such as traffic monitoring and automated toll systems.
The following steps outline the algorithm for real-time inference:
-
Input Frame:
- Capture a video frame or load an image.
-
License Plate Detection:
- Use YOLOv5 to detect bounding boxes for license plates.
-
Region Cropping:
- Extract the regions corresponding to the detected bounding boxes.
-
OCR Processing:
- Pass the cropped regions to PaddleOCR for text recognition.
-
Visualization:
- Overlay the detected text on the original frame near the corresponding bounding box.
-
Output Frame:
- Display or save the annotated frame with license plate numbers.
This integration showcases the power of combining advanced object detection with OCR for practical and impactful applications.