Merge pull request #274 from yanghaojin/master
Adding pre-trained models, scripts, and datasets for masked facial detection using Ultra-L face models.
Showing 26 changed files with 673 additions and 13 deletions.
@@ -0,0 +1,63 @@
# Masked Face Detection

![img4](https://github.com/yanghaojin/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/masked_face/readme_imgs/img4.jpeg)

# Extending Ultra-L face model for masked facial detection

The Ultra-L face detection model is widely used in edge and client-side applications. It strikes a remarkable balance between model size and accuracy, e.g.,
- The default FP32 *.pth model size is **1.04~1.1MB**, and the int8-quantized model used by the inference frameworks is about **300KB**.
- Only **90~109 MFLOPs** for 320x240 input resolution.
- Inference code is provided for [NCNN](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/tree/master/ncnn), [MNN](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/tree/master/MNN), [INT8](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/tree/master/MNN/model), [Onnx](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/tree/master/caffe), [OpencvDNN](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/caffe/ultra_face_opencvdnn_inference.py), etc.

COVID-19 has ravaged the world over the past two years, and wearing masks has become the norm in many everyday situations. However, most traditional face datasets, such as Wider Face, lack face samples with masks, so face detection models trained on these conventional datasets fail in scenarios where everyone wears a mask.
[Face-Mask-Detection](https://github.com/chandrikadeb7/Face-Mask-Detection) is the most popular face detection model on GitHub that supports mask detection.
However, it is trained on only 4095 images (2165 masked / 1930 without mask), which is a rather small dataset and leads to many false positives in real application scenarios.

This inspired me to build a larger dataset in order to provide better open-source masked facial detection models and help the world get through the pandemic.
The main contribution of this project is balanced facial training data that combines the [wider_face_add_lm_10_10](https://drive.google.com/open?id=1OBY-Pk5hkcVBX1dRBOeLI4e4OCvqJRnH) and [MAFA face](https://imsg.ac.cn/research/maskedface.html) datasets. The [MAFA](https://imsg.ac.cn/research/maskedface.html) annotations were converted to Pascal-VOC format and merged into [wider_face_add_lm_10_10](https://drive.google.com/open?id=1OBY-Pk5hkcVBX1dRBOeLI4e4OCvqJRnH).

## About the WIDER_MAFA_Balanced dataset
The *WIDER_MAFA_Balanced* dataset (**4.8GB**) can be downloaded from [HPI owncloud](https://owncloud.hpi.de/s/L4MUGqrpeENLbSv).
It contains 38225 images in total: 31084 for training and 7141 for testing.
Its composition is shown in the following table:

Source | Class | Train | Test | Total
----|------|-------|------|-----
MAFA face | masked_face | 15542 | 3922 | 19464
Wider face | face | 12859 | 3219 | 16078
*MAFA human body* | face | 2683 | 0 | 2683

*MAFA human body* denotes training samples extracted from MAFA that contain human-body occlusions.

I use this script to convert the MAFA data format to Pascal VOC:
```Shell
masked_face/mafa2voc.py
```
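
For orientation, the sketch below shows only the Pascal-VOC writing step of such a conversion. It is a minimal illustration, not the actual converter: the record layout (`filename`, image size, a list of `(label, xmin, ymin, xmax, ymax)` boxes) is an assumption, and the real MAFA parsing and merging logic lives in `masked_face/mafa2voc.py`.

```python
# Illustrative sketch only: write one Pascal-VOC XML annotation file.
# The input record structure is assumed; see masked_face/mafa2voc.py for the
# actual MAFA parsing and conversion.
import xml.etree.ElementTree as ET


def write_voc_xml(record, out_path):
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = record["filename"]

    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(record["width"])
    ET.SubElement(size, "height").text = str(record["height"])
    ET.SubElement(size, "depth").text = "3"

    for label, xmin, ymin, xmax, ymax in record["boxes"]:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label  # e.g. "face" or "masked_face"
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        ET.SubElement(bndbox, "xmin").text = str(xmin)
        ET.SubElement(bndbox, "ymin").text = str(ymin)
        ET.SubElement(bndbox, "xmax").text = str(xmax)
        ET.SubElement(bndbox, "ymax").text = str(ymax)

    ET.ElementTree(root).write(out_path)


# Hypothetical example: one masked face, MAFA-style (x, y, w, h) turned into corner coordinates.
write_voc_xml(
    {"filename": "example_0001.jpg", "width": 640, "height": 480,
     "boxes": [("masked_face", 100, 120, 100 + 80, 120 + 90)]},
    "example_0001.xml",
)
```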
## About the pre-trained models
```Shell
masked_face/
    pretrained/
        RFB-320-masked_face-v2.pth   # trained with 320x240
        RFB-640-masked_face-v2.pth   # trained with 640x480
        RFB-640-masked_face-v2.onnx  # suitable for 640x480
        RFB-1280-masked_face-v2.onnx # suitable for 1280x960
```
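
As a rough sketch of how the ONNX variants might be used, the snippet below runs `RFB-640-masked_face-v2.onnx` with onnxruntime. The preprocessing (RGB, `(x - 127) / 128`, NCHW) mirrors the repository's other Ultra-Light inference scripts, but the input/output names, shapes, and the required post-processing (prior decoding, NMS) should be verified against the model, e.g. with Netron; treat this as an assumption-laden example rather than the official inference path.

```python
# Minimal sketch (assumptions noted above): run the 640x480 ONNX model with onnxruntime.
import cv2
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("masked_face/pretrained/RFB-640-masked_face-v2.onnx")
input_name = session.get_inputs()[0].name

orig = cv2.imread("imgs/example.jpg")                  # hypothetical test image
rgb = cv2.cvtColor(orig, cv2.COLOR_BGR2RGB)
resized = cv2.resize(rgb, (640, 480)).astype(np.float32)
blob = (resized - 127.0) / 128.0                       # normalization as in the repo's inference code
blob = np.transpose(blob, (2, 0, 1))[np.newaxis, ...]  # HWC -> NCHW, add batch dimension

outputs = session.run(None, {input_name: blob})
# outputs hold per-prior class scores and box coordinates; score thresholding and
# NMS (as in the repository's utilities) are still needed to obtain final boxes.
for out in outputs:
    print(out.shape)
```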
## Detection Result (input resolution: 1280x960)

The following visual results were created with this script:
```Shell
masked_face/detect_imgs.py
```
![img1](https://github.com/yanghaojin/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/masked_face/readme_imgs/img1.jpeg)
![img2](https://github.com/yanghaojin/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/masked_face/readme_imgs/img2.jpeg)
![img3](https://github.com/yanghaojin/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/masked_face/readme_imgs/img3.jpg)
![img5](https://github.com/yanghaojin/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/masked_face/readme_imgs/img5.jpeg)
![img6](https://github.com/yanghaojin/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/masked_face/readme_imgs/img6.webp)
![img7](https://github.com/yanghaojin/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/masked_face/readme_imgs/img7.webp)
![img8](https://github.com/yanghaojin/Ultra-Light-Fast-Generic-Face-Detector-1MB/blob/master/masked_face/readme_imgs/img8.jpeg)

Author: Haojin Yang
@@ -0,0 +1,80 @@
""" | ||
This code is used to batch detect images in a folder. | ||
""" | ||
import os, sys | ||
currentdir = os.path.dirname(os.path.realpath(__file__)) | ||
parentdir = os.path.dirname(currentdir) | ||
sys.path.append(parentdir) | ||
|
||
import argparse | ||
import os | ||
import sys | ||
|
||
import cv2 | ||
|
||
from vision.ssd.config.fd_config import define_img_size | ||
|
||
parser = argparse.ArgumentParser(description='detect_imgs')

parser.add_argument('--net_type', default="RFB", type=str,
                    help='The network architecture, optional: RFB (higher precision) or slim (faster)')
parser.add_argument('--input_size', default=1280, type=int,
                    help='define network input size, optional values: 128/160/320/480/640/1280')
parser.add_argument('--threshold', default=0.3, type=float,
                    help='score threshold')
parser.add_argument('--candidate_size', default=1200, type=int,
                    help='nms candidate size')
parser.add_argument('--path', default="imgs", type=str,
                    help='imgs dir')
parser.add_argument('--test_device', default="cpu", type=str,
                    help='cuda:0 or cpu')
args = parser.parse_args()
define_img_size(args.input_size)  # must be called before importing create_mb_tiny_fd / create_mb_tiny_fd_predictor

from vision.ssd.mb_tiny_fd import create_mb_tiny_fd, create_mb_tiny_fd_predictor
from vision.ssd.mb_tiny_RFB_fd import create_Mb_Tiny_RFB_fd, create_Mb_Tiny_RFB_fd_predictor

result_path = "detect_imgs_results"
label_path = "./voc-model-labels.txt"
test_device = args.test_device

class_names = [name.strip() for name in open(label_path).readlines()]

if args.net_type == 'RFB':
    model_path = "pretrained/RFB-640-masked_face-v2.pth"
    net = create_Mb_Tiny_RFB_fd(len(class_names), is_test=True, device=test_device)
    predictor = create_Mb_Tiny_RFB_fd_predictor(net, candidate_size=args.candidate_size, device=test_device)
else:
    print("The net type is wrong!")
    sys.exit(1)
net.load(model_path)

if not os.path.exists(result_path):
    os.makedirs(result_path)
listdir = os.listdir(args.path)
total_faces = 0
for file_path in listdir:
    img_path = os.path.join(args.path, file_path)
    orig_image = cv2.imread(img_path)
    if orig_image is None:
        continue
    image = cv2.cvtColor(orig_image, cv2.COLOR_BGR2RGB)
    boxes, labels, probs = predictor.predict(image, args.candidate_size / 2, args.threshold)
    total_faces += boxes.size(0)
    for i in range(boxes.size(0)):
        box = boxes[i, :]
        label_index = labels[i].item()
        cv2.rectangle(orig_image, (int(box[0]), int(box[1])), (int(box[2]), int(box[3])), (0, 165, 255), 2)
        # label = f"{voc_dataset.class_names[labels[i]]}: {probs[i]:.2f}"
        label = f"{probs[i]:.2f}"
        # cv2.putText(orig_image, label, (box[0], box[1] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
        cv2.putText(orig_image, class_names[label_index],
                    (int(box[0]), int(box[1]) - 10),
                    cv2.FONT_HERSHEY_SIMPLEX,
                    0.5,  # font scale
                    (100, 0, 255),
                    1)  # line thickness
    cv2.putText(orig_image, str(boxes.size(0)), (30, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
    cv2.imwrite(os.path.join(result_path, file_path), orig_image)
    print(f"Found {len(probs)} faces. The output image is saved to {result_path}")
print(total_faces)