Caffe prototxt file of the lightweight baseline for pedestrian detection proposed in "Auto-Zooming CNN-Based Framework for Real-Time Pedestrian Detection in Outdoor Surveillance Videos" by:
Saghir Alfasly, Beibei Liu, Yongjian Hu, Yufei Wang, Chang-Tsun Li.
The feature extractor consists of one convolutional layer followed by seven sequential depthwise separable convolutional layers. The following figure illustrates the proposed model and its characteristics. With the exception of the first layer, which uses regular convolution, all remaining layers use depthwise separable convolution. We rebuilt Caffe with a depthwise layer. The baseline structure is explained in detail in the paper.
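To give a rough sense of why depthwise separable convolution keeps the baseline lightweight, the sketch below compares weight counts for a regular convolution and a depthwise separable one (a per-channel k x k depthwise filter followed by a 1 x 1 pointwise convolution). The channel widths and the 3 x 3 kernel size are hypothetical placeholders chosen for illustration, not the values from the prototxt file or the paper, and biases are ignored.

```python
def conv_params(k, c_in, c_out):
    """Weights in a regular k x k convolution (bias excluded)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution (bias excluded)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# ASSUMPTION: illustrative channel widths only, not taken from the paper.
# Index 0 is the input image; the first transition is the regular
# convolution, and the next seven are depthwise separable layers.
channels = [3, 32, 64, 128, 128, 256, 256, 512, 512]
kernel = 3  # ASSUMPTION: 3 x 3 kernels throughout

first_layer = conv_params(kernel, channels[0], channels[1])
pairs = list(zip(channels[1:-1], channels[2:]))  # seven (c_in, c_out) pairs

separable_total = first_layer + sum(
    depthwise_separable_params(kernel, c_in, c_out) for c_in, c_out in pairs
)
regular_total = first_layer + sum(
    conv_params(kernel, c_in, c_out) for c_in, c_out in pairs
)

print("depthwise separable stack:", separable_total)
print("regular convolution stack:", regular_total)
```

For a 3 x 3 kernel, each separable layer needs roughly 1/c_out + 1/9 of the weights of its regular counterpart, so the saving grows with the number of output channels.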
Detection performance on far (distant) pedestrians is shown in the video demo on IEEE Xplore.
Note: the prototxt file contains the baseline without the prediction part.
@ARTICLE{8781786,
author={S. {Alfasly} and B. {Liu} and Y. {Hu} and Y. {Wang} and C. {Li}},
journal={IEEE Access},
title={Auto-Zooming CNN-Based Framework for Real-Time Pedestrian Detection in Outdoor Surveillance Videos},
year={2019},
volume={7},
pages={105816-105826},
doi={10.1109/ACCESS.2019.2931915}}