SSD-based head detector

Image showing head detections

By Pablo Medina-Suarez and Manuel J. Marin-Jimenez.

This repository contains and showcases a head detector model for detecting people in images. The model is based on the Single Shot MultiBox Detector (SSD), as described in:

SSD: Single Shot MultiBox Detector
Authors: Liu, Wei; Anguelov, Dragomir; Erhan, Dumitru; Szegedy, Christian; Reed, Scott; Fu, Cheng-Yang; Berg, Alexander C. 

The model has been trained using the Hollywood Heads dataset as positive samples and a subsample of the EgoHands dataset as negative samples. It has been developed using Pierluigi Ferrari's Keras implementation of SSD as the primary source (of which we provide some essential code), and it replicates the original MatConvNet version of our model.

Quick start

Cloning the repository

First, download a local copy of this repository. To do so, use the "Clone or download" button or run the following commands in a terminal:

# Install git:     
    sudo apt-get install git
# Clone ssd_head_keras from GitHub using the method of your choice: 
    git clone https://github.com/AVAuco/ssd_head_keras.git     # HTTPS
    git clone git@github.com:AVAuco/ssd_head_keras.git         # SSH

Downloading the model

If you just want to download our detection model, we provide a ready-to-use version that you can fetch as described below; otherwise, skip to the next section.

Since the object serialization methods differ between Python versions prior to 3.6 and later ones, we provide two different versions of our model, one for Python 3.5 and one for Python 3.6 (we do not support Python 2.7).

In the data folder you will find a script that downloads the model for you; just run the following commands:

# Install curl, if not already present
    sudo apt-get install curl
# Check your version of Python 3
    python3 --version
# Replace Y below with 5 or 6, depending on the output of the previous command
    cd data
    chmod +x download_model_py3.Y.sh
# Run the script
    ./download_model_py3.Y.sh
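
If you prefer to pick the right file programmatically, a minimal Python sketch along these lines works; the model file names used here are hypothetical placeholders for illustration only, so check the download scripts in the data folder for the actual names and URLs:

    import sys

    # Choose the model version matching the running interpreter.
    # NOTE: file names below are placeholders, not the real ones;
    # see the download_model_py3.*.sh scripts in data/ for the actual files.
    if sys.version_info >= (3, 6):
        model_file = "ssd512_heads_py3.6.h5"
    else:
        model_file = "ssd512_heads_py3.5.h5"
    print("Model version to download:", model_file)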

How to use the model

A brief tutorial is provided in the Jupyter notebook demo_inference.ipynb. It explains how to use our model to detect heads in some example images.

To run this notebook on your computer, first take a look at the software requirements section, then run the following commands in a terminal:

# Activate the Python virtual environment
    source <venv_path>/bin/activate
# Set current directory to this repository's root path
    cd <download_path>/ssd_head_keras
# Start a notebook
    jupyter notebook

This command will open a new tab in your default browser showing the Jupyter notebook environment; just click demo_inference.ipynb and follow the instructions in it.
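
If you prefer a plain Python script over the notebook, the core loading and inference steps might look roughly like the sketch below. It assumes the custom SSD layers and loss from Pierluigi Ferrari's ssd_keras implementation (AnchorBoxes, L2Normalization, DecodeDetections, SSDLoss) are importable from this repository under their usual module paths, and it uses a placeholder model path and image name; refer to demo_inference.ipynb for the exact imports, file names and preprocessing.

    import numpy as np
    from keras import backend as K
    from keras.models import load_model
    from keras.preprocessing import image

    # Custom objects needed to deserialize the SSD model (module paths assumed,
    # following the ssd_keras layout bundled with this repository).
    from keras_layers.keras_layer_AnchorBoxes import AnchorBoxes
    from keras_layers.keras_layer_L2Normalization import L2Normalization
    from keras_layers.keras_layer_DecodeDetections import DecodeDetections
    from keras_loss_function.keras_ssd_loss import SSDLoss

    K.clear_session()
    ssd_loss = SSDLoss(neg_pos_ratio=3, alpha=1.0)
    model = load_model('data/head_detector.h5',   # placeholder path
                       custom_objects={'AnchorBoxes': AnchorBoxes,
                                       'L2Normalization': L2Normalization,
                                       'DecodeDetections': DecodeDetections,
                                       'compute_loss': ssd_loss.compute_loss})

    # Load an image, resize it to the 512x512 network input and run the detector.
    img = image.load_img('example.jpg', target_size=(512, 512))
    x = np.expand_dims(image.img_to_array(img), axis=0)
    y_pred = model.predict(x)

    # Keep detections above a confidence threshold; in the ssd_keras convention
    # each row is [class_id, confidence, xmin, ymin, xmax, ymax].
    detections = y_pred[0][y_pred[0, :, 1] > 0.5]
    print(detections)

Note that the reported boxes refer to the resized 512x512 input, so they need to be scaled back to the original image size before drawing them.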

Software requirements

These are the most relevant dependencies required to use our model (see Issue #22):

Additional, recommended requirements to increase inference performance on an NVIDIA GPU (a quick GPU-visibility check is sketched after this list):

  • NVIDIA CUDA Toolkit (tested on versions 9.0 and 10.0).
  • Optional: an NVIDIA cuDNN version matching the installed CUDA Toolkit version.
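
After installing the CUDA Toolkit (and, optionally, cuDNN), a quick way to check that TensorFlow actually sees the GPU is a one-liner like the following (a generic TensorFlow check, not specific to this repository):

    import tensorflow as tf

    # Prints True when TensorFlow was built with CUDA support and a GPU is visible
    # (tf.test.is_gpu_available exists in TF 1.x and is deprecated but usable in TF 2.x).
    print(tf.test.is_gpu_available())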

This repository also provides an optional, not recommended requirements file, which allows you to set up a new virtualenv with all the required dependencies. Please note that this file was used during development and may install additional, unnecessary packages on your system. If you opt for this route, run these commands in a terminal:

# Create a new Python 3 virtual environment
    virtualenv --system-site-packages -p python3 <venv_path>
# Activate the venv
    source <venv_path>/bin/activate
# Install this project's dependencies using the provided requirements file
    pip install -r <download_path>/ssd_head_keras/requirements.txt
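
Once the environment is active, a quick sanity check that the main packages resolved correctly is to import them and print their versions (the exact versions expected are those pinned in requirements.txt and discussed in Issue #22):

    # Run inside the activated virtualenv.
    import numpy, tensorflow, keras
    print("numpy", numpy.__version__)
    print("tensorflow", tensorflow.__version__)
    print("keras", keras.__version__)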

Performance

This head detector uses a 512x512 input size, favouring precision over speed (above 90% mAP on our Hollywood Heads test split). Nonetheless, the model runs at an average of 40 FPS on an NVIDIA Titan Xp GPU, therefore allowing real-time detection.
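
To reproduce a rough FPS figure on your own hardware, a simple timing loop around model.predict is enough; the sketch below reuses the model and the preprocessed batch x from the inference example above, and the run count is arbitrary:

    import time

    model.predict(x)              # warm-up pass so GPU initialization is not timed
    n_runs = 100
    start = time.time()
    for _ in range(n_runs):
        model.predict(x)
    elapsed = time.time() - start
    print("Average FPS: %.1f" % (n_runs / elapsed))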

Qualitative results

We show some results of this head detector on the UCO-LAEO dataset in the following video. No temporal smoothing or any other kind of post-processing has been applied to the output of the detector.

EDIT 02/11/2019: we have uploaded an updated results video!

Citation

If you find this model useful, please consider citing the following paper:

@InProceedings{Marin19a,
    author       = "Marin-Jimenez, M.~J. and Kalogeiton, V. and Medina-Suarez, P. and Zisserman, A.",
    title        = "{LAEO-Net}: revisiting people {Looking At Each Other} in videos",
    booktitle    = "International Conference on Computer Vision and Pattern Recognition (CVPR)",
    year         = "2019",
}

Acknowledgements

We thank the authors of the images used in the demo code, which are licensed under a CC BY 2.0 license: