This directory contains code to import and evaluate the static SVHF-Net model trained on the VoxCeleb and VGGFace datasets as described in the paper:
A. Nagrani, S. Albanie, A. Zisserman, Seeing Voices and Hearing Faces: Cross-modal biometric matching,
CVPR, 2018
Further details can be found here.
To use the model, first install the MatConvNet framework. Instructions can be found here.
To install, follow these steps:
- Install and compile MatConvNet by following the instructions here.
- Set up paths:
    setup_SVHFNet
- Run the provided demo script to import and test the model:
    test_SVHFNet
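The steps above can be sketched as a single MATLAB session. This is a minimal sketch, not the authors' official procedure: the `matconvnetRoot` location is a hypothetical path chosen for illustration, and it assumes that `setup_SVHFNet` and `test_SVHFNet` are on the MATLAB path (for example, because MATLAB was started in this directory). `vl_compilenn` and `vl_setupnn` are the standard MatConvNet compile and path-setup scripts. Whether `setup_SVHFNet` already calls `vl_setupnn` internally is not stated here, so the sketch calls it explicitly.

```matlab
% Hypothetical location of the MatConvNet checkout; adjust to your system.
matconvnetRoot = fullfile(pwd, 'matconvnet');
compileScript  = fullfile(matconvnetRoot, 'matlab', 'vl_compilenn.m');
setupScript    = fullfile(matconvnetRoot, 'matlab', 'vl_setupnn.m');

if exist(compileScript, 'file') == 2 && exist(setupScript, 'file') == 2
    % Step 1: compile the MatConvNet MEX files (CPU build by default),
    % then add MatConvNet to the MATLAB path.
    run(compileScript);
    run(setupScript);

    % Step 2: set up the paths for this code (script from this directory).
    setup_SVHFNet;

    % Step 3: import and test the model with the provided demo script.
    test_SVHFNet;
else
    % Nothing is run if MatConvNet is not where we assumed it to be.
    fprintf('MatConvNet not found at %s; install it first.\n', matconvnetRoot);
end
```

Compilation is only needed once per MatConvNet installation; in later sessions the `run(compileScript)` line can be dropped.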
This model has been trained on static face images from the VoxCeleb and VGGFace datasets, and audio segments from the VoxCeleb dataset. The VoxCeleb dataset can be downloaded directly from here. Cropped face images can be downloaded from here.
If you use this code, please cite:
@InProceedings{Nagrani18a,
author = "Nagrani, A. and Albanie, S. and Zisserman, A.",
title = "Seeing Voices and Hearing Faces: Cross-modal biometric matching",
booktitle = "IEEE Conference on Computer Vision and Pattern Recognition",
year = "2018",
}