A Deep Network for Arousal-Valence Emotion Prediction with Acoustic-Visual Cues

This repository contains the source code for our submissions to the OMG Emotion Challenge 2018. Method descriptions can be found here.

Team Members: Songyou Peng, Le Zhang, Yutong Ban, Meng Fang, Stefan Winkler

Requirements

PyTorch
torchvision
NumPy
scikit-image
SphereFace (PyTorch)

Preprocessing

Every video should be pre-processed as follows:

Extract frames and apply MTCNN to align faces
Extract WAV files and calculate STFT
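
The STFT step can be sketched as follows. This is a minimal illustration using only NumPy, not the code used for our submissions: the function name `stft_magnitude`, the Hann window, the frame length of 512, the hop of 256, and the 16 kHz sample rate are all assumptions chosen for the example. A synthetic tone stands in for the audio track extracted from a video.

```python
import numpy as np


def stft_magnitude(signal, frame_len=512, hop=256):
    """Return the magnitude STFT of a 1-D signal, shape (n_frames, frame_len // 2 + 1).

    Hypothetical helper: the window type, frame length, and hop size are
    illustrative defaults, not the settings used in the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.ndim != 1 or len(signal) < frame_len:
        raise ValueError("signal must be 1-D and at least frame_len samples long")

    window = np.hanning(frame_len)
    # Number of full frames that fit without padding.
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice overlapping frames and apply the window to each one.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Real-input FFT along the time axis of each frame; keep magnitudes only.
    return np.abs(np.fft.rfft(frames, axis=1))


# Synthetic stand-in for an extracted WAV track: one second of a 1 kHz tone.
sample_rate = 16000  # assumed sample rate
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = stft_magnitude(tone)
```

For a real recording, the samples would be read from the extracted WAV file and passed to the same function in place of `tone`.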

Citation

This code is intended for research use only. If you use it, please consider citing our paper:

@inproceedings{peng2018omg,
 author =  {Peng, Songyou and Zhang, Le and Ban, Yutong and Fang, Meng and Winkler, Stefan},
 title = {{A Deep Network for Arousal-Valence Emotion Prediction with Acoustic-Visual Cues}},
 year = {2018},
 booktitle = {arxiv},
}

Contact Songyou Peng ✉️ with questions, comments, and bug reports.
