Source code of Interpretable CNN for Big Five Personality Traits using Audio Data. (The Code Follows MIT License)

This package was developed by Mr.Hassan Hayat (hhassan0@uoc.edu). Please feel free to contact in case of any query regarding the package. You can run this package at your own risk. This package is free for academic use.

To read the paper: Hassan Hayat, Carles Ventura, Agata Lapedriza, “On the Use of Interpretable CNN for Personality Trait Recognition from Audio”, Artificial Intelligence Research and Development, Vol: 69, Page: 135 - 144, 2019. DOI: 10.3233/FAIA190116

Operating System

Ubuntu Linux

Requirements

Python3.x.x
GPU with CUDA support
Audio features (https://github.com/tensorflow/models/tree/master/research/audioset/vggish)
Tensorflow1.14
Matlab 2014b
libffmpeg

Dataset

Dataset ChaLearn

Setup

Extract wav files from mp4 video files in the same directory

./data_processing/mp4_to_wav.sh

Using Spectrogram

Use vggish_input.py and mel_features.py from Audio feature to generate 2D Mel_Spectrogram

From here, we divided Mel_Spectrogram into two categories:

Clip_Spectrogram: is a 3D matrix. In which, 1st dimension represents the length (in second) of the wav file, 2nd dimension represents the number of frames in a second, and 3rd dimension represents frequency bands.

Fine-Tune CNN

./clip_spectrogram_cnn/regular_cnn_model.py

Fine-Tune CAM_CNN

./clip_spectrogram_cnn/cam_cnn_model.py

CAM Generation

Store inputs with corresponding conv_features and the last layer weights

./cam_generation_clip_spectrogram/get_features_input_weights.py

Get 20 maximum predictions of all big five personality traits with corresponding inputs, conv_features, and the last layer weights

./cam_generation_clip_spectrogram/get_20_max_pred.py

Generate cam mapping of all big five personality traits

./cam_generation_clip_spectrogram/cam_mapping.m

Summary_Spectrogram: It holds the whole audio information of a video and concatenates the clip-based 2D log-Mel spectrograms along with the temporal domain. This way, we obtain a 1248x64 spectrogram. Then, an average pool operation is performed to reduce the size of the spectrogram. We take an average of 60 ms frame across all 64 Mel-spaced frequency bins, obtaining, as a result, a 208x64 spectrogram, which we refer to as Summary-Spectrogram.

Fine-Tuned CNN

./summary_spectrogram_cnn/regular_cnn_model.py

Fine-Tuned CAM_CNN

./summary_spectrogram_cnn/cam_cnn_model.py

CAM Generation

Store inputs with corresponding conv_features and the last layer weights

./cam_generation_summary_spectrogram/get_features_input_weights.py

Get 20 maximum predictions of all big five personality traits with corresponding inputs, conv_features, and the last layer weights

./cam_generation_summary_spectrogram/get_20_max_pred.py

Generate cam mapping of all big five personality traits

./cam_generation_summary_spectrogram/cam_mapping.m

Using Raw Audio Wav

Down sample the audio signal

./data_processing/signal_downsamplng.py

Convert raw wav into different frequency bins

./data_processing/get_fft_features.py

Train CAM_CNN

Train the cam_cnn model

./rawwav_cnn/cam_cnn_model.py

CAM Generation

Store inputs with corresponding conv_features and the last layer weights

./cam_generation_rawwav/get_features_input_weights.py

Get 20 maximum predictions of all big five personality traits with corresponding inputs, conv_features, and the last layer weights

./cam_generation_rawwav/get_20_max_pred.py

Generate cam mapping of all big five personality traits

./cam_generation_rawwav/cam_mapping.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Source code of Interpretable CNN for Big Five Personality Traits using Audio Data. (The Code Follows MIT License)

Setup

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
cam_generation_clip_spectrogram		cam_generation_clip_spectrogram
cam_generation_rawwav		cam_generation_rawwav
cam_generation_summary_spectrogram		cam_generation_summary_spectrogram
clip_spectrogram_cnn		clip_spectrogram_cnn
data_processing		data_processing
rawwav_cnn		rawwav_cnn
summary_spectrogram_cnn		summary_spectrogram_cnn
README.md		README.md

HassanHayat08/Interpretable-CNN-for-Big-Five-Personality-Traits-using-Audio-Data

Folders and files

Latest commit

History

Repository files navigation

Source code of Interpretable CNN for Big Five Personality Traits using Audio Data. (The Code Follows MIT License)

Setup

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages