Emotional context classification from acoustic features of speech signals
Contains different versions of deep-learning networks and trained models, written in Keras with a TensorFlow backend:
- CNN
- LSTM
- CNN_LSTM
- CTC with BLSTM
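To give a feel for the first variant, here is a minimal Keras sketch of a CNN spectrogram classifier. The input shape, layer sizes, and number of emotion classes are illustrative assumptions, not the repository's actual architecture.

```python
# Minimal sketch of a CNN emotion classifier over spectrograms (Keras).
# Input shape, filter counts, and n_classes are illustrative assumptions,
# not the configuration used in this repository.
from tensorflow.keras import layers, models

def build_cnn(input_shape=(128, 128, 1), n_classes=4):
    """Small conv net mapping a spectrogram to emotion-class probabilities."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

The LSTM and CNN_LSTM variants follow the same pattern, swapping the convolutional stack for recurrent layers (or stacking one after the other).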
Dependencies
- TensorFlow >= 1
- Keras >= 2.0
- numpy
- librosa 0.4
- scipy
- python_speech_features
Datasets
I have used the Interactive Emotional Dyadic Motion Capture (IEMOCAP) database. It is an acted, multimodal, and multispeaker database.
How to label/format the dataset
The IEMOCAP database is recorded in different sessions; each session contains a set of audio files and a corresponding evaluation script that lets us segment the dataset into emotional categories.
- Change to the src/utils directory and set your IEMOCAP dataset path and output path in utils.py
cd src/utils/
python utils.py
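The labeling step relies on pairing each utterance ID with the emotion label found in the session's evaluation script. A hypothetical parser is sketched below; the line format is an assumption modeled on IEMOCAP's EmoEvaluation files, not necessarily what utils.py does internally.

```python
import re

# Assumed format of an IEMOCAP EmoEvaluation entry, e.g.:
# [6.2901 - 8.2357]\tSes01F_impro01_F000\tneu\t[2.5000, 2.5000, 2.5000]
LINE_RE = re.compile(
    r"\[(?P<start>\d+\.\d+) - (?P<end>\d+\.\d+)\]\t"
    r"(?P<utt_id>\S+)\t(?P<label>\w+)"
)

def parse_eval_lines(lines):
    """Map each utterance ID to its (emotion label, start time, end time)."""
    segments = {}
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            segments[m.group("utt_id")] = (
                m.group("label"),
                float(m.group("start")),
                float(m.group("end")),
            )
    return segments
```

With such a mapping in hand, the audio can be cut into per-utterance segments grouped by emotional category.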
- Split the dataset into train and test sets
python split.py
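split.py's internals aren't shown here, but a generic random split over the labeled segments could be sketched as follows; the 80/20 ratio and fixed seed are illustrative defaults, not the script's actual values.

```python
import random

def train_test_split(items, test_ratio=0.2, seed=42):
    """Shuffle a list of (path, label) pairs and split it into train/test.

    The 0.2 test ratio and seed are illustrative assumptions, not the
    values used by split.py.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_ratio)
    return items[n_test:], items[:n_test]
```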
Feature Extraction and Training
- Extract acoustic features (spectrograms and MFCCs) and train the models.
# CNN model
cd src/cnn/
python cnn_feat_extract.py
python cnn_train.py
# LSTM model
cd src/rnn/
python rnn_feat.py
python cnn_lstm_model.py
# CNN_LSTM
cd src/cnn_lstm
python feature_extract.py
python rnn_train.py
# CTC model
cd src/ctc/
python utils/util.py
python ctc_train.py