Speech Emotion Recognition

A web application that uses a machine learning model to recognize the emotion in a selected audio file.

Description

This project is a part of the final Data Mining project for ITC Fellow Program 2020.

Datasets used in this project

Crowd-sourced Emotional Multimodal Actors Dataset (Crema-D)
Ryerson Audio-Visual Database of Emotional Speech and Song (Ravdess: https://zenodo.org/record/1188976#.YDU_i-hvbIX)
Surrey Audio-Visual Expressed Emotion (Savee: http://kahlan.eps.surrey.ac.uk/savee/Download.html)
Toronto emotional speech set (Tess: https://tspace.library.utoronto.ca/handle/1807/24487)

Digital signal processing is an active field of research, and over the past decade researchers have developed a variety of approaches to speech emotion recognition (SER).

Typically, the SER task is divided into two main stages: feature selection and classification. Selecting discriminative features and a classification method that correctly recognizes the emotional state of the speaker is a challenging task.

Our project pipeline

Most researchers now apply deep learning techniques to SER, using Mel-scale filter bank speech spectrograms as the input feature. A spectrogram is a 2-D representation of a speech signal that is widely used with convolutional neural networks (CNNs) to extract salient and discriminative features. Similarly, transfer learning can be applied to SER by passing speech spectrograms through pre-trained CNN models such as VGG, DenseNet or AlexNet.
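To make the spectrogram input concrete, here is a minimal NumPy-only sketch of a log Mel-scale filter bank spectrogram. It is not the code used in this project: the function names and the parameters (n_fft=512, hop=256, n_mels=40) are assumptions chosen for illustration, and a library such as librosa would normally be used instead.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style Hz -> mel conversion.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    # n_mels + 2 points equally spaced on the mel scale give the filter edges.
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        left, center, right = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40):
    """Return a (n_mels, n_frames) log-power mel spectrogram in dB."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2          # (n_frames, n_fft // 2 + 1)
    mel_power = power @ mel_filterbank(sr, n_fft, n_mels).T   # (n_frames, n_mels)
    return 10.0 * np.log10(mel_power + 1e-10).T

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr                      # one second of audio
    tone = np.sin(2 * np.pi * 440.0 * t)        # synthetic 440 Hz test tone
    spec = log_mel_spectrogram(tone, sr)
    print(spec.shape)
```

The resulting 2-D array can be treated like a single-channel image, which is what allows image-oriented CNNs to be reused on audio.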

Mel-Frequency Cepstral Coefficients (MFCCs), which are obtained by transforming the audio signal into a compact representation of its short-term power spectrum, are also considered an important feature for SER.
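In a typical MFCC computation, the last step is a discrete cosine transform (DCT-II) applied to the log mel energies of each frame. The sketch below shows only that step, assuming the log mel energies are already available as a (n_mels, n_frames) array. The function names and the choice of 13 coefficients are illustrative assumptions, not this project's settings.

```python
import numpy as np

def dct2_matrix(n_out, n_in):
    """First n_out rows of the orthonormal DCT-II basis, shape (n_out, n_in)."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))
    basis *= np.sqrt(2.0 / n_in)
    basis[0] /= np.sqrt(2.0)   # extra scaling for the k = 0 row
    return basis

def mfcc_from_log_mel(log_mel, n_mfcc=13):
    """Map log mel energies (n_mels, n_frames) to MFCCs (n_mfcc, n_frames)."""
    log_mel = np.asarray(log_mel, dtype=float)
    return dct2_matrix(n_mfcc, log_mel.shape[0]) @ log_mel

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_log_mel = rng.normal(size=(40, 61))   # placeholder input, not real audio
    print(mfcc_from_log_mel(fake_log_mel).shape)
```

Keeping only the first few coefficients retains the smooth spectral envelope and discards fine detail, which is why MFCC matrices are much smaller than the spectrograms they come from.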

The Mel scale is important because it approximates human perception of sound better than linear frequency scales do. In source-filter theory, "the source is the vocal cords and the filter represents the vocal tract." The length and shape of the vocal tract determine how sound is produced by a speaker, and the cepstrum can describe the filter.
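As a small illustration of why the Mel scale differs from a linear scale, the sketch below uses one widely used Hz-to-mel formula (the HTK-style variant; other variants exist, and this README does not say which one the project relies on).

```python
import math

def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

if __name__ == "__main__":
    # Equal 1000 Hz steps cover fewer and fewer mels as frequency rises.
    for low in (0, 1000, 3000, 7000):
        step = hz_to_mel(low + 1000) - hz_to_mel(low)
        print(f"{low:>5} Hz -> {low + 1000:>5} Hz spans {step:7.1f} mel")
```

The shrinking step size at higher frequencies mirrors the way listeners distinguish low-frequency changes more easily than high-frequency ones.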

In our project we combined two models: a pretrained DenseNet for mel-spectrograms and a CNN for MFCCs.
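This README does not describe how the two models are combined. One simple option, shown here purely as an assumption and not as the project's actual method, is late fusion: each model predicts class probabilities and the two predictions are averaged. The label set, the probabilities and the late_fusion helper below are all hypothetical.

```python
import numpy as np

EMOTIONS = ["angry", "happy", "neutral", "sad"]   # hypothetical label set

def late_fusion(probs_a, probs_b, weight_a=0.5):
    """Weighted average of two probability vectors, renormalized to sum to 1."""
    probs_a = np.asarray(probs_a, dtype=float)
    probs_b = np.asarray(probs_b, dtype=float)
    fused = weight_a * probs_a + (1.0 - weight_a) * probs_b
    return fused / fused.sum(axis=-1, keepdims=True)

if __name__ == "__main__":
    # Made-up outputs standing in for the spectrogram model and the MFCC model.
    spectrogram_probs = [0.10, 0.60, 0.20, 0.10]
    mfcc_probs = [0.05, 0.35, 0.50, 0.10]
    fused = late_fusion(spectrogram_probs, mfcc_probs)
    print(EMOTIONS[int(fused.argmax())], fused.round(3))
```

Another common design is to concatenate the feature vectors of both networks and train a shared classification head on top, which lets the model learn how much to trust each input.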

Installation

It is recommended to use the provided requirements.txt file to set up your virtual environment.

To install the app, run these commands:

!git clone https://github.com/CyberMaryVer/speech-emotion-webapp.git
!cd speech-emotion-webapp
!python -m virtualenv your_venv
!your_venv/Scripts/activate
!pip install -r requirements.txt

After that, you can run the app:

!streamlit run api-test.py

Usage

Example of an execution:

Our app

Check out our app: http://34.217.207.244:8501/

Our Medium article

Check out our Medium article about this subject: https://talbaram3192.medium.com/classifying-emotions-using-audio-recordings-and-python-434e748a95eb

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Team


Maria Startseva	Tal Baram	Asher

License

The Speech Emotion Recognition project is released under the MIT License.
