
Based on speech recognition techniques, this code loads audio, preprocesses it, and builds a dataset of MFCCs (Mel Frequency Cepstral Coefficients).


acen20/MFCC-Feature-Extraction


Output

For each sound signal, the script generates 44 samples (rows) of 13 MFCCs (columns), plus a Target column holding the label, for 14 columns in total. See dataset.csv for the shape of the final data.
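The row count can be sanity-checked with a little arithmetic. The sketch below assumes a clip of about one second at 22050 Hz and librosa-style centered framing (one frame per hop, plus one). Neither assumption is stated in this repository, and the function name is hypothetical.

```python
# Assumptions (not confirmed by the repository): a ~1 second clip and
# centered framing, where the signal is padded so that frame i is centered
# on sample i * hop_length.
SAMPLE_RATE = 22050  # Hz
HOP_LENGTH = 512     # samples between consecutive frames
N_MFCC = 13          # coefficients per frame


def centered_frame_count(num_samples: int, hop_length: int) -> int:
    """Number of frames when frames are centered on multiples of hop_length."""
    return 1 + num_samples // hop_length


rows = centered_frame_count(SAMPLE_RATE * 1, HOP_LENGTH)  # assumed 1 s of audio
columns = N_MFCC + 1  # 13 MFCC columns plus the Target (label) column

print(rows, columns)  # 44 14
```

Under these assumptions, longer or shorter clips would give a different number of rows, so the fixed count of 44 suggests the clips are trimmed or padded to a common length.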

REVIEW THE GRAPHS BELOW

Quick guide

  1. For each type of sound, create a folder inside the audio/ directory.
  2. For an example, explore the audio folder in this repository, which contains one sample audio file.
  3. Once the folders for your sounds are in place,
    put this script in the same location it occupies in this repository.
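The folder layout above implies that each subfolder name acts as the label for the files inside it. A minimal sketch of how such a layout could be walked follows. The function name and the list of audio extensions are hypothetical and not taken from the repository's script.

```python
from pathlib import Path

# Assumed set of audio extensions; adjust to whatever your files use.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg"}


def list_labelled_files(audio_root):
    """Return (label, path) pairs, using each subfolder name as the label."""
    root = Path(audio_root)
    pairs = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for file_path in sorted(class_dir.iterdir()):
            # Skip anything that is not an audio file (notes, hidden files, ...).
            if file_path.is_file() and file_path.suffix.lower() in AUDIO_EXTENSIONS:
                pairs.append((class_dir.name, file_path))
    return pairs
```

Calling `list_labelled_files("audio")` on a tree such as `audio/dog/bark.wav` and `audio/cat/meow.wav` would yield one pair per file, labelled `dog` or `cat`.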

The signal goes through the following sequence of transformations before the MFCCs are generated:

Waveform

[Figure: Waveform]

Fourier Transform

[Figure: FFT]

Power Spectrum

[Figure: Power Spec]

Spectrogram

[Figure: spec]

Log Spectrogram

[Figure: log spec]

MFCCs (Mel Frequency Cepstral Coefficients)

[Figure: MFCCs]
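The first stages above (waveform, Fourier transform, power spectrum, spectrogram, log spectrogram) can be illustrated with a toy example using only the standard library. This is a sketch under stated assumptions, not the repository's implementation. It uses a naive DFT, a Hann window, no padding, and deliberately small sizes (64-sample frames, hop of 16) instead of the 2048 and 512 used by the script. The final step, a mel filterbank followed by a DCT to obtain the MFCCs, is omitted.

```python
import cmath
import math


def dft_power(frame):
    """Power spectrum |X[k]|^2 of one frame, non-negative frequencies only."""
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(x_k) ** 2)
    return power


def log_spectrogram(signal, n_fft, hop_length):
    """Slide a Hann window over the signal; return one dB spectrum per frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / n_fft) for t in range(n_fft)]
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop_length):
        frame = [signal[start + t] * window[t] for t in range(n_fft)]
        # A small offset avoids log10(0) for bins with no energy.
        frames.append([10 * math.log10(p + 1e-10) for p in dft_power(frame)])
    return frames


# Toy waveform: a 128 Hz sine sampled at 1024 Hz (illustrative values only).
SAMPLE_RATE = 1024
N_FFT = 64
HOP_LENGTH = 16
TONE_HZ = 128
signal = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(256)]

spec = log_spectrogram(signal, N_FFT, HOP_LENGTH)
# The tone should land in bin TONE_HZ * N_FFT / SAMPLE_RATE = 8.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin)  # 13 33 8
```

Each row of `spec` is one time frame and each column one frequency bin, which is the layout the spectrogram figures visualize.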

Key points

  1. The code rejects audio signals with a sample rate lower than 22050 Hz.
  2. The number of MFCCs is 13.
  3. The hop length across the signal is 512 samples.
  4. The FFT size (number of samples per Fourier transform window) is 2048.

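The key points above can be collected as constants, together with a sample-rate check. The sketch below reads the rate from a WAV header using the standard `wave` module. The constant and function names are hypothetical, and the repository's script may load audio and apply the check differently.

```python
import wave

# Values taken from the key points; the names themselves are illustrative.
MIN_SAMPLE_RATE = 22050  # Hz; signals below this rate are rejected
N_MFCC = 13              # number of MFCCs per frame
HOP_LENGTH = 512         # samples between consecutive frames
N_FFT = 2048             # samples per FFT window


def wav_sample_rate(path):
    """Read the sample rate (Hz) from a WAV file header."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getframerate()


def is_accepted(path, min_rate=MIN_SAMPLE_RATE):
    """Return True if the file's sample rate is at least min_rate."""
    return wav_sample_rate(path) >= min_rate
```

A file recorded at 16000 Hz would be rejected by `is_accepted`, while one at 22050 Hz or 44100 Hz would pass.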