IET-MusicSpeechClassifier

This project builds several models that classify a given audio input as either music or speech. The dataset was obtained from Marsyas and contains 64 samples each of speech and music. The features listed below were extracted from the audio files using librosa, and scikit-learn was used to build the models. The model parameters were tuned to improve the classification results.
Dataset used: "http://marsyas.info/downloads/datasets.html".
Research Paper Referred: "https://link.springer.com/article/10.1155/2009/239892".

Features extracted:

Standard deviation of energy.
Mean value and standard deviation of difference energy.
Standard deviation of autocorrelation.
Standard deviation of autocorrelation difference.
Mean and standard deviation of the difference of the 9th, 7th, and 4th Mel-frequency cepstral coefficients (MFCCs).
Low short-time energy ratio.
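
As a rough illustration of the energy-based features in the list above, the sketch below computes them with numpy alone. It is not the project's extraction code (the project used librosa). The frame length of 1024 samples, the hop of 512 samples, the 0.5 threshold for the low short-time energy ratio, and the function names frame_energies and energy_features are all assumptions made for this example. The autocorrelation and MFCC features are not covered here.

```python
import numpy as np


def frame_energies(signal, frame_len=1024, hop=512):
    """Return the short-time energy (mean squared sample) of each frame.

    frame_len and hop are illustrative defaults, not the project's values.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    if n_frames < 2:
        raise ValueError("signal must span at least two frames")
    return np.array([
        np.mean(signal[i * hop : i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])


def energy_features(signal, frame_len=1024, hop=512, low_threshold=0.5):
    """Compute the energy-based features for one audio signal.

    low_threshold is an assumed fraction of the mean energy below which
    a frame counts as "low energy".
    """
    energy = frame_energies(signal, frame_len, hop)
    diff = np.diff(energy)  # frame-to-frame change in energy
    low = energy < low_threshold * np.mean(energy)
    return {
        "energy_std": float(np.std(energy)),
        "diff_energy_mean": float(np.mean(diff)),
        "diff_energy_std": float(np.std(diff)),
        "low_energy_ratio": float(np.mean(low)),  # fraction of low-energy frames
    }
```

Speech tends to alternate between voiced segments and pauses, so a larger share of its frames usually falls below the mean energy than is the case for music. That is the intuition behind using the low short-time energy ratio as a feature.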

Classification Models

K-Nearest Neighbour
Decision Tree
SVC (kernel: linear)
SVC (kernel: rbf)
Logistic Regression
Naive Bayes
Random Forest (ensemble)
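
The models above can be compared side by side with scikit-learn. The following is a minimal sketch, not the project's training script. The data is a synthetic placeholder from make_classification (128 samples and 12 columns, chosen only to resemble the dataset size and one plausible count of the listed features), the hyperparameters are library defaults rather than the tuned values, GaussianNB is an assumed choice of Naive Bayes variant, and the evaluate helper and 5-fold cross-validation are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic placeholder data, NOT the Marsyas features.
X, y = make_classification(
    n_samples=128, n_features=12, n_informative=6, random_state=0
)

# Default hyperparameters; the project's tuned values are not known here.
models = {
    "K-Nearest Neighbour": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SVC (linear)": SVC(kernel="linear"),
    "SVC (rbf)": SVC(kernel="rbf"),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),  # assumed variant
    "Random Forest": RandomForestClassifier(random_state=0),
}


def evaluate(models, X, y, folds=5):
    """Return the mean cross-validated accuracy for each model.

    Features are standardised inside the pipeline so that the scaler is
    fitted on the training folds only.
    """
    return {
        name: cross_val_score(
            make_pipeline(StandardScaler(), model), X, y, cv=folds
        ).mean()
        for name, model in models.items()
    }


scores = evaluate(models, X, y)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

With only 128 samples, cross-validation gives a steadier accuracy estimate than a single train/test split, which is why it is used in this sketch.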

Libraries and tools

numpy and pandas for array-related operations and data handling.
scikit-learn for built-in models.
librosa for audio feature extraction.
Spyder (IDE).

Project Members

Bhargav S (Mentor)
Skanda U
Rahul Gite
Abhishek Ranjan
