Classifying the genre of music using deep neural networks
Music genre classification is one of the branches of Music Information Retrieval (MIR).
A robust recommendation system begins with the categorization of music genres. Sound processing is also a large research area
in its own right, through which music therapy solutions can be developed for various medical or mental health issues.
There are various music applications such as Spotify, Google Play, and Apple Music, and one of the most important steps in building
them is classifying the genre of a piece of music. This requires audio processing, one of the more complex tasks involved: it covers
time-domain signal processing, time series analysis, spectrograms, spectral coefficients, and audio feature extraction to produce inputs for a neural network.
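To make this concrete, below is a minimal sketch of that kind of feature extraction using the librosa library. The file name blues.00000.wav is just a placeholder for any 30-second GTZAN clip.

```python
# A minimal feature-extraction sketch using librosa; the input path is a placeholder.
import librosa
import numpy as np

# Load the raw time signal (GTZAN clips are ~30 s; librosa resamples to 22,050 Hz mono by default)
signal, sr = librosa.load("blues.00000.wav", sr=22050)

# Mel spectrogram: a time-frequency "image" of the signal
mel = librosa.feature.melspectrogram(y=signal, sr=sr, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)

# MFCCs: compact spectral coefficients commonly fed to classifiers
mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)

print(mel_db.shape, mfcc.shape)
```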
The dataset used is GTZAN, often described as the MNIST of sounds.
It contains 1,000 audio files covering 10 genres, with 100 files of 30 seconds each per genre:
1. Blues
2. Classical
3. Country
4. Disco
5. Hip-hop
6. Jazz
7. Metal
8. Pop
9. Reggae
10. Rock
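For reference, a short sketch of how the clips might be collected into (path, label) pairs, assuming the commonly distributed genres_original/<genre>/*.wav folder layout; the DATA_DIR path is only an assumption about where the dataset is stored locally.

```python
# Collect (file path, integer label) pairs, assuming the standard GTZAN folder layout.
import os

DATA_DIR = "genres_original"  # hypothetical local path to the GTZAN audio folders
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]
LABEL_OF = {genre: idx for idx, genre in enumerate(GENRES)}

samples = []
for genre in GENRES:
    genre_dir = os.path.join(DATA_DIR, genre)
    for fname in sorted(os.listdir(genre_dir)):
        if fname.endswith(".wav"):
            samples.append((os.path.join(genre_dir, fname), LABEL_OF[genre]))

print(len(samples))  # expected: 1000
```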
Each audio file can be given a visual representation, such as a spectrogram. This makes neural networks a natural classification technique, since they typically take some form of image-like input.
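As an illustration, here is a minimal Keras CNN sketch that treats a mel-spectrogram patch as a single-channel image. The input shape (128, 130, 1) is an assumed size for illustration only; it depends on the clip duration and spectrogram parameters actually used.

```python
# A small CNN over mel-spectrogram "images"; the input shape is an assumption.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 130, 1)),
    layers.Conv2D(16, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),  # 10 GTZAN genres
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()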
The dataset also includes files containing pre-extracted audio features. In one file, each 30-second song has the mean and variance computed across several features extracted from the audio. In the other file, the songs are split into 3-second segments, with the same features computed per segment in the same format.
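A small sketch of loading one of those feature tables with pandas follows; the file name features_3_sec.csv and the "filename"/"label" column names match the commonly distributed GTZAN CSVs and may differ in other copies of the dataset.

```python
# Load pre-extracted features and split them for training; file/column names are assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("features_3_sec.csv")

X = df.drop(columns=["filename", "label"]).values  # numeric feature columns
y = df["label"].values                             # genre name per segment

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

print(X_train.shape, X_test.shape)
```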