CRNN model for keyword spotting

This repository implements some of the methods I applied in the TensorFlow Speech Recognition Challenge, held by Google Brain on Kaggle.com.

Model Structure

The goal of the challenge is to spot simple keywords occurring in voice recordings, a task known as keyword spotting in the voice / speech recognition area. Inspired by this paper and by some articles from a participant in BirdCLEF 2016, I transformed the audio sources into mel-spectrogram images and constructed a network with a CNN top that extracts features and an RNN tail that models the time relation (though a pure CNN works about as well as the CRNN method). I used InceptionNet-like and ResNet-like networks to construct the CNN top, and a GRU for the recurrent part. The implementation here ended up at 0.87 on the private leaderboard as a single model without ensembling. The best ensemble result I had was 0.89 on the private leaderboard, using a group of networks with similar architectures.
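The CNN-top / RNN-tail idea described above can be sketched roughly as follows. This is a minimal illustration, not the repository's model: it swaps the InceptionNet-like and ResNet-like blocks for a plain convolution stack, and the function name `build_crnn`, the input size, the filter counts, and the class count are all assumptions chosen for the example.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers


def build_crnn(n_mels=64, n_frames=32, n_classes=12, filters=(16, 32), gru_units=64):
    """Small CRNN sketch: CNN over a mel-spectrogram image, GRU over time.

    All sizes are illustrative assumptions, not values from the repository.
    """
    # Input layout is (frequency, time, channel).
    inputs = layers.Input(shape=(n_mels, n_frames, 1))
    x = inputs
    for f in filters:
        x = layers.Conv2D(f, 3, padding="same", activation="relu")(x)
        # Pool along frequency only, so the number of time steps is preserved.
        x = layers.MaxPooling2D(pool_size=(2, 1))(x)
    # Reorder to (time, frequency, channels) so time becomes the sequence axis.
    x = layers.Permute((2, 1, 3))(x)
    # Merge frequency and channels into one feature vector per time step.
    freq_out = n_mels // (2 ** len(filters))
    x = layers.Reshape((n_frames, freq_out * filters[-1]))(x)
    # The GRU summarizes the sequence of per-frame features.
    x = layers.GRU(gru_units)(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)
```

Calling `build_crnn()` and passing a batch shaped `(batch, 64, 32, 1)` yields one probability vector per clip. Pooling only along the frequency axis is a design choice of this sketch: it keeps every time step available to the recurrent part.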

Prerequisites

python>=3.5
tensorflow>=1.8 (needed for some newer features used during the refactor)
keras>=2.1
librosa>=0.5.1
opencv

Usage

Data preprocessing

I used librosa for the audio-to-spectrogram transformation and opencv to handle the image data. Set the data root in config.py to the directory that holds your data, then run data_preprocessor.py (the script appears as data_proecesser.py in the repository file listing).

Training with keras

keras_train.py trains with the Keras pipeline, which is simple and straightforward. You can configure the training hyperparameters directly from the command line.
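Command-line configuration of this kind is commonly done with argparse. The sketch below shows the general pattern only; the flag names and defaults are hypothetical and are not the actual options of keras_train.py, so check the script itself for the real ones.

```python
import argparse


def parse_args(argv=None):
    """Parse training hyperparameters from the command line.

    Every flag here is a hypothetical example, not an option of keras_train.py.
    """
    parser = argparse.ArgumentParser(description="Train a keyword-spotting model")
    parser.add_argument("--batch-size", type=int, default=64,
                        help="number of examples per training step")
    parser.add_argument("--epochs", type=int, default=30,
                        help="number of passes over the training set")
    parser.add_argument("--learning-rate", type=float, default=1e-3,
                        help="initial optimizer learning rate")
    # argv=None makes argparse read sys.argv; a list can be passed for testing.
    return parser.parse_args(argv)
```

For example, `parse_args(["--epochs", "5"])` returns a namespace with `epochs` set to 5 and the other values left at their defaults.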

Training with tensorflow

tensorflow_training.py uses tf.data to consume data and the tf.estimator.Estimator API to construct and train models. It reuses the model construction from the Keras code directly, but using TensorFlow makes it easier to customize the loss function (making multi-task training possible), and the natively parallelized data input pipeline provides higher GPU utilization and efficiency. As with the Keras training script, the hyperparameters are configurable through command-line arguments.
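The two advantages mentioned above, a parallel input pipeline and a custom loss, can be sketched as follows. This is an illustration under assumptions rather than the repository's code: it assumes TensorFlow 2.x with eager execution, leaves out the Estimator wiring, and the names `make_dataset` and `multi_task_loss`, the auxiliary task, and the weight 0.3 are all hypothetical.

```python
import numpy as np
import tensorflow as tf


def make_dataset(images, labels, batch_size=32, training=True):
    """Build a tf.data pipeline over in-memory arrays (illustrative only)."""
    ds = tf.data.Dataset.from_tensor_slices((images, labels))
    if training:
        ds = ds.shuffle(buffer_size=len(images))
    # Run per-example preprocessing in parallel.
    ds = ds.map(
        lambda x, y: (tf.cast(x, tf.float32) / 255.0, y),
        num_parallel_calls=4,
    )
    # prefetch lets input preparation overlap with the training step.
    return ds.batch(batch_size).prefetch(1)


def multi_task_loss(word_labels, word_logits, aux_labels, aux_logits, aux_weight=0.3):
    """Weighted sum of two cross-entropy terms.

    The auxiliary task and its weight are hypothetical examples.
    """
    word = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=word_labels, logits=word_logits)
    aux = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=aux_labels, logits=aux_logits)
    return tf.reduce_mean(word) + aux_weight * tf.reduce_mean(aux)
```

The dataset yields `(image_batch, label_batch)` pairs with pixel values scaled to the range 0 to 1, and the loss function returns a single scalar that an optimizer can minimize.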

