This project focuses on the automatic speech recognition (ASR) task, specifically speech transcription, using a deep neural network (DNN) architecture. The model was trained and tested on 10% of train-clean-100 and test-clean from LibriSpeech. The implementation is based on the AssemblyAI tutorial on end-to-end (E2E) speech recognition systems.
This project employs a CRNN structure with convolutional and GRU blocks to process the input spectrogram. The model outputs the prediction probabilities of the letters at each time step.
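To make the data flow concrete, here is a minimal PyTorch sketch of a CRNN of this general shape: one convolutional block, a bidirectional GRU, and a per-time-step classifier over characters. It is not the code in this repository. The class name `TinyCRNN`, the layer sizes, the 64 mel bins, and the 29 output classes are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class TinyCRNN(nn.Module):
    """Illustrative CRNN (hypothetical, not the project's model)."""

    def __init__(self, n_mels=64, n_classes=29, hidden=128):
        super().__init__()
        # Convolutional block: stride 2 halves both the frequency and time axes.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
        )
        # Frequency bins remaining after the strided convolution.
        freq_out = (n_mels - 1) // 2 + 1
        # Recurrent block: reads the conv features one time step at a time.
        self.gru = nn.GRU(32 * freq_out, hidden,
                          batch_first=True, bidirectional=True)
        # Maps each time step to scores over the assumed character set.
        self.classifier = nn.Linear(2 * hidden, n_classes)

    def forward(self, spec):
        # spec: (batch, 1, n_mels, time)
        x = self.conv(spec)                    # (batch, 32, freq_out, time_out)
        b, c, f, t = x.shape
        # Move time to axis 1 and flatten channels and frequency into features.
        x = x.permute(0, 3, 1, 2).reshape(b, t, c * f)
        x, _ = self.gru(x)                     # (batch, time_out, 2 * hidden)
        # Log-probabilities over characters at every time step.
        return torch.log_softmax(self.classifier(x), dim=-1)


model = TinyCRNN()
model.eval()
with torch.no_grad():
    # Two random inputs standing in for spectrograms with 100 frames each.
    log_probs = model(torch.randn(2, 1, 64, 100))
print(log_probs.shape)  # torch.Size([2, 50, 29])
```

The output has one row of character log-probabilities per (downsampled) time step, which is the form of output described above. How those per-step predictions are turned into text depends on the loss and decoding scheme used in training.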
To run the code, you need Python, PyTorch, and NumPy.
asr_main.py contains the training loop and the testing stage of the speech transcription model.
- Diep Luong
- Fareeda Mohammad