Implementation of "An open-source end-to-end ASR system for Brazilian Portuguese using DNNs built from newly assembled corpora" by Igor Quintanilha, Luiz Wagner Pereira Biscainho, and Sergio Lima Netto (submitted).
- pytorch >= 1.0.1
- cudatoolkit >= 9.0
- torchvision
- torchaudio
- ignite
- pyyaml
- wget
- num2words
- unidecode
- editdistance
- ctcdecode
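
A quick way to confirm that the Python packages listed above are installed is to check whether each one can be located by the import system. The snippet below is a minimal sketch, not part of the repository: the mapping from package names to import names (for example `pytorch` to `torch` and `pyyaml` to `yaml`) and the helper `find_missing` are assumptions made for illustration, and `cudatoolkit` is left out because it is not a Python module.

```python
import importlib.util

# Assumed mapping from the package names listed above to their import names.
# cudatoolkit is omitted because it is not importable from Python.
REQUIRED_MODULES = {
    "pytorch": "torch",
    "torchvision": "torchvision",
    "torchaudio": "torchaudio",
    "ignite": "ignite",
    "pyyaml": "yaml",
    "wget": "wget",
    "num2words": "num2words",
    "unidecode": "unidecode",
    "editdistance": "editdistance",
    "ctcdecode": "ctcdecode",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the package names whose import module cannot be located."""
    missing = []
    for package, module in modules.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```
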
All datasets can be found here.
Acoustic model (AM) | Trained on | Method | WER | Download |
---|---|---|---|---|
DeepSpeech 2 | BRSD v2 | Scratch | 52.55% (2.42%) | Link |
DeepSpeech 2 | BRSD v2 | Fine-tuned | 47.41% (1.73%) | Link |
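
For reference, word error rate (WER) is the word-level edit distance between hypothesis and reference divided by the number of reference words. The sketch below illustrates that definition in plain Python so it runs without extra packages; it is not the repository's evaluation code, the function names `edit_distance` and `wer` are hypothetical, and the corpus-level aggregation (total errors over total reference words) is an assumption that may differ from how the figures in the table were computed.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Corpus-level WER: total word errors divided by total reference words."""
    total_errors = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        total_errors += edit_distance(ref_words, hyp.split())
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_errors / total_words


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(f"{wer(['o gato preto dorme'], ['o gato preso dorme']):.2%}")
```
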
Language model* | RP | Size | LapsBM | BRTD |
---|---|---|---|---|
word 3-gram | 25 | 1.9G | 173.79 | 161.29 |
word 5-gram | 42 | 7.8G | 136.50 | 135.12 |
char 5-gram | 5 | 41M | <=2,334.48 | <=2,694.51 |
char 10-gram | 10 | 4.7G | <=271.86 | <=323.71 |
char 15-gram* | 15 | 5.4G | <=239.59 | <=198.49 |
char 20-gram* | 20 | 8.8G | <=227.84 | <=189.53 |
*All models were trained using KenLM. See the paper for more detailed information.
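
The LapsBM and BRTD columns above report perplexity. As a reminder of the definition, perplexity is the inverse geometric mean of the per-token probabilities; since KenLM reports base-10 log probabilities, it can be written as 10 raised to the negative average log10 probability. The snippet below is a minimal sketch of that formula only: the function name `perplexity` is hypothetical, the input is assumed to be a list of per-token log10 probabilities obtained elsewhere, and it does not reproduce how the paper derives the upper bounds (`<=`) shown for the character-level models.

```python
import math


def perplexity(log10_probs):
    """Perplexity from per-token base-10 log probabilities."""
    if not log10_probs:
        raise ValueError("need at least one token probability")
    avg_log10 = sum(log10_probs) / len(log10_probs)
    return 10.0 ** (-avg_log10)


if __name__ == "__main__":
    # Four tokens, each with probability 0.25: the model is as uncertain
    # as a uniform choice among four options.
    scores = [math.log10(0.25)] * 4
    print(round(perplexity(scores), 6))
```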