Official implementation of the Interspeech 2024 paper "Lightweight Transducer Based on Frame Level Criterion".
- Python 3
- Install the dependencies: `pip install -r requirements.txt`
- (Optional) Refer to https://kheafield.com/code/kenlm/ for instructions on installing KenLM.
Download AISHELL-1 and extract it to the directory `data/aishell`.
python data/aishell/get_csv.py
python data/aishell/get_vocab.py
python train.py
python avg_model.py
python test.py
Training one epoch takes about five minutes on a single RTX 4090 GPU with an i9-13900K CPU.
Testset | Sub (%) | Del (%) | Ins (%) | CER (%) |
---|---|---|---|---|
dev | 3.79 | 0.10 | 0.07 | 3.96 |
test | 4.10 | 0.16 | 0.05 | 4.31 |
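
The CER column is the sum of the substitution, deletion, and insertion rates, each taken relative to the number of reference characters. The sketch below shows one way such a breakdown can be computed with a character-level edit-distance alignment. It is an illustration only, not the scoring code used by `test.py`; the function names `error_counts` and `cer_breakdown` are hypothetical, and ties between equally cheap alignments are broken arbitrarily.

```python
def error_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) for one minimum-edit alignment."""
    # dp[i][j] = (total_errors, sub, del, ins) for aligning ref[:i] with hyp[:j]
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # every reference character deleted
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # every hypothesis character inserted
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # match, no new error
                continue
            c, s, d, n = dp[i - 1][j - 1]
            substitution = (c + 1, s + 1, d, n)
            c, s, d, n = dp[i - 1][j]
            deletion = (c + 1, s, d + 1, n)
            c, s, d, n = dp[i][j - 1]
            insertion = (c + 1, s, d, n + 1)
            # Tuples compare by total error count first
            dp[i][j] = min(substitution, deletion, insertion)
    _, s, d, n = dp[-1][-1]
    return s, d, n


def cer_breakdown(pairs):
    """Aggregate error rates (in %) over (reference, hypothesis) string pairs."""
    total_s = total_d = total_i = total_ref = 0
    for ref, hyp in pairs:
        s, d, n = error_counts(ref, hyp)
        total_s, total_d, total_i = total_s + s, total_d + d, total_i + n
        total_ref += len(ref)
    scale = 100.0 / total_ref
    return {
        "sub": total_s * scale,
        "del": total_d * scale,
        "ins": total_i * scale,
        "cer": (total_s + total_d + total_i) * scale,
    }


if __name__ == "__main__":
    # Toy data, not taken from AISHELL-1
    print(cer_breakdown([("abcd", "abxd"), ("abcd", "abd")]))
```

Rates are normalized by the total reference length across the whole set, not averaged per utterance, so long utterances weigh more than short ones.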
Download resource_aishell and extract it to the directory `data/aishell`.
python data/aishell/get_text.py
../kenlm/build/bin/lmplz -o 3 --text data/aishell/aishell_train.txt --arpa data/aishell/aishell_train.arpa -S 10% --interpolate_unigrams 0
python rescore.py
Testset | Sub (%) | Del (%) | Ins (%) | CER (%) |
---|---|---|---|---|
dev | 3.61 | 0.10 | 0.07 | 3.78 |
test | 3.82 | 0.16 | 0.04 | 4.03 |
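
As a rough illustration of what language-model rescoring does in general, the sketch below re-ranks an n-best list by adding a weighted LM log-probability to each hypothesis's ASR score. It is a generic sketch and may differ from what `rescore.py` actually implements. The function `rescore_nbest`, the weight `lm_weight`, and the toy scores are all hypothetical. In practice the toy scorer would be replaced by a real n-gram model, for example one loaded through the KenLM Python bindings from the `.arpa` file built above.

```python
def rescore_nbest(nbest, lm_score, lm_weight=0.3):
    """Return the hypothesis text with the best combined score.

    nbest     -- list of (text, asr_log_prob) pairs from the recognizer
    lm_score  -- callable mapping text to an LM log-probability
    lm_weight -- hypothetical interpolation weight, normally tuned on dev data
    """
    def combined(item):
        text, asr_log_prob = item
        return asr_log_prob + lm_weight * lm_score(text)

    return max(nbest, key=combined)[0]


# Toy stand-in for an n-gram LM; the log-probabilities are made up.
TOY_LM = {"今天天气很好": -4.0, "今天天汽很好": -9.0}


def toy_lm_score(text):
    # Unseen strings get a low, arbitrary log-probability
    return TOY_LM.get(text, -20.0)


# The recognizer slightly prefers the hypothesis with the wrong character.
nbest = [("今天天汽很好", -1.0), ("今天天气很好", -1.5)]
best = rescore_nbest(nbest, toy_lm_score)
```

With these toy numbers the combined scores are -3.7 and -2.7, so the LM tips the decision toward the second hypothesis. With `lm_weight=0` the original ASR ranking is kept.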