
Use an n-gram LM to rescore the lattice from fast_beam_search. #365

Closed
wants to merge 3 commits

Conversation

@csukuangfj (Collaborator)

This PR adds two more decoding methods:

  • fast_beam_search_nbest, similar to fast_beam_search, but it uses k2.random_paths() to sample n paths from the lattice instead of using k2.shortest_path()
  • fast_beam_search_with_nbest_rescoring: It uses an n-gram LM to rescore the lattice obtained from fast_beam_search. However, it does not seem to be helpful.
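
A minimal sketch of the sampling-and-deduplication step behind both methods, following icefall's usual n-best pattern (the function name and defaults here are illustrative, not necessarily this PR's exact code):

```python
import k2


def sample_unique_paths(lattice: k2.Fsa, num_paths: int = 200) -> k2.RaggedTensor:
    # Illustrative sketch, not the PR's exact code.
    # Sample paths from the lattice; `paths` is a ragged tensor of
    # arc indices into `lattice`, one sub-list per path.
    paths = k2.random_paths(lattice, use_double_scores=True, num_paths=num_paths)

    # Map arc indices to labels, then drop epsilons (0) and the
    # final-arc marker (-1).
    labels = k2.ragged.index(lattice.labels.contiguous(), paths)
    labels = labels.remove_values_leq(0)

    # Keep only distinct label sequences; duplicates would be
    # rescored identically anyway.
    unique_paths, _, _ = labels.unique(
        need_num_repeats=False, need_new2old_indexes=False
    )
    return unique_paths
```

WERs on test-clean for two sweeps of ngram_lm_scale: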
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.1  2.15    best for test-clean
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.2  2.41
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.3  2.61
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.4  2.77
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.5  2.9
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.6  2.96
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.7  3.02
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.8  3.08
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.9  3.13
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_1.0  3.17
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_1.1  3.21
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_1.2  3.24
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_1.3  3.25
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_1.4  3.27
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_1.5  3.3

beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0    1.99    best for test-clean
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.01 2.0
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.02 2.01
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_-0.02        2.02
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.05 2.04
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_-0.05        2.05
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_-0.1 2.12
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_-0.2 2.27
beam_4.0_max_contexts_32_max_states_8_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_-0.3 2.54
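
For reference, a hedged reading of how ngram_lm_scale enters the rescoring, based on the hyperparameter names above and icefall's usual n-best rescoring pattern (names are illustrative, not necessarily this PR's exact code):

```python
import torch


def combine_scores(
    tot_scores: torch.Tensor,       # transducer score per unique path
    ngram_lm_scores: torch.Tensor,  # n-gram LM score per unique path
    ngram_lm_scale: float,
) -> torch.Tensor:
    # Illustrative sketch. ngram_lm_scale = 0 disables the n-gram
    # term and reduces to plain fast_beam_search_nbest, which is
    # exactly the best-performing row above.
    return tot_scores + ngram_lm_scale * ngram_lm_scores
```

Since the best WER occurs at ngram_lm_scale = 0, and both positive and negative scales hurt, the n-gram term adds no useful information in these runs.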

@danpovey (Collaborator)

Hm, interesting. Regarding the n-best stuff, we should try to figure out why your results seem to be different from Liyong's.
I wonder whether there might be something going wrong regarding epsilons somehow? I expect we would have to add epsilon self-loops to both the lattice and the language model for composition, since the lattice naturally has epsilons.

@csukuangfj (Collaborator, Author)

I expect we would have to add epsilon self-loops to both the lattice and the language model for composition, since the lattice naturally has epsilons.

I have also added epsilon self-loops to G and removed the epsilon self-loops from the rescored word_fsas after intersecting with G. The results are the same, i.e., not improved.
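
A hedged sketch of that epsilon handling (variable names are illustrative; icefall's actual code uses device-aware intersection helpers):

```python
import k2


def rescore_with_G(word_fsas: k2.Fsa, G: k2.Fsa) -> k2.Fsa:
    # Illustrative sketch, not the PR's exact code.
    # Give both operands epsilon self-loops so an epsilon arc on one
    # side can pair with a self-loop on the other during intersection.
    word_fsas = k2.arc_sort(k2.add_epsilon_self_loops(word_fsas))
    G = k2.arc_sort(k2.add_epsilon_self_loops(G))
    rescored = k2.intersect(G, word_fsas, treat_epsilons_specially=False)
    # Strip the self-loops again so they do not affect path scores.
    return k2.remove_epsilon_self_loops(k2.connect(rescored))
```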

@danpovey (Collaborator)

why does 'fast_beam_search_with_nbest_rescoring' have nbest in it if it is lattice based?

@danpovey (Collaborator)

Anyway the decoding method is cool. I looked briefly at the code and did not see any obvious problems.
Perhaps we can merge this and then Liyong can try various comparisons vs. his KenLM setup to try to debug this.

@csukuangfj (Collaborator, Author)

why does 'fast_beam_search_with_nbest_rescoring' have nbest in it if it is lattice based?

I am using nbest rescoring, i.e., extracting n paths from the lattice, deduplicating them, and then intersecting them with the given G.

That is why the decoding name contains nbest.

I am not intersecting the generated lattice with G directly, since the generated lattice is an acceptor containing token IDs, while G contains word IDs.

I think we can intersect the generated lattice with an LG graph instead of G.
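
A hedged sketch of building such an LG with k2 (it assumes an existing lexicon FST L with token IDs on labels and word IDs on aux_labels; the file names here are hypothetical):

```python
import torch
import k2

# Hypothetical file names; L maps token IDs -> word IDs,
# G is a word-level n-gram LM acceptor.
L = k2.Fsa.from_dict(torch.load("L.pt"))
G = k2.Fsa.from_dict(torch.load("G.pt"))

# LG accepts token sequences and carries word IDs on aux_labels,
# so the token-level lattice could be intersected with it directly.
LG = k2.compose(k2.arc_sort(L), k2.arc_sort(G))
LG = k2.arc_sort(k2.connect(LG))
```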

@glynpu (Collaborator) commented May 17, 2022

Liyong can try various comparisons vs. his KenLM setup to try to debug this.

Nice implementation. I will study this.

@danpovey (Collaborator)

One possibility: Liyong might simply be using a larger LM, since KenLM is a compact format?

@glynpu (Collaborator) commented May 17, 2022

One possibility: Liyong might simply be using a larger LM, since KenLM is a compact format?

I am using this one downloaded by torchaudio. https://github.com/pytorch/audio/blob/8fd60cc89fb0973c10b1c37ef77f0f22ddd47bd0/examples/asr/librispeech_ctc_decoder/inference.py#L19

After checking the config, I confirmed I was using a larger LM: 23 GB (mine) vs. 4.1 GB (downloaded from https://www.openslr.org/11/).
/ceph-data2/ly/kenlm/train_lm/train.arpa

@danpovey (Collaborator)

That is probably converted from the 4-gram.arpa.gz downloaded from here: https://www.openslr.org/11/. But I don't know if that is what fangjun is using?

@csukuangfj (Collaborator, Author)

That is probably converted from the 4-gram.arpa.gz downloaded from here: https://www.openslr.org/11/. But I don't know if that is what fangjun is using?

I have tried the 4-gram and 3-gram that are used by the conformer_ctc setup.

"3-gram.pruned.1e-7.arpa.gz",
"4-gram.arpa.gz",

@glynpu (Collaborator) commented May 18, 2022

Here is the larger ARPA LM I am using, which I trained myself.
You can try it if you want. @csukuangfj
/ceph-data2/ly/kenlm/train_lm/train.arpa

@ezerhouni mentioned this pull request on Jun 13, 2022
@ezerhouni (Collaborator)

@csukuangfj I am testing this branch on my machine and I am not seeing the same results. Could you tell me which model you are using (epoch and average)? Thank you!

@csukuangfj (Collaborator, Author)

@ezerhouni
Could you try https://huggingface.co/csukuangfj/icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13 ?

I just uploaded two new checkpoints to the exp directory.

[Screenshot: Screen Shot 2022-07-08 at 08 07 12]

@ezerhouni (Collaborator)

@csukuangfj I am not seeing them.
[Screenshot: Screenshot 2022-07-08 at 08 44 33]

The last commit is from 21 days ago (yesterday's commit only modifies the README).

@csukuangfj (Collaborator, Author) commented Jul 8, 2022

@ezerhouni

Sorry. Please check again.

[Screenshot: Screen Shot 2022-07-08 at 2 48 51 PM]

@ezerhouni (Collaborator)

@csukuangfj Thanks! I just tested it. I got the same result as yours for test-clean (i.e., ngram_lm_scale_0 is the best); for test-other I am getting slightly better results:

For test-other, WER of different settings are:
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.05	4.85	best for test-other
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.1	4.85
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.01	4.88
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.02	4.88
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0	4.89
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_-0.02	4.95
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_-0.05	5.02
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.3	5.02
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_-0.1	5.16
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.5	5.3
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.8	5.53
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_-0.2	5.6
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_1.0	5.61
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_1.5	5.76
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_2.5	5.89
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_3	5.94
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_-0.5	6.63

I will try to work on it a bit in the coming days (if I can find some spare time) and I will let you know if we can improve the results.

@csukuangfj (Collaborator, Author)

beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.01	4.88
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0.02	4.88
beam_4.0_max_contexts_8_max_states_32_num_paths_200_nbest_scale_0.5_temperature_1.0_ngram_lm_scale_0	4.89

Thanks!

@ezerhouni (Collaborator)

I think this PR can be closed

@csukuangfj csukuangfj closed this Jul 18, 2022
@csukuangfj csukuangfj deleted the rnnt-lm-rescoring branch July 28, 2023 02:40