Keyword-Spotting-Based-on-Low-Quality-Acoustic-Model-Output

Chinese only. Like me, do you want to build a Chinese speech keyword-spotting program, but have only a small amount of open Chinese speech data, too little to train a competitive end-to-end speech recognition or keyword-spotting model (CER around 30%)? Try this tool!

The principle is to convert Chinese characters into Pinyin and match the target keywords.
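A minimal sketch of why matching on Pinyin helps, assuming toneless Pinyin and a tiny hand-written lookup table. `TOY_PINYIN` and `to_pinyin` are hypothetical names used only for illustration and are not taken from `spotting.py`; a real program would use a full character-to-Pinyin library. A recognizer that outputs the wrong characters with the right sounds still yields the same Pinyin sequence as the keyword:

```python
# Hypothetical toy lookup table (toneless Pinyin), for illustration only.
TOY_PINYIN = {
    "你": "ni", "尼": "ni",
    "好": "hao", "号": "hao",
    "再": "zai", "在": "zai",
    "见": "jian", "建": "jian",
}


def to_pinyin(text):
    """Map each character to Pinyin; unknown characters pass through unchanged."""
    return [TOY_PINYIN.get(ch, ch) for ch in text]


keyword = "你好"
recognized = "尼号"  # wrong characters, same sounds

print(to_pinyin(keyword))                           # ['ni', 'hao']
print(to_pinyin(recognized))                        # ['ni', 'hao']
print(to_pinyin(keyword) == to_pinyin(recognized))  # True
```

Comparing characters directly would miss this keyword entirely, while the Pinyin comparison treats the two strings as identical.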

Input: the output string of your recognition/spotting program. Output: "keyword1;keyword1;keyword2;keyword4....."

I strongly recommend removing the language model from your speech recognition program so that the output string reflects the raw speech more faithfully. After all, if your acoustic model is only 70 percent accurate, KenLM can do little more than distort the raw pronunciation.

Usage

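demo keyword: 你好 (hello)；再见 (goodbye)；没错 (that's right)；

### code

```python
# Import the class by name; a bare `import spotting` would leave
# `keywordSpoter` undefined in this namespace.
from spotting import keywordSpoter

# The keyword argument is spelled as in the original snippet.
ss = keywordSpoter(["keyword1", "keyword2", "keyword3"], simithreashold=78)

# inputstr is the output string of your recognition/spotting program.
ss.getWord(inputstr)
```

### demo

```
$ python spotting.py
```

## parameters

`simithreshold`: defaults to 78. If the similarity between the Pinyin string of the input and the Pinyin string of a keyword is greater than `simithreshold`, the word is corrected to the target keyword and added to the output.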
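The threshold rule can be sketched as follows. This is an illustration under stated assumptions, not the code in `spotting.py`: the similarity score here comes from the standard library's `difflib.SequenceMatcher` scaled to 0-100 (the project's actual similarity function is not specified in this README), the input is scanned with a sliding window as long as each keyword, and `TOY_PINYIN`, `to_pinyin`, `similarity`, and `spot` are hypothetical names.

```python
from difflib import SequenceMatcher

# Hypothetical toy lookup table (toneless Pinyin), for illustration only.
TOY_PINYIN = {
    "你": "ni", "尼": "ni", "李": "li",
    "好": "hao", "号": "hao",
    "再": "zai", "在": "zai",
    "见": "jian", "建": "jian",
    "没": "mei", "错": "cuo",
}


def to_pinyin(text):
    """Return a space-separated Pinyin string; unknown characters pass through."""
    return " ".join(TOY_PINYIN.get(ch, ch) for ch in text)


def similarity(a, b):
    """Similarity of two strings on a 0-100 scale (assumed stand-in metric)."""
    return 100 * SequenceMatcher(None, a, b).ratio()


def spot(text, keywords, threshold=78):
    """Slide a keyword-sized window over text and collect fuzzy Pinyin matches."""
    hits = []
    for start in range(len(text)):
        for kw in keywords:
            window = text[start:start + len(kw)]
            if len(window) < len(kw):
                continue  # window runs past the end of the input
            # Strictly greater than the threshold, as the parameter text states.
            if similarity(to_pinyin(window), to_pinyin(kw)) > threshold:
                hits.append(kw)
    return ";".join(hits)


keywords = ["你好", "再见", "没错"]
print(spot("尼号啊在建", keywords))           # 你好;再见
print(spot("李好", ["你好"]))                 # 你好  ("li hao" vs "ni hao" scores about 83)
print(spot("李好", ["你好"], threshold=90))   # prints an empty line: no match
```

Raising the threshold rejects near-misses such as "li hao" for "ni hao"; lowering it tolerates more acoustic-model errors at the cost of more false alarms.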

azuredream/Keyword-Spotting-Based-on-Low-Quality-Acoustic-Model-Output
