Although this tool handles polyphonic characters well, it gets some common Mandarin pronunciations wrong. For mainland Mandarin users, maybe we can use pypinyin to get `partial_results` in `prepare_data`?

We can first replace the non-polyphonic characters:

![image](https://user-images.githubusercontent.com/24568452/184079166-5c9a9028-4939-41f5-a642-be00c3a5d224.png)

But 拥 is polyphonic, so I need to find another way to handle it.
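A minimal sketch of that idea, assuming `POLYPHONIC_CHARS.txt` lists the polyphonic character as the first tab-separated field on each line, and that the input is pure Chinese text so pypinyin returns one syllable per character:

```python
# Sketch: pre-resolve non-polyphonic characters with pypinyin, leaving
# only the polyphonic ones (marked None) for g2pW to disambiguate.
# Assumes POLYPHONIC_CHARS.txt holds the polyphonic character as the
# first tab-separated field on each line (format is an assumption).
from pypinyin import lazy_pinyin, Style

with open('POLYPHONIC_CHARS.txt', encoding='utf-8') as f:
    polyphonic = {line.strip().split('\t')[0] for line in f if line.strip()}

def partial_results(sentence):
    """Per-character pinyin, with None where g2pW should decide."""
    syllables = lazy_pinyin(sentence, style=Style.TONE3,
                            neutral_tone_with_five=True)
    return [None if char in polyphonic else syl
            for char, syl in zip(sentence, syllables)]

print(partial_results('我想拥抱你'))
# e.g. ['wo3', 'xiang3', None, 'bao4', 'ni3'] if only 拥 is listed
```

g2pW would then only need to fill in the `None` slots.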
Yes, there are some differences between Taiwan Mandarin and mainland Chinese Mandarin.
In order to use g2p in Taiwan, we collected and annotated training data for that situation; hence, this model is trained on that dataset.
Some suggestions for handling this problem:

- For monophonic characters, you can revise the dictionary.
- Train g2pW on a high-quality mainland Chinese Mandarin dataset (maybe?)
头发
拥抱
...maybe there are many cases like these (see the sketch below for a word-level workaround).
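For word-level cases like these, one interim workaround outside g2pW itself is pypinyin's custom phrase dictionary (`load_phrases_dict` is a real pypinyin API; the word choices here just mirror the examples above):

```python
# Sketch: pin mainland pronunciations for specific words via pypinyin's
# custom phrase dictionary, as a stopgap outside g2pW itself.
from pypinyin import lazy_pinyin, load_phrases_dict, Style

load_phrases_dict({
    '头发': [['tóu'], ['fa']],    # mainland: 发 is neutral tone here
    '拥抱': [['yōng'], ['bào']],  # mainland: first-tone yōng
})

print(lazy_pinyin('头发', style=Style.TONE))  # ['tóu', 'fa']
print(lazy_pinyin('拥抱', style=Style.TONE))  # ['yōng', 'bào']
```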
Maybe we have to modify `POLYPHONIC_CHARS.txt`. Refer to this: https://www.zhihu.com/question/31151037
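A sketch of one such modification, dropping selected characters from the polyphonic list so a fixed pronunciation (from pypinyin or a revised dictionary) can be used instead. The file format is an assumption (character as the first tab-separated field per line); back up the file before editing:

```python
# Sketch: remove selected characters from POLYPHONIC_CHARS.txt so the
# pipeline stops treating them as polyphonic. Assumes the character is
# the first tab-separated field on each line (format is an assumption).
DROP = {'拥'}  # illustrative choice

with open('POLYPHONIC_CHARS.txt', encoding='utf-8') as f:
    lines = f.readlines()

kept = [ln for ln in lines if ln.strip().split('\t')[0] not in DROP]

with open('POLYPHONIC_CHARS.txt', 'w', encoding='utf-8') as f:
    f.writelines(kept)
```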