TTS/TTS/tts/layers/xtts/tokenizer.py", line 180, in expand_abbreviations_multilingual for regex, replacement in _abbreviations[lang]: KeyError: 'zh-cn'[Bug] #3189

Closed
lucasjinreal opened this issue Nov 10, 2023 · 10 comments
Labels
bug Something isn't working wontfix This will not be worked on but feel free to help.

Comments

@lucasjinreal

Describe the bug

TTS/TTS/tts/layers/xtts/tokenizer.py", line 180, in expand_abbreviations_multilingual
for regex, replacement in _abbreviations[lang]:
KeyError: 'zh-cn'

To Reproduce

TTS/TTS/tts/layers/xtts/tokenizer.py", line 180, in expand_abbreviations_multilingual
for regex, replacement in _abbreviations[lang]:
KeyError: 'zh-cn'

Expected behavior

TTS/TTS/tts/layers/xtts/tokenizer.py", line 180, in expand_abbreviations_multilingual
for regex, replacement in _abbreviations[lang]:
KeyError: 'zh-cn'

Logs

TTS/TTS/tts/layers/xtts/tokenizer.py", line 180, in expand_abbreviations_multilingual
    for regex, replacement in _abbreviations[lang]:
KeyError: 'zh-cn'

Environment

TTS/TTS/tts/layers/xtts/tokenizer.py", line 180, in expand_abbreviations_multilingual
    for regex, replacement in _abbreviations[lang]:
KeyError: 'zh-cn'

Additional context

No response

@lucasjinreal lucasjinreal added the bug Something isn't working label Nov 10, 2023
@douhaohaode

douhaohaode commented Nov 10, 2023

If zh-cn and zh both represent Chinese, the code should use only one of them.

If you want to get it running in the meantime, manually edit lines 118 and 283 of tokenizer.py in the TTS package, changing zh to zh-cn.

@lucasjinreal
Author

I think the keys of these maps in the tokenizer should be consistent with the language codes.
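
A consistency check along these lines would catch the mismatch before it surfaces as a KeyError at inference time. This is only a minimal sketch: `find_missing_language_keys` is a hypothetical helper, and the `supported` list and `tables` dict are toy stand-ins for the model's language codes and the tokenizer's per-language maps, not the library's actual data.

```python
from typing import Dict, Iterable, List, Mapping


def find_missing_language_keys(
    supported_langs: Iterable[str],
    tables: Mapping[str, Mapping[str, object]],
) -> Dict[str, List[str]]:
    """Return {table_name: [language codes missing from that table]}.

    Tables that contain every supported code are left out of the result.
    """
    langs = list(supported_langs)
    missing: Dict[str, List[str]] = {}
    for name, table in tables.items():
        absent = [lang for lang in langs if lang not in table]
        if absent:
            missing[name] = absent
    return missing


# Toy stand-ins (assumptions, not the real tokenizer data): the model
# advertises "zh-cn" while the per-language maps are keyed by "zh".
supported = ["en", "es", "zh-cn"]
tables = {
    "_abbreviations": {"en": [], "es": [], "zh": []},
    "_symbols_multilingual": {"en": [], "es": [], "zh": []},
}

print(find_missing_language_keys(supported, tables))
```

Run as a unit test against the real tables, a check like this would fail as soon as a language code and a map key drift apart.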

@AIFSH

AIFSH commented Nov 11, 2023

Until the official fix lands:

pip uninstall TTS
pip install TTS==0.20.2

This works!

@jbang2004

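If zh-cn and zh both represent Chinese, the code should use only one of them.

If you want to get it running in the meantime, manually edit lines 118 and 283 of tokenizer.py in the TTS package, changing zh to zh-cn.

Nice, that works. By the way, do you know how to save the speaker's latent features and embedding and reuse them to generate multiple utterances? Right now I have to compute the features before every inference, which is very inefficient.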

@lucasjinreal
Author

@jbang2004 It can be done, but the maintainers don't seem to have considered this use case at all.

@jbang2004

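@jbang2004 It can be done, but the maintainers don't seem to have considered this use case at all.

I spent the morning looking into it. The official docs describe a way to extract the features directly from the model and then write the wav with torchaudio. With that approach you can keep reusing the same features for every conversion, but the output quality is somewhat worse than with the API, and I don't know why.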
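
The reuse pattern described here can be sketched as a small cache around the model. Treat it as an assumption-laden outline rather than the documented recipe: `SpeakerLatentCache` is a hypothetical wrapper, and it assumes the model object exposes `get_conditioning_latents(audio_path=[...])` returning a `(gpt_cond_latent, speaker_embedding)` pair and `inference(text, language, gpt_cond_latent, speaker_embedding)`. Return values and argument order have differed between TTS releases, so check the installed version.

```python
from typing import Any, Dict, Tuple


class SpeakerLatentCache:
    """Compute speaker features once per reference clip and reuse them.

    `model` is any object with the two methods assumed above; nothing
    here imports TTS, so the wrapper itself has no dependencies.
    """

    def __init__(self, model: Any) -> None:
        self.model = model
        self._cache: Dict[str, Tuple[Any, Any]] = {}

    def latents_for(self, audio_path: str) -> Tuple[Any, Any]:
        # Only call into the model the first time a clip is seen.
        if audio_path not in self._cache:
            gpt_cond_latent, speaker_embedding = (
                self.model.get_conditioning_latents(audio_path=[audio_path])
            )
            self._cache[audio_path] = (gpt_cond_latent, speaker_embedding)
        return self._cache[audio_path]

    def synthesize(self, text: str, language: str, audio_path: str) -> Any:
        # Reuse the cached features for every utterance of this speaker.
        gpt_cond_latent, speaker_embedding = self.latents_for(audio_path)
        return self.model.inference(
            text, language, gpt_cond_latent, speaker_embedding
        )
```

Each call to `synthesize` with the same reference clip then skips the feature extraction step. Writing the result to disk (for example with torchaudio) is left out because the output format depends on the model version.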

@lucasjinreal
Author

@jbang2004 Could you share the code?

@Edresson
Contributor

Edresson commented Nov 14, 2023

I fixed it in #3216. "zh-cn" is what we have in the config and docs, so I renamed "zh" to "zh-cn".

@genglinxiao

I think the key used for the Chinese language is inconsistent in 2 places:
The model uses "zh-cn" for Chinese (simplified). However, the key defined in _abbreviations and _symbols_multilingual for Chinese is "zh". These 2 structures are used in expand_abbreviations_multilingual() and expand_symbols_multilingual() respectively, resulting in KeyErrors.

In my case, I mapped "zh-cn" to "zh" inside these 2 functions by adding the following lines to each of them.

    if lang=="zh-cn":
        lang="zh"

But I think there ought to be a cleaner solution.
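
One possibly cleaner variant keeps the alias in a single table and resolves it at lookup time instead of repeating the `if` in each function. The sketch below is illustrative only: `_LANG_ALIASES`, `lookup_rules`, and `expand_abbreviations` are hypothetical names, and `toy_table` is made-up data standing in for the tokenizer's per-language rule lists.

```python
import re
from typing import Dict, List, Mapping, Pattern, Tuple

Rules = List[Tuple[Pattern[str], str]]

# Hypothetical alias table: public language code -> key used in the maps.
_LANG_ALIASES: Dict[str, str] = {"zh-cn": "zh"}


def lookup_rules(table: Mapping[str, Rules], lang: str) -> Rules:
    """Return the rules for `lang`, trying its alias before giving up."""
    if lang in table:
        return table[lang]
    alias = _LANG_ALIASES.get(lang)
    if alias is not None and alias in table:
        return table[alias]
    # No rules for this language: leave the text untouched rather than
    # raising a KeyError.
    return []


def expand_abbreviations(text: str, lang: str, table: Mapping[str, Rules]) -> str:
    """Apply every (regex, replacement) rule registered for `lang`."""
    for regex, replacement in lookup_rules(table, lang):
        text = regex.sub(replacement, text)
    return text


# Made-up rule purely for demonstration; keyed by "zh" like the maps
# described above.
toy_table: Dict[str, Rules] = {
    "zh": [(re.compile(r"\bdr\.", re.IGNORECASE), "doctor")],
}

print(expand_abbreviations("dr. Wang", "zh-cn", toy_table))
```

Returning an empty rule list for unknown codes is a design choice that trades a loud failure for a silent no-op. Raising a descriptive error instead may be preferable when an unsupported language should never reach the tokenizer.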


stale bot commented Dec 16, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

@stale stale bot added the wontfix This will not be worked on but feel free to help. label Dec 16, 2023
@stale stale bot closed this as completed Dec 23, 2023

6 participants