[tts] BERT-based polyphone prediction for the TTS text frontend #1283

Closed
yt605155624 opened this issue Jan 6, 2022 · 7 comments

Comments

@yt605155624
Collaborator

yt605155624 commented Jan 6, 2022

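Polyphone handling currently relies on pypinyin or g2pM, whose accuracy is limited. I would like to build a polyphone prediction model based on BERT (or ERNIE). In short: suppose a language has 100 polyphonic characters, each with at most 3 pronunciations. We can then attach 100 three-way classifiers (simple fc layers are enough) on top of BERT, and at prediction time look up the classifier that belongs to the character in question and classify with it.
Reference paper:
tencent_polyphone.pdf

For data, the dataset provided by https://github.com/kakaobrain/g2pM can be used.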
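
The routing idea above (one small classifier per polyphonic character, selected at prediction time) could be sketched roughly as follows. This is only an illustration in plain Python: random vectors stand in for BERT hidden states, the heads are untrained, and the names `POLYPHONES`, `HEADS`, `classify`, and `predict`, the tiny hidden size, and the three-character inventory are all hypothetical.

```python
import math
import random

HIDDEN = 8      # hypothetical hidden size; a real BERT-base encoder would use 768
MAX_PRON = 3    # "at most 3 pronunciations" from the proposal

# Hypothetical (and incomplete) polyphone inventory: character -> candidate pinyin.
POLYPHONES = {
    "行": ["xing2", "hang2"],
    "乐": ["le4", "yue4"],
    "和": ["he2", "he4", "huo2"],
}

def make_head(rng):
    """One fc layer: MAX_PRON rows of HIDDEN weights plus one bias per row."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)]
               for _ in range(MAX_PRON)]
    bias = [0.0] * MAX_PRON
    return weights, bias

rng = random.Random(0)
# One independent classifier per polyphonic character.
HEADS = {char: make_head(rng) for char in POLYPHONES}

def classify(char, hidden):
    """Return (pinyin, probs) for a polyphone given its encoder hidden vector."""
    weights, bias = HEADS[char]
    candidates = POLYPHONES[char]
    # Score only as many classes as this character has pronunciations.
    logits = [sum(w * h for w, h in zip(weights[k], hidden)) + bias[k]
              for k in range(len(candidates))]
    peak = max(logits)                      # subtract the max for a stable softmax
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return candidates[best], probs

def predict(sentence, hidden_states):
    """Route each polyphonic character to its own head; skip all other characters."""
    out = []
    for pos, (char, hidden) in enumerate(zip(sentence, hidden_states)):
        if char in HEADS:
            pinyin, _ = classify(char, hidden)
            out.append((pos, char, pinyin))
    return out

# Stand-in for encoder output: one random vector per character.
sentence = "银行和音乐"
fake_hidden = [[rng.gauss(0, 1) for _ in range(HIDDEN)] for _ in sentence]
print(predict(sentence, fake_hidden))
```

Since the heads here are untrained, the chosen pronunciations are arbitrary; the point is only the lookup of the per-character classifier. In a real model the heads would be trained jointly with the encoder on labeled data such as the g2pM dataset.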

Advanced: multi-task BERT
image

@Jzow

Jzow commented Jan 18, 2022

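However, I could not find any example of English synthesis. Objectively speaking, Paddle's docs in this area fall far short of other open-source projects; Mozilla's and TensorFlow's TTS projects have clear documentation.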

@yt605155624
Collaborator Author

ljspeech and vctk are both English synthesis datasets, and both come with examples.

@Jzow

Jzow commented Jan 18, 2022

@yt605155624 Thank you very much for the prompt reply. I will take a look.

@stale stale bot added the Stale label Mar 5, 2022
@zh794390558 zh794390558 changed the title from "BERT-based polyphone prediction for the TTS text frontend" to "[tts] BERT-based polyphone prediction for the TTS text frontend" Mar 29, 2022
@stale stale bot removed the Stale label Mar 29, 2022
@stale stale bot added the Stale label Jun 11, 2022
@stale stale bot closed this as completed Jul 12, 2022
@stale stale bot moved this to Done in PaddleSpeech Jul 12, 2022
@yt605155624 yt605155624 reopened this Jul 28, 2022
@stale stale bot removed the Stale label Jul 28, 2022
@GloryRoadWangzh

Is there a code implementation of BERT-based polyphone prediction for the TTS text frontend?

@yt605155624
Collaborator Author

yt605155624 commented Aug 8, 2022

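@GloryRoadWangzh Not at the moment. You could follow the punctuation prediction implementation, which is based on paddlenlp. A developer is currently adding g2pw, which is BERT-based, to our frontend, so we will probably not build our own polyphone prediction. #2230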

@PaddlePaddle PaddlePaddle deleted a comment from stale bot Sep 7, 2022
@lucasjinreal

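@yt605155624 A question: why is polyphone prediction no longer needed once g2pw is available? For example, can it predict the following sentence correctly: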

孩子,别吃了,这里的肉脏,走,跟我去太平间

@yt605155624
Collaborator Author

@jinfagang Because g2pw is itself a BERT-based polyphone prediction model.
