collaboration Request #1

Open
mush42 opened this issue Mar 6, 2023 · 1 comment

mush42 commented Mar 6, 2023

Hi @rmcpantoja

Recently, one-stage TTS models have gained a lot of ground in terms of quality and efficiency. They are vastly better than two-stage models.

So I suggest switching to VITS or JETS for this add-on. This would greatly simplify the deployment of voices.

One-stage TTS models can be exported to ONNX and run with onnxruntime. Even better, inference can be accelerated with CUDA or DirectML with little additional effort.
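To illustrate the acceleration point, here is a minimal sketch of how a voice model might be loaded with a GPU backend when one is available. The preference order, the `pick_providers` and `load_voice` helper names, and the model path are assumptions made for this example, not code from either project. The provider names and the `get_available_providers` / `InferenceSession` calls are the standard onnxruntime ones.

```python
from typing import List, Sequence

# Preference order is an assumption for this sketch:
# try GPU backends first, then fall back to CPU.
PREFERRED_PROVIDERS = (
    "CUDAExecutionProvider",
    "DmlExecutionProvider",
    "CPUExecutionProvider",
)


def pick_providers(available: Sequence[str]) -> List[str]:
    """Return the preferred providers that are available, in preference order."""
    chosen = [p for p in PREFERRED_PROVIDERS if p in available]
    # Always keep CPU as the last resort so session creation can still succeed.
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


def load_voice(model_path: str):
    """Create an inference session for an exported voice (hypothetical path)."""
    # Imported lazily so pick_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Which providers show up depends on the installed package: CUDA needs the `onnxruntime-gpu` build and DirectML needs `onnxruntime-directml`. The input names passed to `session.run` depend on how the model was exported, so they are left out here.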

Last week I started a similar project to make some VITS TTS voices available on Windows, and the initial results are promising. You can check out my work at:

https://github.com/blindpandas/crystal-tts

If you want to collaborate on this, let's get in touch. I'm @mush42 on Discord.

Best
Musharraf

mush42 changed the title from "Use ONNX Runtime" to "collaboration Request" on Mar 6, 2023
Owner

rmcpantoja commented Mar 8, 2023

Hi @mush42, thanks for your interest in this add-on! To be honest, I'm really new at this, and I would be glad to collaborate with you and other users. Everyone is welcome!
ForwardTacotron also runs in real time on CUDA, even on long texts. The question is whether to distribute two versions; we may come up with better ideas if we talk about it on Discord.
I am going to read through Crystal TTS and get back to you. Also, if it is possible to make new voices, I am willing to do so.
As for VITS, it is a good synthesizer too, but it has some disadvantages. For example, it doesn't work well on long texts, where it causes a minor speech problem. Still, I know that with your help it could be added as a separate synth driver.
I'll be in touch soon!
