implement example_rvc with TextToAudioStream rather than BufferStream #143

Meshwa428 · 2024-10-06T12:01:44Z

Can you please update example_rvc/xtts_rvc_synthesizer.py to use TextToAudioStream rather than BufferStream?

For CPU users, the audio plays back very oddly.

Recording.2024-10-06.172841.mp4

As you can see in the video above.

KoljaB · 2024-10-06T12:44:20Z

This is very probably because CPU inference is too slow for realtime XTTS synthesis. Even a lot of GPUs are too slow for that. I don't think BufferStream has anything to do with it.
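One way to check whether inference speed is the bottleneck is to measure the real-time factor: synthesis time divided by the duration of the audio produced. The sketch below is plain Python and does not use any RealtimeTTS API; `synthesize` is a hypothetical callable standing in for whatever engine is being tested, and it is assumed to return a sequence of samples.

```python
import time


def real_time_factor(synthesis_seconds, num_samples, sample_rate):
    """Return synthesis time divided by the duration of the generated audio.

    A value above 1.0 means audio is produced more slowly than it plays back,
    so gaps or stutter are expected during streaming playback.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    audio_seconds = num_samples / sample_rate
    return synthesis_seconds / audio_seconds


def measure(synthesize, text, sample_rate):
    """Time a hypothetical synthesize(text) callable and return its real-time factor."""
    start = time.perf_counter()
    samples = synthesize(text)  # assumed to return a sequence of audio samples
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(samples), sample_rate)
```

If the factor stays above 1.0 on a given machine, the playback artifacts are more likely caused by slow inference than by the stream class in use.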

Meshwa428 · 2024-10-06T13:23:08Z

But I can do real-time inference using the Coqui engine, and there is no interference like this. When I use the Coqui engine, it runs completely fine.

I am not using XTTS either; the model I am using is much smaller than that.

It's tts_models/en/vctk/vits

Isn't there a way to fix this?

Maybe the issue is that we are applying RVC to each audio chunk rather than to the full audio of the whole sentence.
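To make the difference concrete, here is a minimal sketch of the two strategies. It is only an illustration: `convert` is a hypothetical stand-in for the RVC step (bytes in, bytes out), not a function from the example, and whether sentence-level conversion actually removes the artifact is untested.

```python
def convert_per_chunk(chunks, convert):
    """Apply the conversion to every chunk separately (one call per chunk)."""
    return [convert(chunk) for chunk in chunks]


def convert_per_sentence(chunks, convert):
    """Join all chunks of a sentence first, then apply the conversion once."""
    return convert(b"".join(chunks))


# Usage with a dummy conversion that records how often it is called.
calls = []


def dummy_convert(audio):
    calls.append(len(audio))
    return audio


sentence_chunks = [b"\x00\x01", b"\x02\x03", b"\x04\x05"]
per_chunk = convert_per_chunk(sentence_chunks, dummy_convert)
per_sentence = convert_per_sentence(sentence_chunks, dummy_convert)
```

The per-sentence variant gives the converter the full context, but playback cannot start until the whole sentence has been synthesized, so it trades latency for continuity.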
