-
Sure, this would be a cool addition. It should be relatively easy to add. I was thinking of breaking the generated text into sentences and passing them to the TTS.
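As a rough illustration of the sentence-splitting idea, here is a minimal sketch. Treating `.`, `!` and `?` as sentence boundaries is an assumption for the example; real input would need handling for abbreviations, decimals, etc.

```cpp
#include <string>
#include <vector>

// Split generated text into sentences on terminal punctuation, so each
// complete sentence can be handed to the TTS as soon as it is finished.
std::vector<std::string> split_sentences(const std::string & text) {
    std::vector<std::string> sentences;
    std::string cur;
    for (char c : text) {
        cur += c;
        if (c == '.' || c == '!' || c == '?') {
            // drop leading spaces before storing the sentence
            const size_t start = cur.find_first_not_of(' ');
            if (start != std::string::npos) {
                sentences.push_back(cur.substr(start));
            }
            cur.clear();
        }
    }
    // keep any trailing fragment that has no terminal punctuation yet
    const size_t start = cur.find_first_not_of(' ');
    if (start != std::string::npos) {
        sentences.push_back(cur.substr(start));
    }
    return sentences;
}
```

In the streaming case, the same logic would run incrementally on the token stream, emitting a sentence to the TTS each time a boundary character arrives.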
-
I just stumbled upon this and am blown away by the quality: https://catid.io/posts/tts/
-
Currently, the talk-llama demo waits for the model's output to finish generating before displaying it and sending it to the TTS. Would it be useful to have the TTS run on a separate thread and queue up partial responses as they're generated? E.g. for a response that goes:
We could have the first line, or even the fragment "The meaning of life is subjective", read aloud, and during this time push the rest of the response onto the queue. At the same time, we display the response as a stream instead of waiting for it to finish.
I've done a quick and loose implementation of this, and voice chatting with the bot feels noticeably more responsive and fun.