I'm trying to use the generated audio for some automation.
Is there any way to get word- or character-level "timestamps" from the generation process? Either would work.
Obviously the TTS blends sounds together rather than voicing one character or one word at a time, but I'd imagine it still has to organise the output somehow.
Sorry, I'm not too familiar with how TTS engines work. Hopefully that makes sense?