Flux9665 released this 07 Oct 13:22
This release includes a new GUI that allows you to control exactly how an utterance sounds.
You can generate a bunch of different realizations until you get one that you like. Then you can modify it further by dragging around the pitch values and the durations of individual phones. You can also exchange the voice for a different one while keeping your changes to the intonation and duration exactly as they are. And of course you can do so in over 7000 languages.
Just install the updated requirements and run the run_advanced_GUI_demo.py script. By default it loads the pretrained models from Hugging Face🤗, but you can also specify your own.