Improved TTS in 7000 Languages

@Flux9665 released this 25 Jul 07:16

What's Changed

This release provides new checkpoints and adds improvements that were left out of the previous release due to time constraints. For more information on the universal TTS model for 7000 languages, please refer to the previous release, v3.0.

  • Prosody prediction (pitch, energy, and durations) is now stochastic: values are sampled from a distribution instead of assuming a one-to-one mapping.
  • Added support for more IPA modifiers to cover more languages.
  • Added more languages to the pretraining.
  • Overhauled the language similarity prediction modules and their visualization.
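
To illustrate the first point, here is a minimal sketch of the difference between a one-to-one mapping and sampling from a distribution. It is not the toolkit's actual implementation: the Gaussian distribution, the function names, and the pitch values are all assumptions made up for this example, and the release does not state which distribution the model uses.

```python
import random


def predict_deterministic(mean: float) -> float:
    """One-to-one mapping: the same input always yields the same value."""
    return mean


def predict_stochastic(mean: float, std: float, rng: random.Random) -> float:
    """Sample a value from a distribution (assumed Gaussian for this sketch)."""
    return rng.gauss(mean, std)


# Hypothetical pitch statistics in Hz, chosen only for illustration.
mean_pitch_hz, std_pitch_hz = 120.0, 10.0
rng = random.Random(0)  # seeded so that runs are reproducible

deterministic = [predict_deterministic(mean_pitch_hz) for _ in range(3)]
stochastic = [predict_stochastic(mean_pitch_hz, std_pitch_hz, rng) for _ in range(3)]

print(deterministic)  # the same value three times
print(stochastic)     # three different values scattered around the mean
```

With a deterministic predictor, the same text always gets the same prosody. With sampling, repeated synthesis of the same text can yield different, equally plausible renditions.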

Full Changelog: v3.0...v3.1