
TTS in 7000 Languages

Flux9665 released this 10 Jun 15:21
· 171 commits to MassiveScaleToucan since this release

This release extends the toolkit's functionality and provides new checkpoints.

  • We improved the overall TTS quality, with further enhancements already on their way
  • Watermarking has been added to prevent misuse
  • We extended support to almost all languages in the ISO 639-3 standard (that's over 7000 languages!)
  • With a few clever design choices, we were able to extrapolate from a checkpoint pretrained on 462 languages to a checkpoint that can speak every language for which we now support a text frontend!
  • Lots of simplifications and quality-of-life changes

This release is the outcome of a collaboration between our group at the University of Stuttgart and colleagues from the University of Groningen and Fraunhofer IIS in Erlangen. Together, we built this model, which is the first of its kind.

We will present this work at Interspeech 2024. The full list of authors is Florian Lux, Sarina Meyer, Lyonel Behringer, Frank Zalkow, Phat Do, Matt Coler, Emanuël Habets, and Thang Vu.

Paper: https://arxiv.org/abs/2406.06403
Dataset: https://huggingface.co/datasets/Flux9665/BibleMMS
Interactive Demo: https://huggingface.co/spaces/Flux9665/MassivelyMultilingualTTS
Static Demo: https://anondemos.github.io/MMDemo/