🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Updated Aug 16, 2024 - Python
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
General Speech Restoration
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
PyTorch Implementation of FastDiff (IJCAI'22)
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
A fast, high-quality neural vocoder.
Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)
iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform
Vietnamese Text to Speech library
Official repository for the paper "Chunked Autoregressive GAN for Conditional Waveform Synthesis"
A toolkit for non-parallel voice conversion based on vector-quantized variational autoencoder
Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS
Includes Basis-MelGAN, MelGAN, HifiGAN, and Multiband-HifiGAN; NHV may be added in the future.