Releases: coqui-ai/TTS
v0.10.0
What's Changed
- v0.9.0 by @erogol in #1942
- fixed tutorial 2 incompatibility with new dev by @Aya-AlJafari in #2161
- Fix capacitron test when cuda is enabled by @WeberJulian in #2189
- Fix VITS multi-speaker voice conversion inference by @Edresson in #2187
- Handle espeak 1.48.15 by @erogol in #2203
- Python API implementation by @erogol in #2195
- Update README by @erogol in #2204
- Update formatters.py by @p0p4k in #2194
- Adding OverFlow by @shivammehta25 in #2183
- Add YourTTS VCTK recipe by @Edresson in #2198
- Add Original YourTTS vocabulary on YourTTS recipe for full transfer learning by @Edresson in #2206
- Adding pre-trained Overflow model by @erogol in #2211
- Fixup overflow by @erogol in #2218
- 🚀 v0.10.0 by @erogol in #2205
New Contributors
- @shivammehta25 made their first contribution in #2183
Full Changelog: v0.9.0...v0.10.0
v0.10.0_models
- Overflow model trained on LJSpeech dataset using pretrained HifiGAN vocoder vocoder_models/en/ljspeech/hifigan_v2.
v0.9.0
New models
- Added 25 new models covering 25 different EU languages from 👑https://github.com/NeonGeckoCom/neon-tts-plugin-coqui
What's Changed
- Trick to Upsampling to High sampling rates using VITS model by @Edresson in #1456
- Update Coqpit requirement by @Edresson in #1539
- Missing f prefix on f-strings fix by @code-review-doctor in #1532
- tiny improvement in data_path resolvement by @taras-sereda in #1567
- Fix VITS upsampling asserts by @Edresson in #1550
- Fix the bug in eSpeak wrapper for eSpeak version 1.48.15 by @Edresson in #1560
- 🐍 Python 3.10.x support and drop Python 3.6 support by @erogol in #1565
- Update CI tests by @erogol in #1572
- Build and publish CPU only Docker image by @erogol in #1573
- Add an assert for the upsampling trick by @erogol in #1538
- Add audio length sampler balancer by @Edresson in #1561
- Change the VITS upsampling interpolation trick to linear by @Edresson in #1564
- Capacitron by @a-froghyar in #977
- Fixed use_cuda issue in compute_embeddings.py by @ribeiromiranda in #1587
- Training recipes for thorsten dataset by @noranraskin in #1020
- fix invalid json by @s3781009 in #1599
- Use fsspec and torch for embedding file IO by @erogol in #1581
- Adding TTS Tutorials by @Aya-AlJafari in #1584
- Internal formatter by @WeberJulian in #1629
- Update training_a_model.md by @klotlabs in #1620
- Add synpaflex formatter by @WeberJulian in #1616
- added support for model_info in CLI by @p0p4k in #1623
- Add Thorsten VITS model by @WeberJulian in #1675
- Checkpoint bug fix by @manmay-nakhashi in #1641
- docs : Adding <model_type> in the arguments for CLI by @camillem in #1469
- Fix Publish CI by @WeberJulian in #1597
- Fix tokenizer for punc only by @WeberJulian in #1717
- Add durations as aux input for VITS by @WeberJulian in #1694
- feat: updated capacitron recipes and lr fix by @a-froghyar in #1718
- Implement VitsAudioConfig by @erogol in #1556
- Fix aux tests by @WeberJulian in #1753
- Fix for FloorDiv Function Warning by @iprovalo in #1760
- Update download_vctk.sh by @mengting7tw in #1739
- Update decoder.py by @p0p4k in #1792
- Update requirements.txt for python 3.10 support by @p0p4k in #1791
- Update README.md by @yuripourre in #1776
- Fix & update WaveRNN vocoder model by @vanIvan in #1749
- Update requirements.txt; inflect==5.6 by @p0p4k in #1809
- Update README.md; download progress bar in CLI. by @p0p4k in #1797
- Update wavenet.py by @p0p4k in #1796
- Adjust default to be able to process longer sentences by @lkiesow in #1835
- Fix language flags generated by espeak-ng phonemizer by @Lokhozt in #1801
- fix get_random_embeddings --> get_random_embedding by @manmay-nakhashi in #1726
- Introduce numpy and torch transforms by @erogol in #1705
- Implement bucketed weighted sampling for VITS by @erogol in #1871
- capacitron_layers multi speaker bug fix by @manmay-nakhashi in #1664
- updates to dataset analysis notebooks by @jreus in #1853
- Fix BCE loss issue by @erogol in #1872
- Remove deprecated files by @erogol in #1873
- Handle when no batch sampler by @erogol in #1882
- Fix tune wavegrad by @geth-network in #1844
- Add new DE Thorsten models by @erogol in #1898
- Add speaker encoder recipe by @Edresson in #1912
- Add capacitron v2 model by @WeberJulian in #1768
- Fixes a race condition with multiple simultaneous get requests. by @KyuubiYoru in #1807
- Fix find unique phonemes script by @Edresson in #1928
- Add YourTTS and SC-GlowTTS on available models by @Edresson in #1933
- Korean Phonemizer by @harmlessman in #1822
- Add espeak support for Chinese by @happylittlecat2333 in #1905
- Replace pyworld by pyin by @Edresson in #1946
- d-vector handling by @erogol in #1945
- Fixups by @erogol in #1967
- Fix VC by @WeberJulian in #1971
- Update readme by @erogol in #1978
- Add metafile arg to compute embeddings script by @erogol in #1977
- Fix dataset handling with the new embedding file keys by @Edresson in #1991
- Fix colliding dataset cache file names by @Edresson in #1994
- Write non-speech files in a TXT by @erogol in #2048
- Minor bug fixes on VITS/YourTTS and inference by @Edresson in #2054
- Check num of columns in coqui format by @erogol in #2066
- Remove / prefix from the relative path by @erogol in #2065
- Update Tutorial_2_train_your_first_TTS_model.ipynb by @CeadeS in #2079
- Update forward_tts.md by @mrshu in #2019
- Use "formatter" key in the datasets json array by @humada05 in #2114
- capacitron training fixes by @victor-shepardson in #2086
- mailabs formatter: back/forward slash in file path fix by @freezerain in #1938
- Add Discord server badge by @erogol in #2136
- Remove langs expect en and de by @erogol in #2135
- Cache fsspec downloaded files by @erogol in #2132
- Update dep caching in actions by @erogol in #2138
- Update README.md by @eltociear in #2146
- Makes docker images lighter by @WeberJulian in #2149
- Doc update docker by @WeberJulian in #2153
- Add neon models by @loganhart420 in #2140
- Fix documentation by @WeberJulian in #2154
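The "Missing f prefix on f-strings" item in the list above refers to a common Python slip: a string literal that contains {...} placeholders but lacks the f prefix is left uninterpolated. The sketch below only illustrates that general behavior. The variable name and message are hypothetical and are not taken from the TTS codebase or from PR #1532.

```python
# Hypothetical illustration; not code from the TTS repository.
model_name = "vits"

# Without the f prefix, the braces and the name inside them are kept literally.
broken = "Loading model {model_name}"

# With the f prefix, the placeholder is replaced by the variable's value.
fixed = f"Loading model {model_name}"

print(broken)
print(fixed)
```

A mistake of this kind raises no error at runtime, so it usually surfaces only as an unhelpful log or exception message.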
New Contributors
- @code-review-doctor made their first contribution in #1532
- @taras-sereda made their first contribution in #1567
- @ribeiromiranda made their first contribution in #1587
- @s3781009 made their first contribution in #1599
- @Aya-AlJafari made their first contribution in #1584
- @klotlabs made their first contribution in #1620
- @p0p4k made their first contribution in #1623
- @manmay-nakhashi made their first contribution in #1641
- @camillem made their first contribution in #1469
- @iprovalo made their first contribution in #1760
- @mengting7tw made their first contribution in #1739
- @yuripourre made their first contribution in #1776
- @vanIvan made their first contribution in #1749
- @lkiesow made their first contribution in #1835
- @Lokhozt made their first contribution in #1801
- @jreus made their first contribution in #1853
- @geth-network made their first contribution in #1844
- @KyuubiYoru made their first contribution in #1807
- @harmlessman made their first contribution in #1822
- @happylittlecat2333 made their first contribution in #1905
- @CEA...
v0.8.0 models
✨New models ✨ from @thorstenMueller
✨New models ✨ from @NeonGeckoCom
v0.8.0
What's Changed
- Trick to Upsampling to High sampling rates using VITS model by @Edresson in #1456
- Update Coqpit requirement by @Edresson in #1539
- Missing f prefix on f-strings fix by @code-review-doctor in #1532
- tiny improvement in data_path resolvement by @taras-sereda in #1567
- Fix VITS upsampling asserts by @Edresson in #1550
- Fix the bug in eSpeak wrapper for eSpeak version 1.48.15 by @Edresson in #1560
- 🐍 Python 3.10.x support and drop Python 3.6 support by @erogol in #1565
- Update CI tests by @erogol in #1572
- Build and publish CPU only Docker image by @erogol in #1573
- Add an assert for the upsampling trick by @erogol in #1538
- Add audio length sampler balancer by @Edresson in #1561
- Change the VITS upsampling interpolation trick to linear by @Edresson in #1564
- Capacitron by @a-froghyar in #977
- Fixed use_cuda issue in compute_embeddings.py by @ribeiromiranda in #1587
- Training recipes for thorsten dataset by @noranraskin in #1020
- fix invalid json by @s3781009 in #1599
- Use fsspec and torch for embedding file IO by @erogol in #1581
- Adding TTS Tutorials by @Aya-AlJafari in #1584
- Internal formatter by @WeberJulian in #1629
- Update training_a_model.md by @klotlabs in #1620
- Add synpaflex formatter by @WeberJulian in #1616
- added support for model_info in CLI by @p0p4k in #1623
- Add Thorsten VITS model by @WeberJulian in #1675
- Checkpoint bug fix by @manmay-nakhashi in #1641
- docs : Adding <model_type> in the arguments for CLI by @camillem in #1469
- Fix Publish CI by @WeberJulian in #1597
- Fix tokenizer for punc only by @WeberJulian in #1717
- Add durations as aux input for VITS by @WeberJulian in #1694
- feat: updated capacitron recipes and lr fix by @a-froghyar in #1718
- Implement VitsAudioConfig by @erogol in #1556
- Fix aux tests by @WeberJulian in #1753
- Fix for FloorDiv Function Warning by @iprovalo in #1760
- Update download_vctk.sh by @mengting7tw in #1739
- Update decoder.py by @p0p4k in #1792
- Update requirements.txt for python 3.10 support by @p0p4k in #1791
- Update README.md by @yuripourre in #1776
- Fix & update WaveRNN vocoder model by @vanIvan in #1749
- Update requirements.txt; inflect==5.6 by @p0p4k in #1809
- Update README.md; download progress bar in CLI. by @p0p4k in #1797
- Update wavenet.py by @p0p4k in #1796
- Adjust default to be able to process longer sentences by @lkiesow in #1835
- Fix language flags generated by espeak-ng phonemizer by @Lokhozt in #1801
- fix get_random_embeddings --> get_random_embedding by @manmay-nakhashi in #1726
- Introduce numpy and torch transforms by @erogol in #1705
- Implement bucketed weighted sampling for VITS by @erogol in #1871
- capacitron_layers multi speaker bug fix by @manmay-nakhashi in #1664
- updates to dataset analysis notebooks by @jreus in #1853
- Fix BCE loss issue by @erogol in #1872
- Remove deprecated files by @erogol in #1873
- Handle when no batch sampler by @erogol in #1882
- Fix tune wavegrad by @geth-network in #1844
- Add new DE Thorsten models by @erogol in #1898
New Contributors
- @code-review-doctor made their first contribution in #1532
- @taras-sereda made their first contribution in #1567
- @ribeiromiranda made their first contribution in #1587
- @s3781009 made their first contribution in #1599
- @Aya-AlJafari made their first contribution in #1584
- @klotlabs made their first contribution in #1620
- @p0p4k made their first contribution in #1623
- @manmay-nakhashi made their first contribution in #1641
- @camillem made their first contribution in #1469
- @iprovalo made their first contribution in #1760
- @mengting7tw made their first contribution in #1739
- @yuripourre made their first contribution in #1776
- @vanIvan made their first contribution in #1749
- @lkiesow made their first contribution in #1835
- @Lokhozt made their first contribution in #1801
- @jreus made their first contribution in #1853
- @geth-network made their first contribution in #1844
Full Changelog: v0.6.2...v0.8.0
v0.7.1 models
Add capacitron V2 model to TTS zoo. It's more stable and just as expressive!
v0.7.1
v0.7.0
What's Changed
- Trick to Upsampling to High sampling rates using VITS model by @Edresson in #1456
- Update Coqpit requirement by @Edresson in #1539
- Missing f prefix on f-strings fix by @code-review-doctor in #1532
- tiny improvement in data_path resolvement by @taras-sereda in #1567
- Fix VITS upsampling asserts by @Edresson in #1550
- Fix the bug in eSpeak wrapper for eSpeak version 1.48.15 by @Edresson in #1560
- 🐍 Python 3.10.x support and drop Python 3.6 support by @erogol in #1565
- Update CI tests by @erogol in #1572
- Build and publish CPU only Docker image by @erogol in #1573
- Add an assert for the upsampling trick by @erogol in #1538
- Add audio length sampler balancer by @Edresson in #1561
- Change the VITS upsampling interpolation trick to linear by @Edresson in #1564
- Capacitron by @a-froghyar in #977
- Fixed use_cuda issue in compute_embeddings.py by @ribeiromiranda in #1587
- Training recipes for thorsten dataset by @noranraskin in #1020
- fix invalid json by @s3781009 in #1599
- Use fsspec and torch for embedding file IO by @erogol in #1581
- Adding TTS Tutorials by @Aya-AlJafari in #1584
- Internal formatter by @WeberJulian in #1629
- Update training_a_model.md by @klotlabs in #1620
- Add synpaflex formatter by @WeberJulian in #1616
- added support for model_info in CLI by @p0p4k in #1623
- v0.7.0 by @erogol in #1537
New Contributors
- @code-review-doctor made their first contribution in #1532
- @taras-sereda made their first contribution in #1567
- @ribeiromiranda made their first contribution in #1587
- @s3781009 made their first contribution in #1599
- @Aya-AlJafari made their first contribution in #1584
- @klotlabs made their first contribution in #1620
- @p0p4k made their first contribution in #1623
Full Changelog: v0.6.2...v0.7.0
Speaker Encoder Model
Speaker encoder model and config file.
v0.7.0 models
- English Capacitron-T2 models + corresponding HiFiGAN V2 vocoder. Implemented and trained by 👑@a-froghyar
- German VITS model trained on Thorsten Dataset by 👑@thorstenMueller 👑@domcross