28 Jun 22:46

erogol

a035b25

😄 v0.15.0

What's Changed

Update stochastic_duration_predictor.py by @mengting7tw in #2663
Fix Tortoise load by @erogol in #2697
Inference API for 🐶Bark by @erogol in #2685
Drop Python 3.7 and 3.8 and stage Python 3.11 by @erogol in #2700
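
Since this release drops Python 3.7 and 3.8, a script that depends on it may want to check the interpreter version up front. The following is a minimal sketch, not part of the library: the `MIN_PYTHON` constant and the `is_supported` helper are hypothetical names, and the floor of 3.9 is an assumption inferred from the changelog entry above.

```python
import sys

# Assumption: with 3.7 and 3.8 dropped, 3.9 is the lowest supported version.
MIN_PYTHON = (3, 9)


def is_supported(version=None) -> bool:
    """Return True if the given (or the running) Python version meets the floor."""
    # Compare only (major, minor); defaults to the running interpreter.
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor >= MIN_PYTHON


if __name__ == "__main__":
    if not is_supported():
        raise SystemExit("This sketch assumes Python 3.9 or newer.")
```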

Running 🐶Bark

text = "Hello, my name is Manmay, how are you?"

from TTS.tts.configs.bark_config import BarkConfig
from TTS.tts.models.bark import Bark

config = BarkConfig()
model = Bark.init_from_config(config)
model.load_checkpoint(config, checkpoint_dir="path/to/model/dir/", eval=True)

# with random speaker
output_dict = model.synthesize(text, config, speaker_id="random", voice_dirs=None)

# cloning a speaker.
# It assumes that you have a speaker file in `bark_voices/speaker_n/speaker.wav` or `bark_voices/speaker_n/speaker.npz`
output_dict = model.synthesize(text, config, speaker_id="ljspeech", voice_dirs="bark_voices/")

Using 🐸TTS API:

from TTS.api import TTS

# Load the model onto the GPU.
# Bark is really slow on CPU, so we recommend using a GPU.
tts = TTS("tts_models/multilingual/multi-dataset/bark", gpu=True)


# Cloning a new speaker
# This expects to find an mp3 or wav file like `bark_voices/new_speaker/speaker.wav`
# It computes the cloning values and stores them in `bark_voices/new_speaker/speaker.npz`
tts.tts_to_file(text="Hello, my name is Manmay, how are you?",
                file_path="output.wav",
                voice_dir="bark_voices/",
                speaker="ljspeech")


# When you run it again it uses the stored values to generate the voice.
tts.tts_to_file(text="Hello, my name is Manmay, how are you?",
                file_path="output.wav",
                voice_dir="bark_voices/",
                speaker="ljspeech")


# random speaker
tts = TTS("tts_models/multilingual/multi-dataset/bark", gpu=True)
tts.tts_to_file("hello world", file_path="out.wav")
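
The comments above describe a voice directory holding `speaker.wav` (the reference audio) and, after the first run, `speaker.npz` (the stored cloning values). As a rough sketch of that layout, the helper below reports which file is present for a speaker. The function name `find_speaker_file` is hypothetical and not part of 🐸TTS. Preferring the `.npz` file is an assumption based on the note that a second run reuses the stored values, and the mp3 case is left out because its exact file name is not given here.

```python
from pathlib import Path
from typing import Optional


def find_speaker_file(voice_dir: str, speaker: str) -> Optional[Path]:
    """Return the stored .npz if present, else the reference .wav, else None."""
    speaker_dir = Path(voice_dir) / speaker
    # Assumption: stored cloning values take precedence over the raw audio.
    for name in ("speaker.npz", "speaker.wav"):
        candidate = speaker_dir / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_speaker_file("bark_voices/", "ljspeech")
    print(found if found else "no speaker file found")
```

A `None` result suggests there is nothing to clone from under that speaker name yet.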

Using 🐸TTS Command line:

# cloning the `ljspeech` voice
tts --model_name tts_models/multilingual/multi-dataset/bark \
--text "This is an example." \
--out_path "output.wav" \
--voice_dir bark_voices/ \
--speaker_idx "ljspeech" \
--progress_bar True

# Random voice generation
tts --model_name tts_models/multilingual/multi-dataset/bark \
--text "This is an example." \
--out_path "output.wav" \
--progress_bar True
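
When the command line is driven from a script, it can help to assemble the arguments as a list rather than a shell string. The sketch below only builds the argument list and does not run anything. The function name `build_tts_command` is hypothetical; the flags are the ones shown in the commands above, and pairing `--voice_dir` with `--speaker_idx` mirrors those examples rather than any documented requirement.

```python
import shlex
from typing import List, Optional

BARK_MODEL = "tts_models/multilingual/multi-dataset/bark"


def build_tts_command(
    text: str,
    out_path: str,
    model_name: str = BARK_MODEL,
    voice_dir: Optional[str] = None,
    speaker_idx: Optional[str] = None,
) -> List[str]:
    """Build the argv list for the `tts` command without executing it."""
    cmd = ["tts", "--model_name", model_name, "--text", text, "--out_path", out_path]
    # Add the cloning flags only when both are given, as in the example above.
    if voice_dir is not None and speaker_idx is not None:
        cmd += ["--voice_dir", voice_dir, "--speaker_idx", speaker_idx]
    return cmd


if __name__ == "__main__":
    argv = build_tts_command(
        "This is an example.", "output.wav",
        voice_dir="bark_voices/", speaker_idx="ljspeech",
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run` directly, which avoids quoting problems with text that contains spaces or punctuation.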

Full Changelog: v0.14.3...v0.15.0

Contributors

erogol and mengting7tw

06 Jun 07:43

erogol

v0.14.3

49cf6a5

👉 v0.14.3

Bump up to v0.14.3

05 Jun 20:41

erogol

v0.14.2

547a72c

⛈️ v0.14.2

Full Changelog: v0.14.1...v0.14.2

05 Jun 09:30

erogol

v0.14.1

a494f0c

🚗 v0.14.1

What's Changed

Fetch all built-in speakers from API by @reuben in #2626
fix typo by @vodiylik in #2647
Port Fairseq TTS models by @erogol in #2628

New Contributors

@vodiylik made their first contribution in #2647

Full Changelog: v0.14.0...v0.14.1

Example text to speech using Fairseq models in ~1100 languages 🤯.

For these models use the following name format: tts_models/<lang-iso_code>/fairseq/vits.

You can find the list of language ISO codes here and learn about the Fairseq models here.
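
The name format above can be sketched as a small helper that fills in the language code. The function name `fairseq_model_name` is hypothetical and not part of 🐸TTS, and the helper does not check that a model actually exists for the given code.

```python
def fairseq_model_name(lang_iso_code: str) -> str:
    """Build a model name of the form tts_models/<lang-iso_code>/fairseq/vits."""
    code = lang_iso_code.strip()
    if not code:
        raise ValueError("language ISO code must not be empty")
    return f"tts_models/{code}/fairseq/vits"


if __name__ == "__main__":
    print(fairseq_model_name("eng"))
```

The returned string can be passed as `model_name` to `TTS(...)`.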

from TTS.api import TTS
api = TTS(model_name="tts_models/eng/fairseq/vits", gpu=True)
api.tts_to_file("This is a test.", file_path="output.wav")

# TTS with on-the-fly voice conversion
api = TTS("tts_models/deu/fairseq/vits")
api.tts_with_vc_to_file(
    "Wie sage ich auf Italienisch, dass ich dich liebe?",
    speaker_wav="target/speaker.wav",
    file_path="output.wav"
)

Contributors

reuben, erogol, and vodiylik

16 May 08:09

erogol

v0.14.0

bc0a532

v0.14.0

What's Changed

Typos and minor fixes by @prakharpbuf in #2508
Add FR and ES gruut languages as requirement to avoid inference issues by @Edresson in #2572
Lighter docker image by @WeberJulian in #2600
Use default_factory for audio parameter by @v4hn in #2576
Update README.md by @HighnessAtharva in #2577
Add Jenny model by @erogol in #2603
Warn when lang is not avail by @erogol in #2460
Update VAD for silence trimming. by @erogol in #2604
Tortoise TTS inference by @manmay-nakhashi in #2547
Draft ONNX export for VITS by @erogol in #2563

New Contributors

@prakharpbuf made their first contribution in #2508
@v4hn made their first contribution in #2576
@HighnessAtharva made their first contribution in #2577

Full Changelog: v0.13.3...v0.14.0

Contributors

v4hn, erogol, and 5 other contributors

27 Apr 14:39

erogol

v0.14.1_models

ba40a1c

v0.14.1_models Pre-release

Update README.md (#2577)

Update link to point to blob instead of edit

08 May 10:02

erogol

v0.14.0_models

c1875f6

v0.14.0_models Pre-release

Jenny VITS model trained by 👑@noml4u

tts --model_name tts_models/en/jenny/jenny --text "This is a test. This is also a test."

Contributors

noml4u

17 Apr 14:15

erogol

v0.13.3

2071088

🐶 v0.13.3

What's Changed

Bangla models by @erogol in #2532

Full Changelog: v0.13.2...v0.13.3

Contributors

erogol

17 Apr 11:47

erogol

v0.13.3_models

e4c5c27

v0.13.3_models

Single speaker Bangla Male/Female models

These are single-speaker VITS models with a 22050 Hz sampling rate.

By 👑 @mobassir94
Original repo: https://github.com/mobassir94/comprehensive-bangla-tts

Male Model

tts --model_name tts_models/bn/custom/vits-male --text "এটি ডেমো করার উদ্দেশ্যে একটি ডেমো"

from TTS.api import TTS

tts = TTS(model_name="tts_models/bn/custom/vits-male")
tts.tts_to_file(text="এটি ডেমো করার উদ্দেশ্যে একটি ডেমো", file_path="output.wav")

# TTS with voice conversion to a reference speaker in `target_speaker.wav`
tts.tts_with_vc_to_file(text="এটি ডেমো করার উদ্দেশ্যে একটি ডেমো", speaker_wav="target_speaker.wav", file_path="output.wav")

Female Model

tts --model_name tts_models/bn/custom/vits-female --text "এটি ডেমো করার উদ্দেশ্যে একটি ডেমো"

from TTS.api import TTS

tts = TTS(model_name="tts_models/bn/custom/vits-female")
tts.tts_to_file(text="এটি ডেমো করার উদ্দেশ্যে একটি ডেমো", file_path="output.wav")

# TTS with voice conversion to a reference speaker in `target_speaker.wav`
tts.tts_with_vc_to_file(text="এটি ডেমো করার উদ্দেশ্যে একটি ডেমো", speaker_wav="target_speaker.wav", file_path="output.wav")

Contributors

mobassir94

14 Apr 08:48

erogol

v0.13.2

b3b4034

🌈v0.13.2

What's Changed

🐸Studio models by tts by @erogol in #2515
Update VAD by @erogol in #2509
🌈 v0.13.2 by @erogol in #2519

Full Changelog: v0.13.1...v0.13.2

Contributors

erogol

Releases: coqui-ai/TTS