😄 v0.15.0
What's Changed
- Update stochastic_duration_predictor.py by @mengting7tw in #2663
- Fix Tortoise load by @erogol in #2697
- Inference API for 🐶Bark by @erogol in #2685
- Drop Python 3.7 and 3.8 and stage Python 3.11 by @erogol in #2700
Running 🐶Bark
text = "Hello, my name is Manmay, how are you?"
from TTS.tts.configs.bark_config import BarkConfig
from TTS.tts.models.bark import Bark
config = BarkConfig()
model = Bark.init_from_config(config)
model.load_checkpoint(config, checkpoint_dir="path/to/model/dir/", eval=True)
# With a random speaker
output_dict = model.synthesize(text, config, speaker_id="random", voice_dirs=None)
# Cloning a speaker.
# This assumes a speaker file exists at `bark_voices/speaker_n/speaker.wav` or `bark_voices/speaker_n/speaker.npz`,
# where `speaker_n` is the name passed as `speaker_id` (here `ljspeech`).
output_dict = model.synthesize(text, config, speaker_id="ljspeech", voice_dirs="bark_voices/")
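The snippet above stops at `output_dict`. As a minimal sketch of writing the result to disk, the helper below saves a flat sequence of float samples as a 16-bit mono WAV using only the standard library. It assumes the waveform is stored under a `"wav"` key as 1-D samples in [-1, 1] and that Bark's output rate is 24 kHz; neither is stated in these notes, so check both against your installed version. `save_wav` is a hypothetical helper, not part of 🐸TTS.

```python
import struct
import wave


def save_wav(samples, path, sample_rate=24000):
    """Write a flat (1-D) sequence of float samples to a 16-bit mono WAV file.

    sample_rate=24000 is an assumption about Bark's output rate.
    Returns the number of frames written.
    """
    # Clip to [-1, 1] so out-of-range values do not overflow int16.
    clipped = [max(-1.0, min(1.0, float(s))) for s in samples]
    # Scale to signed 16-bit integers and pack as little-endian PCM.
    pcm = struct.pack(f"<{len(clipped)}h", *(int(s * 32767) for s in clipped))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return len(clipped)


# Hypothetical usage, assuming the waveform lives under the "wav" key:
# save_wav(output_dict["wav"], "bark_output.wav")
```

If the stored waveform has more than one dimension, flatten it before passing it in.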
Using 🐸TTS API:
from TTS.api import TTS
# Load the model onto the GPU.
# Bark is really slow on CPU, so we recommend using a GPU.
tts = TTS("tts_models/multilingual/multi-dataset/bark", gpu=True)
# Cloning a new speaker
# This expects to find an mp3 or wav file like `bark_voices/new_speaker/speaker.wav`
# It computes the cloning values and stores them in `bark_voices/new_speaker/speaker.npz`
tts.tts_to_file(text="Hello, my name is Manmay, how are you?",
                file_path="output.wav",
                voice_dir="bark_voices/",
                speaker="ljspeech")
# When you run it again it uses the stored values to generate the voice.
tts.tts_to_file(text="Hello, my name is Manmay, how are you?",
                file_path="output.wav",
                voice_dir="bark_voices/",
                speaker="ljspeech")
# Random speaker: omit `voice_dir` and `speaker`
tts = TTS("tts_models/multilingual/multi-dataset/bark", gpu=True)
tts.tts_to_file("hello world", file_path="out.wav")
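Since cloning depends on the `voice_dir/<speaker>/speaker.{wav,mp3,npz}` layout described in the comments above, it can help to inspect a voice directory before running synthesis. The following is a rough sketch using only the standard library; `list_bark_voices` is a hypothetical helper, not part of 🐸TTS, and it only checks file names, not file contents.

```python
from pathlib import Path


def list_bark_voices(voice_dir):
    """Report which speaker files exist in each sub-directory of voice_dir.

    Returns a dict mapping speaker name -> flags for a source audio file
    (speaker.wav / speaker.mp3) and cached cloning values (speaker.npz).
    """
    voices = {}
    for speaker_dir in sorted(Path(voice_dir).iterdir()):
        if not speaker_dir.is_dir():
            continue  # stray files at the top level are not speakers
        # Collect the extensions of every file named speaker.* in this folder.
        suffixes = {p.suffix.lower() for p in speaker_dir.glob("speaker.*")}
        voices[speaker_dir.name] = {
            "has_audio": bool(suffixes & {".wav", ".mp3"}),
            "has_cached_npz": ".npz" in suffixes,
        }
    return voices


# Hypothetical usage:
# for name, info in list_bark_voices("bark_voices/").items():
#     print(name, info)
```

A speaker with neither flag set has nothing to clone from, so passing its name as `speaker` is unlikely to work.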
Using 🐸TTS Command line:
# cloning the `ljspeech` voice
tts --model_name tts_models/multilingual/multi-dataset/bark \
    --text "This is an example." \
    --out_path "output.wav" \
    --voice_dir bark_voices/ \
    --speaker_idx "ljspeech" \
    --progress_bar True
# Random voice generation
tts --model_name tts_models/multilingual/multi-dataset/bark \
    --text "This is an example." \
    --out_path "output.wav" \
    --progress_bar True
Full Changelog: v0.14.3...v0.15.0