Adding TTS Tutorials #1584

Merged
merged 7 commits into from
Jun 2, 2022
198 changes: 198 additions & 0 deletions notebooks/use-pretrained-TTS.ipynb
@@ -0,0 +1,198 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "45ea3ef5",
"metadata": {
"tags": []
},
"source": [
"# Easy inferencing with 🐸 TTS ⚡\n",
"\n",
"Want to quickly synthesize speech using a Coqui (🐸) TTS model?\n",
"\n",
"💡: Grab a pre-trained model and use it to synthesize speech using any speaker voice, including yours! ⚡\n",
"\n",
"🐸 TTS comes with a list of pretrained models and speaker voices. You can even start a local demo server, open it in your favorite web browser, and 🗣️.\n",
"\n",
"\n",
"In this notebook, we will:\n",
"\n",
"1. Install 🐸 TTS.\n",
"2. List the available pre-trained models.\n",
"3. Download a pre-trained English TTS model and synthesize speech with it.\n",
"4. Synthesize speech with a specific speaker of a multi-speaker model.\n",
"\n",
"\n",
"So, let's jump right in!\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fa2aec77",
"metadata": {},
"outputs": [],
"source": [
"## Install Coqui TTS\n",
"! pip install -U pip\n",
"! pip install TTS"
]
},
{
"cell_type": "markdown",
"id": "8c07a273",
"metadata": {},
"source": [
"## ✅ List available pre-trained 🐸 TTS models\n",
"\n",
"Coqui 🐸TTS comes with a list of pretrained models that differ by model type (e.g., TTS, vocoder), language, training dataset, and architecture.\n",
"\n",
"You can use either your own model or one of the models released under 🐸TTS.\n",
"\n",
"Use `tts --list_models` to see the available models.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "608d203f",
"metadata": {},
"outputs": [],
"source": [
"! tts --list_models"
]
},
{
"cell_type": "markdown",
"id": "ed9dd7ab",
"metadata": {},
"source": [
"## ✅ Run a TTS model\n",
"\n",
"### **First things first**: Using a released model and its default vocoder\n",
"\n",
"#### Copy the full model name from the list above and pass it to `--model_name`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cc9e4608-16ec-4dcd-bd6b-bd10d62286f8",
"metadata": {},
"outputs": [],
"source": [
"!tts --text \"hello world\" \\\n",
"--model_name \"tts_models/en/ljspeech/glow-tts\" \\\n",
"--out_path output.wav\n"
]
},
{
"cell_type": "markdown",
"id": "0ca2cb14-1aba-400e-a219-8ce44d9410be",
"metadata": {},
"source": [
"## 📣 Listen to the synthesized wave 📣"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5fe63ef4-9284-4461-9dda-1ca7483a8f9b",
"metadata": {},
"outputs": [],
"source": [
"import IPython\n",
"IPython.display.Audio(\"output.wav\")"
]
},
{
"cell_type": "markdown",
"id": "5e67d178-1ebe-49c7-9a47-0593251bdb96",
"metadata": {},
"source": [
"### **Second things second**:\n",
"\n",
"#### To run a multi-speaker model from the released models list, first check the available speaker IDs with the `--list_speaker_idxs` flag, then pass one of those IDs to synthesize speech in that speaker's voice."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "87b18839-f750-4a61-bbb0-c964acaecab2",
"metadata": {},
"outputs": [],
"source": [
"# list the possible speaker IDs.\n",
"!tts --model_name \"tts_models/en/vctk/vits\" \\\n",
"--list_speaker_idxs \n"
]
},
{
"cell_type": "markdown",
"id": "c4365a9d-f922-4b14-88b0-d2b22a245b2e",
"metadata": {},
"source": [
"## 💬 Synthesize speech using speaker ID 💬"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "52be0403-d13e-4d9b-99c2-c10b85154063",
"metadata": {},
"outputs": [],
"source": [
"!tts --text \"Trying out specific speaker voice\" \\\n",
"--out_path spkr-out.wav --model_name \"tts_models/en/vctk/vits\" \\\n",
"--speaker_idx \"p341\""
]
},
{
"cell_type": "markdown",
"id": "894a560a-f9c8-48ce-aaa6-afdf516c01f6",
"metadata": {},
"source": [
"## 📣 Listen to the synthesized speaker-specific wave 📣"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ed485b0a-dfd5-4a7e-a571-ebf74bdfc41d",
"metadata": {},
"outputs": [],
"source": [
"import IPython\n",
"IPython.display.Audio(\"spkr-out.wav\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c9116400-aff7-4a04-810f-7f89e66d2950",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
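
The `tts` invocations in the notebook above can also be assembled from plain Python rather than `!` shell cells, which sidesteps line-continuation and quoting mistakes. The following is only a minimal sketch: `build_tts_command` and `synthesize` are hypothetical helper names, not part of 🐸 TTS, and the only CLI flags used are the ones that already appear in the notebook (`--text`, `--model_name`, `--out_path`, `--speaker_idx`). It assumes the `tts` executable is on `PATH` after `pip install TTS`.

```python
import shlex
import subprocess
from typing import List, Optional


def build_tts_command(
    text: str,
    model_name: str,
    out_path: str,
    speaker_idx: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for the `tts` CLI (hypothetical helper)."""
    cmd = ["tts", "--text", text, "--model_name", model_name, "--out_path", out_path]
    if speaker_idx is not None:
        # Only pass a speaker ID for multi-speaker models such as the VCTK VITS model.
        cmd += ["--speaker_idx", speaker_idx]
    return cmd


def synthesize(
    text: str,
    model_name: str,
    out_path: str,
    speaker_idx: Optional[str] = None,
) -> str:
    """Run the `tts` CLI and return the output path (assumes TTS is installed)."""
    cmd = build_tts_command(text, model_name, out_path, speaker_idx)
    # Passing a list avoids shell quoting issues; check=True raises on a non-zero exit.
    subprocess.run(cmd, check=True)
    return out_path


if __name__ == "__main__":
    # Print the equivalent shell command without running it.
    cmd = build_tts_command("hello world", "tts_models/en/ljspeech/glow-tts", "output.wav")
    print(shlex.join(cmd))
```

Calling `synthesize("Trying out specific speaker voice", "tts_models/en/vctk/vits", "spkr-out.wav", speaker_idx="p341")` would mirror the multi-speaker cell; the resulting file can then be played with `IPython.display.Audio` as in the notebook.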