Adding TTS Tutorials #1584
Discussed feedback on Element. Looking good 🚀
Looking good, but notebooks are not testable. So far, none of the notebooks we released as tutorials could be maintained. We need a way to run this notebook in the CI tests.
They are testable; we test our notebooks in the STT CI. We can probably copy that setup and adapt it.
Can you link me to where that is in the STT CI?
@Aya-AlJafari I see you are still committing. Should I wait for more?
Sorry, I shared this with Aya on chat but forgot to add it here. The STT notebook CI I referred to in the PR is here and here.
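As a rough illustration of what a CI check for a tutorial notebook could look like, the sketch below extracts the code cells from a notebook file and verifies that each one compiles. It uses only the Python standard library and is not the STT CI setup mentioned above; the helper names (`extract_code_cells`, `strip_ipython_lines`, `check_cells_compile`) are hypothetical, and it assumes the nbformat 4 JSON layout with a top-level `cells` list. IPython shell calls (`!`) and magics (`%`) are replaced with `pass` because they are not plain Python.

```python
import json
from typing import List, Tuple


def extract_code_cells(notebook_json: str) -> List[str]:
    """Return the source of every code cell in a notebook (nbformat 4 JSON)."""
    nb = json.loads(notebook_json)
    cells = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat stores the source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        cells.append(source)
    return cells


def strip_ipython_lines(source: str) -> str:
    """Replace shell calls (!) and magics (%) with `pass` so the cell compiles."""
    cleaned = []
    skipping = False
    indent = ""
    for line in source.splitlines():
        stripped = line.lstrip()
        if not skipping and stripped.startswith(("!", "%")):
            skipping = True
            indent = line[: len(line) - len(stripped)]
        if skipping:
            cleaned.append(indent + "pass  # skipped IPython line")
            # A trailing backslash continues the shell call on the next line.
            skipping = line.rstrip().endswith("\\")
        else:
            cleaned.append(line)
    return "\n".join(cleaned)


def check_cells_compile(notebook_json: str) -> List[Tuple[int, str]]:
    """Compile each code cell; return (cell_index, error) for every failure."""
    failures = []
    for index, source in enumerate(extract_code_cells(notebook_json)):
        try:
            compile(strip_ipython_lines(source), f"<cell {index}>", "exec")
        except SyntaxError as err:
            failures.append((index, str(err)))
    return failures
```

A compile check only catches syntax errors. Executing the notebook in CI (for example with `jupyter nbconvert --to notebook --execute`) is a stronger test, but it needs the dataset and a short training configuration to be available on the CI machine.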
"\n", | ||
"So, let's jump right in!\n", | ||
"\n", | ||
"*PS - If you just want a working, off-the-shelf model, check out the [🐸 Model Zoo](https://www.coqui.ai/models)*" |
The Model Zoo doesn't have TTS models.
"\n", | ||
"If you have a single audio file and you need to **split** it into clips. It is also important to use a lossless audio file format to prevent compression artifacts. We recommend using **wav** file format.\n", | ||
"\n", | ||
"The data format we will be adopting for this tutorial is taken from widely-used the **LJSpeech** dataset, where **waves** are collected under a folder:\n", |
"The data format we will be adopting for this tutorial is taken from widely-used the **LJSpeech** dataset, where **waves** are collected under a folder:\n", | |
"The data format we will be adopting for this tutorial is taken from the widely-used **LJSpeech** dataset, where **waves** are collected under a folder:\n", |
"\n", | ||
"### **First things first**: we need some data.\n", | ||
"\n", | ||
"We're training a Text-to-Speech model, so we need some _text_ and we need some _speech_. Specificially, we want _transcribed speech_. The speech must be divided into audio clips and each clip needs transcription. \n", |
There are also many other requirements in terms of recording characteristics, background noise, vocabulary coverage, etc. Even if going into detail is not appropriate here, we should at least link to more extensive documentation.
"<span style=\"color:purple;font-size:15px\">\n", | ||
"/wavs<br /> \n", | ||
"  | - audio1.wav<br /> \n", | ||
"  | - udio2.wav<br /> \n", |
"  | - udio2.wav<br /> \n", | |
"  | - audio2.wav<br /> \n", |
" ...<br /> \n", | ||
"</span>\n", | ||
"\n", | ||
"and a **metdata.txt** file will have the audioname in parallel to the transcript, delimeted by `|`: \n", |
"and a **metdata.txt** file will have the audioname in parallel to the transcript, delimeted by `|`: \n", | |
"and a **metadata.txt** file will have the audio file name in parallel to the transcript, delimited by `|`: \n", |
"## ⏳️ Loading your dataset\n", | ||
"Load one of the dataset supported by 🐸TTS.\n", | ||
"\n", | ||
"For this tutorial we will be using LJSpeech dataset.\n", |
This was already said above.
" os.makedirs(output_path)\n", | ||
"\n", | ||
"dataset_config = BaseDatasetConfig(\n", | ||
" name=\"ljspeech\", meta_file_train=\"metadata.csv\", path=os.path.join(output_path, \"LJSpeech-1.1/\")\n", |
In the examples above, the metadata file has a `.txt` extension.
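To make the `name|transcript` metadata layout discussed in this thread concrete, here is a minimal parsing sketch. It is not the loader 🐸TTS uses; `parse_metadata` is a hypothetical helper, and it assumes that the first field is the clip name (with or without `.wav`), the second field is the transcript, and any further fields can be ignored.

```python
from pathlib import Path
from typing import List, Tuple


def parse_metadata(text: str, wav_dir: str = "wavs") -> List[Tuple[Path, str]]:
    """Parse `name|transcript` lines into (wav_path, transcript) pairs."""
    pairs = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("|")
        if len(fields) < 2:
            raise ValueError(f"line {line_number}: expected 'name|transcript'")
        name, transcript = fields[0].strip(), fields[1].strip()
        # Assumption: clip names may be listed with or without the extension.
        if not name.endswith(".wav"):
            name += ".wav"
        pairs.append((Path(wav_dir) / name, transcript))
    return pairs
```

Whatever file name the tutorial settles on, the same name has to be passed as `meta_file_train`, which is the mismatch this comment points out.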
"dataset_config = BaseDatasetConfig(\n", | ||
" name=\"ljspeech\", meta_file_train=\"metadata.csv\", path=os.path.join(output_path, \"LJSpeech-1.1/\")\n", | ||
")\n", | ||
"# You need to download LJSpeech inside output_path\n" |
We should make the notebook do this instead of asking people to do it.
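One way the notebook could fetch the dataset itself is sketched below, using only the standard library. `download_and_extract` is a hypothetical helper, and `LJSPEECH_URL` is an assumption (the commonly used LJSpeech download location); both should be checked before being added to the tutorial. The function skips the download when the extracted folder already exists, so re-running the cell is cheap.

```python
import os
import tarfile
import urllib.request

# Assumption: commonly used LJSpeech download location; verify before use.
LJSPEECH_URL = "https://data.keithito.com/data/speech/LJSpeech-1.1.tar.bz2"


def download_and_extract(url: str, output_path: str, expected_dir: str) -> str:
    """Download a tar archive into output_path and unpack it once.

    Returns the path of the extracted folder (output_path/expected_dir).
    """
    target = os.path.join(output_path, expected_dir)
    if os.path.isdir(target):
        return target  # already downloaded and extracted
    os.makedirs(output_path, exist_ok=True)
    archive = os.path.join(output_path, os.path.basename(url))
    if not os.path.isfile(archive):
        urllib.request.urlretrieve(url, archive)
    # tarfile detects the compression (bz2 here) automatically.
    # Only extract archives from a source you trust.
    with tarfile.open(archive) as tar:
        tar.extractall(output_path)
    return target
```

In the notebook this could be called as `download_and_extract(LJSPEECH_URL, output_path, "LJSpeech-1.1")` before building `dataset_config`. The archive is large, so a CI run would likely want a much smaller test dataset instead.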
" --model_path $test_ckpt \\\n", | ||
" --config_path $test_config \\\n", |
Jupyter lets you access Python variables in inline shell calls like this, so you don't have to set them in `os.environ` above; just create normal Python variables `test_ckpt` and `test_config`.
"metadata": {}, | ||
"source": [ | ||
"## 🎉 Congratulations! 🎉 You now have trained your first TTS model! \n", | ||
"Follow up with the next tutorials to learn more adnavced material." |
"Follow up with the next tutorials to learn more adnavced material." | |
"Follow up with the next tutorials to learn more advanced material." |
@erogol Yes, I will be adding one more tutorial today.
@TrycsPublic Interesting way to send commits :) How about sending a PR? It is challenging to see what you changed this way.
You can even make a PR for another PR by setting the base branch to
No description provided.