For a better user experience, refer to the official web documentation -> Text-to-speech

Text-to-speech

The text-to-speech (TTS) task converts text into speech and is widely used in speech-interactive devices.

  • Recommended Models

| Model Name | Model Introduction |
| --- | --- |
| Text-to-speech transformer_tts_ljspeech | TransformerTTS combines Transformer and Tacotron2 and produces satisfactory results. It is an English TTS model and supports prediction only. |
| Text-to-speech fastspeech_ljspeech | FastSpeech predicts pronunciation durations from the diagonal attention alignments extracted from a teacher model with an encoder-decoder structure. It is an English TTS model and supports prediction only. |
| Text-to-speech deepvoice3_ljspeech | Deep Voice 3 is an end-to-end TTS model released by Baidu Research Institute in 2017 (paper accepted at ICLR 2018). It is a seq2seq model based on convolutional neural networks and an attention mechanism. It is an English TTS model and supports prediction only. |
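
The FastSpeech entry mentions duration prediction from a teacher model's attention alignment. As a rough illustration of that idea only (not the code of the fastspeech_ljspeech module), the sketch below counts, for each input token, how many output frames attend to it most strongly. The function name `durations_from_attention` and the toy attention matrix are hypothetical and chosen for this example.

```python
from typing import List


def durations_from_attention(attention: List[List[float]]) -> List[int]:
    """Derive per-token durations from an attention alignment.

    attention[t][i] is the weight that output frame t places on input token i.
    Each frame is assigned to the token with the largest weight, and a token's
    duration is the number of frames assigned to it.
    """
    if not attention:
        return []
    num_tokens = len(attention[0])
    durations = [0] * num_tokens
    for frame_weights in attention:
        # The token with the largest weight "owns" this frame.
        best_token = max(range(num_tokens), key=lambda i: frame_weights[i])
        durations[best_token] += 1
    return durations


# Hypothetical toy alignment: 5 output frames over 3 input tokens,
# roughly diagonal as expected from a well-trained teacher model.
toy_attention = [
    [0.9, 0.1, 0.0],
    [0.7, 0.3, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.4, 0.6],
    [0.0, 0.1, 0.9],
]

durations = durations_from_attention(toy_attention)
print(durations)  # [2, 1, 2]
```

The durations always sum to the number of output frames, which is what allows a model trained on them to expand each input token to the right number of frames.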