Streamlit component providing some tools to convert text to speech and autoplay the audio directly in the browser in an invisible iframe.
pip install streamlit-TTS
Three functions are provided:
auto_play(audio,wait=True,lag=0.25,key=None)
Plays some audio directly in the browser without showing an audio player.
The iframe containing the component is made invisible and shouldn't mess with the app's layout.
audio
must be a dict with the following structure:
audio={
'bytes':bytes, # wav audio bytes
'sample_rate':sample_rate, # the sample rate of the audio
'sample_width':sample_width, # the sample width of the audio
... # possibly other keys that will be ignored by the function
}
wait
determines whether the component should pause execution of the python script until the audio has finished playing (avoids putting another audio on play while the previous one is still playing). Since there is no easy way, given the constraints of streamlit implementation, to get instant information from the front-end whether it has finished playing or not, the waiting time is calculated based on the duration of the audio passed. Works generally well, but can sometimes give poor results, depending on the communication lag between front end and back end. lag
(in seconds) allows to adapt to this lag by waiting a bit more time to let the audio finish playing before resuming execution.
audio=text_to_audio(text,language='en',cleanup_hook=None)
Converts some text in the chosen language into spoken audio (uses gTTS and pydub).
A default cleanup of the text is being implemented before passing it to vocal synthesis (Markdown syntax elements, LaTeX formulas, links, long code snippets are removed). But you can pass your own as the cleanup_hook
if you want more control.
The audio returned is a dictionary with the same structure as accepted by auto_play
.
(This function is provided for convenience, it's not a streamlit component.)
text_to_speech(text,language='en',cleanup_hook=None,wait=True,lag=0.25,key=None)
Builds on the first two functions to convert text into speech and play the audio directly.
import streamlit as st
from streamlit_TTS import auto_play, text_to_speech, text_to_audio
from gtts.lang import tts_langs
langs=tts_langs().keys()
#get the audio first
audio=text_to_audio("Choose a language, type some text, and click 'Speak it out!'.",language='en')
#then play it
auto_play(audio)
lang=st.selectbox("Choose a language",options=langs)
text=st.text_input("Choose a text to speak out:")
speak=st.button("Speak it out!")
if lang and text and speak:
#plays the audio directly
text_to_speech(text=text, language=lang)