Best practices for calling the TTS service in server scenarios
In many scenarios, you may want to call the Azure TTS service from the server side (e.g. a website backend). Here are some suggestions to improve performance in such scenarios.
TTS service latency is determined by synthesis time and network latency.
For synthesis time, longer text usually takes longer to synthesize. Using a streaming output format helps with long text, because audio can be consumed as it is produced. If you have long text, it is also useful to send it to the service sentence by sentence to reduce latency.
For network latency, establishing an HTTP connection usually takes time. When possible, use persistent connections and reuse the connection for multiple requests.
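Sending text sentence by sentence requires splitting it first. The following is a minimal sketch of one way to do that; `SentenceSplitter` and its terminator sets are hypothetical helpers written for this illustration, not part of the Speech SDK, and the rules are deliberately simple (abbreviations such as "e.g." would still be split).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not part of the Speech SDK): splits text into
// sentences so that each one can be sent as a separate synthesis request.
public static class SentenceSplitter
{
    // Assumption: these terminators end a sentence only when followed by
    // whitespace or the end of the text (so "3.14" is not split).
    private static readonly char[] SpacedTerminators = { '.', '!', '?' };

    // Assumption: these full-width terminators always end a sentence.
    private static readonly char[] WideTerminators = { '。', '！', '？' };

    public static List<string> Split(string text)
    {
        var sentences = new List<string>();
        var current = new StringBuilder();

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            current.Append(c);

            bool atEnd = i == text.Length - 1;
            bool spacedEnd = Array.IndexOf(SpacedTerminators, c) >= 0
                             && (atEnd || char.IsWhiteSpace(text[i + 1]));
            bool wideEnd = Array.IndexOf(WideTerminators, c) >= 0;

            if (spacedEnd || wideEnd)
            {
                Flush(current, sentences);
            }
        }

        // Keep any trailing text that has no terminator.
        Flush(current, sentences);
        return sentences;
    }

    private static void Flush(StringBuilder current, List<string> sentences)
    {
        string sentence = current.ToString().Trim();
        if (sentence.Length > 0)
        {
            sentences.Add(sentence);
        }
        current.Clear();
    }
}
```

Each returned sentence can then be passed to a separate synthesis call, so playback of the first sentence can begin while later ones are still being synthesized.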
We recommend using the Speech SDK to call the TTS service.
Each synthesizer has its own HTTP/WebSocket connection, so reusing synthesizers reduces latency: there is no need to establish a new connection for each synthesis request.
You can use an object pool to manage the synthesizers.
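An object pool for synthesizers could look roughly like the sketch below. `SynthesizerPool<T>` is a hypothetical class written for this illustration; it is generic so that it compiles without the SDK, with `T` standing in for `SpeechSynthesizer`. The `maxIdle` bound is an assumed tuning parameter.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical pool (not part of the Speech SDK). T stands in for the
// SDK's SpeechSynthesizer, which owns one connection per instance.
public sealed class SynthesizerPool<T> : IDisposable where T : class, IDisposable
{
    private readonly ConcurrentBag<T> _idle = new ConcurrentBag<T>();
    private readonly Func<T> _factory;
    private readonly int _maxIdle;

    public SynthesizerPool(Func<T> factory, int maxIdle = 8)
    {
        _factory = factory ?? throw new ArgumentNullException(nameof(factory));
        _maxIdle = maxIdle;
    }

    // Returns an idle instance if one exists, otherwise creates a new one.
    public T Rent()
    {
        return _idle.TryTake(out var item) ? item : _factory();
    }

    // Puts an instance back for reuse. The count check is not atomic, so
    // maxIdle is an approximate bound; surplus instances are disposed.
    public void Return(T item)
    {
        if (_idle.Count < _maxIdle)
        {
            _idle.Add(item);
        }
        else
        {
            item.Dispose();
        }
    }

    // Disposes all idle instances; rented instances remain the caller's job.
    public void Dispose()
    {
        while (_idle.TryTake(out var item))
        {
            item.Dispose();
        }
    }
}
```

With the Speech SDK, the factory would presumably be something like `() => new SpeechSynthesizer(speechConfig, null as AudioConfig)`, where `speechConfig` is your own configuration object. Wrapping `Rent` and `Return` in `try`/`finally` keeps synthesizers from leaking; it may also be sensible to dispose a synthesizer instead of returning it after a failed request, since its connection could be in a bad state.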
You can subscribe to the Synthesizing event or use AudioDataStream to receive the audio asynchronously in streaming mode.
// Option 1: subscribe to the Synthesizing event.
synthesizer.Synthesizing += (s, e) =>
{
    // Receive each audio chunk here as it arrives.
    Console.WriteLine($"Synthesizing event received with audio chunk of {e.Result.AudioData.Length} bytes.");
};

// Option 2: wrap a synthesis result in an AudioDataStream.
// Here, result is the value returned by a synthesis call such as SpeakTextAsync.
using (var audioDataStream = AudioDataStream.FromResult(result))
{
    // You can save all the data in the audio data stream to a file.
    string fileName = "outputaudio.wav";
    await audioDataStream.SaveToWaveFileAsync(fileName);
    Console.WriteLine($"Audio data was saved to [{fileName}]");

    // You can also read data from the audio data stream and process it in memory.
    // Reset the stream position to the beginning, since saving to a file moves the position to the end.
    audioDataStream.SetPosition(0);

    byte[] buffer = new byte[16000];
    uint totalSize = 0;
    uint filledSize = 0;

    // ReadData returns the number of bytes written to the buffer, or 0 at the end of the stream.
    while ((filledSize = audioDataStream.ReadData(buffer)) > 0)
    {
        Console.WriteLine($"{filledSize} bytes received.");
        totalSize += filledSize;
    }

    Console.WriteLine($"{totalSize} bytes of audio data received");
}
For more details, see our samples in C#.
If you call the REST API directly, try the following steps:
- Establish the connection before posting the actual content (for example, with a warm-up request).
- Reuse the HTTP connection. For example, in C#, reuse a single HttpClient object for all requests instead of creating a new one each time.
- You need an auth token to call the TTS REST API. You can fetch and refresh the token asynchronously in a background thread so that a valid token is always ready.
- Use streaming to receive the synthesized audio. In C#, refer to this.
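The connection-reuse and token-refresh steps above can be sketched as follows. `TokenCache` and `TtsHttp` are hypothetical names for this illustration. The token request itself is passed in as a delegate, so the sketch makes no assumption about the endpoint or headers; the refresh interval is likewise left to the caller.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: keeps an auth token ready by refreshing it on a timer,
// so request handlers never wait for a token fetch.
public sealed class TokenCache : IDisposable
{
    private readonly Func<Task<string>> _fetchToken;
    private readonly Timer _timer;
    private volatile string _token;

    private TokenCache(Func<Task<string>> fetchToken, string initialToken, TimeSpan refreshInterval)
    {
        _fetchToken = fetchToken;
        _token = initialToken;
        // The timer callback runs on a thread-pool thread, off the request path.
        _timer = new Timer(_ => { _ = RefreshAsync(); }, null, refreshInterval, refreshInterval);
    }

    // Fetches the first token up front so Current is valid as soon as this returns.
    public static async Task<TokenCache> CreateAsync(Func<Task<string>> fetchToken, TimeSpan refreshInterval)
    {
        string initial = await fetchToken().ConfigureAwait(false);
        return new TokenCache(fetchToken, initial, refreshInterval);
    }

    // The most recently fetched token.
    public string Current => _token;

    public async Task RefreshAsync()
    {
        try
        {
            _token = await _fetchToken().ConfigureAwait(false);
        }
        catch (Exception)
        {
            // Keep the previous token; the next timer tick retries.
        }
    }

    public void Dispose() => _timer.Dispose();
}

public static class TtsHttp
{
    // One shared HttpClient for the whole process, so connections are pooled and reused.
    public static readonly HttpClient Client = new HttpClient();
}
```

In a real backend, the delegate would use `TtsHttp.Client` to request a token from the service's token endpoint, and the refresh interval would be set somewhat shorter than the token lifetime stated in the service documentation. For the streaming step, `HttpClient.SendAsync` accepts `HttpCompletionOption.ResponseHeadersRead`, which returns as soon as the headers arrive so the audio body can be read from the response stream incrementally.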