How to call multiple voices in SSML
Customers may want to use multiple voices in one SSML document to deliver engaging experiences such as role-play storytelling. Azure TTS supports combining multiple voices with SSML.
To use multiple prebuilt voices, compose the SSML so that each passage is wrapped in a `<voice>` element that names the voice to use.
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
<voice name="en-US-AriaNeural">
This is the text that is spoken.
</voice>
<voice name="en-US-GuyNeural">
This is the text that is spoken.
</voice>
</speak>
Apart from that, the request is made in the same way as for SSML with a single voice.
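As a rough illustration of composing such a document in code, the sketch below builds a multi-voice SSML string from a list of (voice name, text) pairs using Python's standard `xml.etree.ElementTree`. The helper name `build_multi_voice_ssml` is hypothetical and not part of any Azure SDK. The namespace is the W3C SSML namespace URI, `http://www.w3.org/2001/10/synthesis`.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_multi_voice_ssml(segments, lang="en-US"):
    """Return an SSML string with one <voice> element per (voice_name, text) pair.

    Hypothetical helper for illustration only.
    """
    # Serialize the SSML namespace as the default namespace (no prefix).
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    for voice_name, text in segments:
        voice = ET.SubElement(speak, f"{{{SSML_NS}}}voice", {"name": voice_name})
        # ElementTree escapes characters such as & and < when serializing.
        voice.text = text
    return ET.tostring(speak, encoding="unicode")


ssml = build_multi_voice_ssml([
    ("en-US-AriaNeural", "Once upon a time, there was a fox."),
    ("en-US-GuyNeural", "And the fox said hello."),
])
```

Building the document with an XML library instead of string concatenation keeps the text properly escaped, which matters when the spoken text comes from user input.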
For custom voices, the endpoint URL currently needs to include the custom voice deployment ID. Refer to:
https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/regions#custom-voices
To access multiple custom voices in one SSML document like the one above, each voice needs to be deployed to its own endpoint. Then pass multiple `deploymentId` parameters in the endpoint URL to specify the voices needed.
For example, suppose three custom voices are deployed to three endpoints:
Voice Name | Endpoint URL |
---|---|
VoiceA | https://eastasia.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId=44aa21a9-56cb-4959-b4c8-91a14a68b0b2 |
VoiceB | https://eastasia.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId=da90e300-3f79-462e-85cc-dac44b44ad33 |
VoiceC | https://eastasia.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId=3655bafe-073a-4291-aae7-7d2e7160b0f6 |
Combine their `deploymentId` values into one URL to access all three voices through a single endpoint:
https://eastasia.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId=44aa21a9-56cb-4959-b4c8-91a14a68b0b2&deploymentId=da90e300-3f79-462e-85cc-dac44b44ad33&deploymentId=3655bafe-073a-4291-aae7-7d2e7160b0f6
With the following SSML:
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
<voice name="VoiceA">
This is VoiceA.
</voice>
<voice name="VoiceB">
Then VoiceB.
</voice>
<voice name="VoiceC">
And this is VoiceC.
</voice>
</speak>
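The combined URL and the request can be sketched in Python as follows. This is a minimal sketch, not a definitive client. The function names are hypothetical, and the header names and output format value are assumptions based on the Speech service text-to-speech REST API, so verify them against the current documentation. The request is only prepared here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://eastasia.voice.speech.microsoft.com/cognitiveservices/v1"


def combined_endpoint(base_url, deployment_ids):
    """Append one deploymentId query parameter per deployed voice."""
    query = urlencode([("deploymentId", d) for d in deployment_ids])
    return f"{base_url}?{query}"


def build_request(url, ssml, subscription_key):
    """Prepare (but do not send) a POST request carrying the SSML body.

    Header names and values are assumptions; check the REST API reference.
    """
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": "riff-24khz-16bit-mono-pcm",
        "User-Agent": "multi-voice-sketch",
    }
    return Request(url, data=ssml.encode("utf-8"), headers=headers, method="POST")


url = combined_endpoint(BASE_URL, [
    "44aa21a9-56cb-4959-b4c8-91a14a68b0b2",  # VoiceA
    "da90e300-3f79-462e-85cc-dac44b44ad33",  # VoiceB
    "3655bafe-073a-4291-aae7-7d2e7160b0f6",  # VoiceC
])
```

Passing the prepared request to `urllib.request.urlopen` would send it. On success the response body is expected to be the synthesized audio.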
All endpoints need to be in the same subscription. The voices can be in different languages.
Every `deploymentId` must be valid. Any invalid ID will fail the request.
If there are too many voices (more than 10) to put into the URL, it is recommended to construct the URL dynamically in code, based on the SSML content.
Mixing custom voices and prebuilt voices in one SSML document is currently not supported.
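One possible way to construct the URL dynamically is sketched below, assuming the application keeps its own mapping from custom voice names to deployment IDs. The function names and the mapping are hypothetical. The code reads the voice names out of the SSML, includes only the deployment IDs that the document actually uses, and rejects any voice that is not in the mapping (which also catches a prebuilt voice mixed into custom-voice SSML).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def voices_in_ssml(ssml):
    """Return the voice names used in the SSML, in order of first appearance."""
    root = ET.fromstring(ssml)
    names = []
    for element in root.iter():
        # Compare the local tag name so the check works with any namespace.
        if element.tag.rsplit("}", 1)[-1] == "voice":
            name = element.get("name")
            if name and name not in names:
                names.append(name)
    return names


def endpoint_for_ssml(base_url, ssml, deployment_ids_by_voice):
    """Build an endpoint URL containing only the deployment IDs the SSML needs.

    deployment_ids_by_voice is a hypothetical application-side mapping of
    custom voice name -> deployment ID.
    """
    names = voices_in_ssml(ssml)
    unknown = [n for n in names if n not in deployment_ids_by_voice]
    if unknown:
        raise ValueError(f"No deployment ID known for voices: {unknown}")
    query = urlencode([("deploymentId", deployment_ids_by_voice[n]) for n in names])
    return f"{base_url}?{query}"
```

Failing early on an unknown voice name is a design choice here: since any invalid ID fails the whole request, it is cheaper to detect the problem before calling the service.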
- Azure TTS: Empower every person and every organization on the planet to have a delightful digital voice!
- Azure Custom Voice: Build your one-of-a-kind Custom Voice and close to human Neural TTS in cloud and edge!