feat: text to speech support (#103) #113

Open
wants to merge 1 commit into
base: master

alphdevcode commented Jan 18, 2024

Summary

Adds text-to-speech support to the OpenAI SDK.

Addresses #103

Code sample

This is the basic workflow for this feature:

using OpenAI;
using UnityEngine;

[RequireComponent(typeof(AudioSource))]
public class TextToSpeech : MonoBehaviour
{
    private OpenAIApi openai = new OpenAIApi();
    private AudioSource audioSource;

    // async so the request can be awaited inside Start
    private async void Start()
    {
        audioSource = GetComponent<AudioSource>();

        var request = new CreateTextToSpeechRequest
        {
            Input = "It feels so good to be able to talk!", // The text to be read aloud
            Model = "tts-1", // Text to speech model to use
            Voice = "alloy" // Voice to use
        };

        var response = await openai.CreateTextToSpeech(request);

        // Play the clip only if the request produced one
        if (response.AudioClip != null) audioSource.PlayOneShot(response.AudioClip);
    }
}

Changes

  • Added the request classes for text to speech to Runtime/DataTypes.cs.
  • Implemented the DispatchAudioRequest method to handle the response from the OpenAI API, whether it is the generated audio or an error. None of the existing dispatch methods could be reused because the response is a raw byte array rather than JSON.
  • Created Runtime/Interfaces/IAudioResponse.cs, following the same structure the existing code uses for the dispatch methods.
  • Implemented the CreateTextToSpeech method. This is the method that is called from outside the SDK.
  • Set up a sample scene to showcase the feature and added it to package.json so it can be imported into a project like the other samples.
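
To illustrate the audio-versus-error distinction that DispatchAudioRequest has to make, here is a minimal sketch in plain C#. It is not the code from this PR: `AudioResult` and `AudioResponseParser` are hypothetical names, and the rule it applies (a 2xx status with a non-JSON content type means audio, anything else is an error) is an assumption about how the two cases can be told apart.

```csharp
using System;
using System.Text;

// Hypothetical result type: exactly one of Audio or Error is set.
public sealed class AudioResult
{
    public byte[] Audio { get; }
    public string Error { get; }
    public bool IsSuccess => Error == null;

    private AudioResult(byte[] audio, string error) { Audio = audio; Error = error; }
    public static AudioResult Ok(byte[] audio) => new AudioResult(audio, null);
    public static AudioResult Fail(string error) => new AudioResult(null, error);
}

// Hypothetical helper, not part of the SDK.
public static class AudioResponseParser
{
    // Classifies a raw HTTP response as audio bytes or an error message.
    public static AudioResult Parse(long statusCode, string contentType, byte[] body)
    {
        bool ok = statusCode >= 200 && statusCode < 300;
        bool isJson = contentType != null &&
                      contentType.StartsWith("application/json", StringComparison.OrdinalIgnoreCase);

        // Success with a non-empty, non-JSON body: treat the bytes as encoded audio.
        if (ok && !isJson && body != null && body.Length > 0)
            return AudioResult.Ok(body);

        // Anything else: surface the body as text so the caller can log or parse it.
        string text = body == null ? string.Empty : Encoding.UTF8.GetString(body);
        return AudioResult.Fail($"HTTP {statusCode}: {text}");
    }
}
```

In Unity, the three inputs would typically come from `UnityWebRequest.responseCode`, `GetResponseHeader("Content-Type")`, and `downloadHandler.data`. Decoding the audio bytes into an AudioClip is a separate step that this sketch leaves out.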

Checklist

  • My code follows the style guidelines of this project.
  • Self-review of my own code performed.
  • No errors or warnings in the console.
    • Note that OpenAI by default limits the text-to-speech models to 3 requests per minute per session. Requests beyond that limit return an error. The limit can be increased in the OpenAI dashboard: https://platform.openai.com/account/limits
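
For callers who want to stay under the limit mentioned above instead of handling the error, a client-side throttle is one option. The sketch below is an assumption-laden illustration, not part of this PR: `SlidingWindowLimiter` is a hypothetical class, and the 3-per-minute figure is taken from the note above and should be set to whatever the account's actual limit is.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sliding-window limiter: allows at most maxRequests per window.
public sealed class SlidingWindowLimiter
{
    private readonly int maxRequests;
    private readonly TimeSpan window;
    private readonly Queue<DateTime> sent = new Queue<DateTime>();

    public SlidingWindowLimiter(int maxRequests, TimeSpan window)
    {
        this.maxRequests = maxRequests;
        this.window = window;
    }

    // Returns TimeSpan.Zero and records the request if it may be sent at `now`;
    // otherwise returns how long to wait before trying again.
    public TimeSpan TryAcquire(DateTime now)
    {
        // Forget requests that have fallen out of the window.
        while (sent.Count > 0 && now - sent.Peek() >= window)
            sent.Dequeue();

        if (sent.Count < maxRequests)
        {
            sent.Enqueue(now);
            return TimeSpan.Zero;
        }

        // The oldest recorded request determines when a slot frees up.
        return sent.Peek() + window - now;
    }
}
```

A caller would create `new SlidingWindowLimiter(3, TimeSpan.FromMinutes(1))`, call `TryAcquire(DateTime.UtcNow)` before each text-to-speech request, and delay by the returned duration when it is non-zero.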

* feat: add text to speech support

* feat: add text to speech sample

BlakCake commented Dec 6, 2024

Any plans to add streaming capabilities to this?

alphdevcode (Author) commented

Hey @BlakCake, I don't think the developer is willing to merge this PR. It's been here for almost a year.
