Real-Time OpenAI GPT-4 Voice Assistant Demo

This repository contains a Python demo of real-time voice interaction with OpenAI's GPT-4 model. The application captures microphone input, streams it to the OpenAI API, and plays back the assistant's audio response as it arrives.

Features

Real-Time Communication: Interact with GPT-4 in real time using your microphone and speakers.
Audio Streaming: Sends live audio to the OpenAI API and plays back the assistant's response.
Threaded Architecture: Uses separate threads so that audio input/output and WebSocket communication run concurrently.
Customizable Session: Configure the assistant's behavior, voice, and other session parameters.
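
The threaded design listed above can be pictured as a producer/consumer pair connected by a queue. The following is a minimal sketch using only the standard library, not the code in demo.py: the function name run_pipeline and the in-memory stand-ins for the microphone and WebSocket are assumptions made for illustration.

```python
import queue
import threading


def run_pipeline(chunks):
    """Pass audio chunks from a capture thread to a sender thread via a queue."""
    outbound = queue.Queue()
    sent = []
    sentinel = None  # marks the end of the stream; real chunks are bytes

    def capture():
        # Stand-in for reading chunks from the microphone.
        for chunk in chunks:
            outbound.put(chunk)
        outbound.put(sentinel)

    def send():
        # Stand-in for writing chunks to the WebSocket.
        while True:
            chunk = outbound.get()
            if chunk is sentinel:
                break
            sent.append(chunk)

    threads = [threading.Thread(target=capture), threading.Thread(target=send)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sent


if __name__ == "__main__":
    print(run_pipeline([b"\x00\x01", b"\x02\x03"]))
```

The queue decouples the two threads, so a slow network write does not block audio capture.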

Prerequisites

Python 3.7 or higher
OpenAI API Key: You must have an API key from OpenAI with access to the GPT-4 real-time API.
Supported Platforms: This demo is designed for Unix-like systems. Windows support may require additional configuration.

Installation

Clone the Repository

git clone https://github.com/yourusername/your-repo-name.git
cd your-repo-name

Create a Virtual Environment (Optional but Recommended)

python3 -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

Install Dependencies

pip install -r requirements.txt

Set Up Environment Variables

Create a .env file in the root directory and add your OpenAI API key:

OPENAI_API_KEY=your-openai-api-key
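
A quick way to catch a missing or placeholder key before connecting is to validate it at startup. The sketch below assumes the .env file has already been loaded into the process environment (for example with load_dotenv() from python-dotenv); the helper name get_api_key is hypothetical and not part of demo.py.

```python
import os


def get_api_key(env=None):
    """Return the API key from the environment, or raise a clear error."""
    # Assumption: the .env file was already loaded into os.environ.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key or key == "your-openai-api-key":
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file.")
    return key


if __name__ == "__main__":
    try:
        get_api_key()
        print("API key found.")
    except RuntimeError as err:
        print(err)
```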

Usage

Run the Demo

python demo.py

Interaction

Speak into your microphone.
The assistant will process your speech and respond in real time.
Press Ctrl+C to terminate the demo.
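
For readers curious about what happens to captured speech, the sketch below shows one plausible way a microphone chunk is packaged for the WebSocket. It assumes the demo uses the Realtime API's input_audio_buffer.append event with base64-encoded audio; the README does not confirm this, and the helper name audio_append_event is hypothetical.

```python
import base64
import json


def audio_append_event(pcm_chunk: bytes) -> str:
    """Wrap a raw audio chunk in a JSON event with a base64 payload."""
    # Assumption: the event type and field name follow the Realtime API's
    # input_audio_buffer.append event; demo.py may differ.
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(pcm_chunk).decode("ascii"),
    })


if __name__ == "__main__":
    # Four bytes standing in for two 16-bit audio samples.
    print(audio_append_event(b"\x00\x01\x02\x03"))
```

Base64 is needed because WebSocket text frames carry JSON, which cannot hold raw binary audio directly.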

Configuration

You can customize the assistant's behavior by modifying the SESSION_DATA dictionary in demo.py:

Instructions: Set the assistant's personality and guidelines.
Voice: Choose the assistant's voice (e.g., "voice": "Sol").
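Temperature: Adjust the sampling temperature; higher values give more varied responses. The Realtime API reference has documented a valid range of 0.6 to 1.2, so check the current API documentation before choosing a value.

Example: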

SESSION_DATA = {
    "type": "session.update",
    "session": {
        "instructions": "You are a helpful assistant.",
        "temperature": 0.7,
        "voice": "Sol",
        "modalities": ["audio", "text"],
        # ... other configurations
    }
}
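
A dictionary like this has to be serialized to JSON before it can be sent over the WebSocket. The sketch below builds and serializes such a message; the helper name build_session_update is hypothetical, and the actual send call (shown only as a comment) depends on how demo.py manages its connection.

```python
import json


def build_session_update(instructions, voice, temperature):
    """Build a session.update message and serialize it to a JSON string."""
    message = {
        "type": "session.update",
        "session": {
            "instructions": instructions,
            "temperature": temperature,
            "voice": voice,
            "modalities": ["audio", "text"],
        },
    }
    return json.dumps(message)


if __name__ == "__main__":
    payload = build_session_update("You are a helpful assistant.", "Sol", 0.7)
    # Assumption: with websocket-client, this string would be passed to
    # the connection's send() method once the socket is open.
    print(payload)
```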

Project Structure

demo.py: The main script that runs the demo.
requirements.txt: Python dependencies.
.env: Environment variables (not included; you need to create this file).

Dependencies

The demo relies on the following Python packages:

pyaudio: For capturing and playing audio.
websocket-client: For WebSocket communication with the OpenAI API.
python-dotenv: For loading environment variables from a .env file.
logging: For logging events and errors (part of the Python standard library; no installation required).

Install all dependencies using:

pip install -r requirements.txt

Note: pyaudio may require additional system dependencies. For example, on Ubuntu:

sudo apt-get install portaudio19-dev python3-pyaudio
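
To confirm that the third-party packages installed correctly, one option is to check whether their modules can be found without importing them. This is a sketch, not part of the repository; the helper name missing_packages is hypothetical. The import names used are pyaudio, websocket (for websocket-client), and dotenv (for python-dotenv).

```python
import importlib.util

# Maps each import name to the package name used with pip.
REQUIRED = {
    "pyaudio": "pyaudio",
    "websocket": "websocket-client",
    "dotenv": "python-dotenv",
}


def missing_packages(modules=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for module, pip_name in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```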

Troubleshooting

Microphone or Speaker Issues: Ensure your microphone and speakers are properly connected and configured.
API Errors: Verify that your OpenAI API key is correct and has access to the GPT-4 real-time API.
WebSocket Errors: Check your internet connection and firewall settings.

Contributing

Contributions are welcome! Please follow these steps:

Fork the repository.
Create a new branch:

git checkout -b feature/your-feature-name

Make your changes and commit them:

git commit -am 'Add some feature'

Push to the branch:

git push origin feature/your-feature-name

Submit a pull request.

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Attribution Requirement: If you use any part of this code, please provide appropriate credit by mentioning the original author.

Disclaimer: This is a demo application intended for educational purposes. Use it responsibly and ensure compliance with OpenAI's usage policies.
