Ruby OpenAI

Use the OpenAI API with Ruby! 🤖🩵

Stream text with GPT-4, transcribe and translate audio with Whisper, or create images with DALL·E...

🚢 Hire me | 🎮 Ruby AI Builders Discord | 🐦 Twitter | 🧠 Anthropic Gem | 🚂 Midjourney Gem

Installation

Bundler

Add this line to your application's Gemfile:

gem "ruby-openai"

And then execute:

$ bundle install

Gem install

Or install with:

$ gem install ruby-openai

and require with:

require "openai"

Usage

Quickstart

For a quick test you can pass your token directly to a new client:

client = OpenAI::Client.new(access_token: "access_token_goes_here")

With Config

For a more robust setup, you can configure the gem with your API keys, for example in an openai.rb initializer file. Never hardcode secrets into your codebase - instead use something like dotenv to pass the keys safely into your environments.

OpenAI.configure do |config|
    config.access_token = ENV.fetch("OPENAI_ACCESS_TOKEN")
    config.organization_id = ENV.fetch("OPENAI_ORGANIZATION_ID") # Optional.
end

Then you can create a client like this:

client = OpenAI::Client.new

You can still override the config defaults when making new clients; any options not included will fall back to the global config set with OpenAI.configure. For example, here only the access_token is overridden, while organization_id, request_timeout, etc. fall back to the values set globally:

client = OpenAI::Client.new(access_token: "access_token_goes_here")

Custom timeout or base URI

The default timeout for any request using this library is 120 seconds. You can change that by passing a number of seconds as request_timeout when initializing the client. You can also change the base URI used for all requests, e.g. to use observability tools like Helicone, and add arbitrary other headers, e.g. for openai-caching-proxy-worker:

client = OpenAI::Client.new(
    access_token: "access_token_goes_here",
    uri_base: "https://oai.hconeai.com/",
    request_timeout: 240,
    extra_headers: {
      "X-Proxy-TTL" => "43200", # For https://github.com/6/openai-caching-proxy-worker#specifying-a-cache-ttl
      "X-Proxy-Refresh": "true", # For https://github.com/6/openai-caching-proxy-worker#refreshing-the-cache
      "Helicone-Auth": "Bearer HELICONE_API_KEY", # For https://docs.helicone.ai/getting-started/integration-method/openai-proxy
      "helicone-stream-force-format" => "true", # Use this with Helicone otherwise streaming drops chunks # https://github.com/alexrudall/ruby-openai/issues/251
    }
)

or when configuring the gem:

OpenAI.configure do |config|
    config.access_token = ENV.fetch("OPENAI_ACCESS_TOKEN")
    config.organization_id = ENV.fetch("OPENAI_ORGANIZATION_ID") # Optional
    config.uri_base = "https://oai.hconeai.com/" # Optional
    config.request_timeout = 240 # Optional
    config.extra_headers = {
      "X-Proxy-TTL" => "43200", # For https://github.com/6/openai-caching-proxy-worker#specifying-a-cache-ttl
      "X-Proxy-Refresh": "true", # For https://github.com/6/openai-caching-proxy-worker#refreshing-the-cache
      "Helicone-Auth": "Bearer HELICONE_API_KEY" # For https://docs.helicone.ai/getting-started/integration-method/openai-proxy
    } # Optional
end

Extra Headers per Client

You can dynamically pass headers per client object, which will be merged with any headers set globally with OpenAI.configure:

client = OpenAI::Client.new(access_token: "access_token_goes_here")
client.add_headers("X-Proxy-TTL" => "43200")

Verbose Logging

You can pass Faraday middleware to the client in a block, e.g. to enable verbose logging with Ruby's Logger:

  client = OpenAI::Client.new do |f|
    f.response :logger, Logger.new($stdout), bodies: true
  end

Azure

To use the Azure OpenAI Service API, you can configure the gem like this:

    OpenAI.configure do |config|
        config.access_token = ENV.fetch("AZURE_OPENAI_API_KEY")
        config.uri_base = ENV.fetch("AZURE_OPENAI_URI")
        config.api_type = :azure
        config.api_version = "2023-03-15-preview"
    end

where AZURE_OPENAI_URI is e.g. https://custom-domain.openai.azure.com/openai/deployments/gpt-35-turbo

Counting Tokens

OpenAI parses prompt text into tokens, which are words or portions of words. (These tokens are unrelated to your API access_token.) Counting tokens can help you estimate your costs. It can also help you ensure your prompt text size is within the max-token limits of your model's context window, and choose an appropriate max_tokens completion parameter so your response will fit as well.

To estimate the token-count of your text:

OpenAI.rough_token_count("Your text")

If you need a more accurate count, try tiktoken_ruby.
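As a minimal sketch of how a token estimate might be used, the hypothetical helpers below check whether a prompt plus the tokens reserved for the completion fit inside a context window. The `estimate_tokens` heuristic (roughly four characters per token) and the 4096-token window are assumptions for illustration, not the gem's implementation; in real code you could swap in OpenAI.rough_token_count or tiktoken_ruby.

```ruby
# Hypothetical helper: rough estimate assuming ~4 characters per token.
# This is an illustrative heuristic, not the gem's own algorithm.
def estimate_tokens(text)
  return 0 if text.empty?

  (text.length / 4.0).ceil
end

# Returns true when the prompt estimate plus the tokens reserved for the
# completion (max_tokens) stays within the model's context window.
def fits_context?(prompt, max_tokens:, context_window:)
  estimate_tokens(prompt) + max_tokens <= context_window
end

prompt = "Describe a character called Anna!"
# The 4096-token window is an assumed value; check your model's real limit.
puts fits_context?(prompt, max_tokens: 256, context_window: 4096)
```

Because the estimate is rough, it may be worth leaving some headroom rather than filling the window exactly.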

Models

There are different models that can be used to generate text. For a full list and to retrieve information about a single model:

client.models.list
client.models.retrieve(id: "text-ada-001")

Examples

  • GPT-4 (limited beta)
    • gpt-4 (uses current version)
    • gpt-4-0314
    • gpt-4-32k
  • GPT-3.5
    • gpt-3.5-turbo
    • gpt-3.5-turbo-0301
    • text-davinci-003
  • GPT-3
    • text-ada-001
    • text-babbage-001
    • text-curie-001

Chat

GPT is a model that can be used to generate text in a conversational style. You can use it to generate a response to a sequence of messages:

response = client.chat(
    parameters: {
        model: "gpt-3.5-turbo", # Required.
        messages: [{ role: "user", content: "Hello!"}], # Required.
        temperature: 0.7,
    })
puts response.dig("choices", 0, "message", "content")
# => "Hello! How may I assist you today?"
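Since the endpoint takes a sequence of messages, a multi-turn conversation can be kept by appending each reply to the list before the next request. The sketch below uses a hypothetical helper, `append_reply`, and a hand-written sample response in the shape read by the example above; it makes no API calls.

```ruby
# Returns a new message list with the assistant's reply appended, so the
# next request can carry the whole conversation.
def append_reply(messages, response)
  reply = response.dig("choices", 0, "message")
  return messages if reply.nil?

  messages + [{ role: reply["role"], content: reply["content"] }]
end

messages = [{ role: "user", content: "Hello!" }]

# Hand-written sample standing in for the Hash returned by client.chat.
sample_response = {
  "choices" => [
    { "message" => { "role" => "assistant", "content" => "Hello! How may I assist you today?" } }
  ]
}

messages = append_reply(messages, sample_response)
messages << { role: "user", content: "Tell me a joke." }
puts messages.length
```

The resulting `messages` array could then be passed as the messages parameter of the next client.chat call.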

Streaming Chat

Quick guide to streaming Chat with Rails 7 and Hotwire

You can stream from the API in realtime, which can be much faster and can be used to create a more engaging user experience. Pass a Proc (or any object with a #call method) to the stream parameter to receive the stream of completion chunks as they are generated. Each time one or more chunks are received, the proc will be called once with each chunk, parsed as a Hash. If OpenAI returns an error, ruby-openai will raise a Faraday error.

client.chat(
    parameters: {
        model: "gpt-3.5-turbo", # Required.
        messages: [{ role: "user", content: "Describe a character called Anna!"}], # Required.
        temperature: 0.7,
        stream: proc do |chunk, _bytesize|
            print chunk.dig("choices", 0, "delta", "content")
        end
    })
# => "Anna is a young woman in her mid-twenties, with wavy chestnut hair that falls to her shoulders..."

Note: OpenAI currently does not report token usage for streaming responses. To count tokens while streaming, try OpenAI.rough_token_count or tiktoken_ruby. We think that each call to the stream proc corresponds to a single token, so you can also try counting the number of calls to the proc to get the completion token count.
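To collect the streamed text and count the calls at the same time, one option is a small object with a #call method. `StreamCollector` below is a hypothetical class, and the sample chunks are hand-written in the shape read by the streaming example above; treat it as a sketch rather than part of the gem.

```ruby
# Hypothetical collector: accumulates streamed content and counts calls.
class StreamCollector
  attr_reader :text, :calls

  def initialize
    @text = +""
    @calls = 0
  end

  # The second argument is optional so the object works whether or not
  # a bytesize is passed along with the chunk.
  def call(chunk, _bytesize = nil)
    @calls += 1
    # Some chunks carry no "content" (e.g. a role-only or final delta),
    # so nil is converted to an empty string.
    @text << chunk.dig("choices", 0, "delta", "content").to_s
  end
end

collector = StreamCollector.new
sample_chunks = [
  { "choices" => [{ "delta" => { "role" => "assistant" } }] },
  { "choices" => [{ "delta" => { "content" => "Anna " } }] },
  { "choices" => [{ "delta" => { "content" => "is" } }] },
  { "choices" => [{ "delta" => {} }] }
]
sample_chunks.each { |chunk| collector.call(chunk) }

puts collector.text
puts collector.calls
```

In a real request the collector would be passed as the stream parameter. Note that the call count here includes chunks without content, so it is only an approximation of the completion token count.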

Vision

You can use the GPT-4 Vision model to generate a description of an image:

messages = [
  { "type": "text", "text": "What’s in this image?"},
  { "type": "image_url",
    "image_url": {
      "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
    },
  }
]
response = client.chat(
    parameters: {
        model: "gpt-4-vision-preview", # Required.
        messages: [{ role: "user", content: messages}], # Required.
    })
puts response.dig("choices", 0, "message", "content")
# => "The image depicts a serene natural landscape featuring a long wooden boardwalk extending straight ahead"

JSON Mode

You can set the response_format to ask for responses in JSON (at least for gpt-3.5-turbo-1106):

response = client.chat(
    parameters: {
        model: "gpt-3.5-turbo-1106",
        response_format: { type: "json_object" },
        messages: [{ role: "user", content: "Hello! Give me some JSON please."}],
        temperature: 0.7,
    })
puts response.dig("choices", 0, "message", "content")
# =>
# {
#   "name": "John",
#   "age": 30,
#   "city": "New York",
#   "hobbies": ["reading", "traveling", "hiking"],
#   "isStudent": false
# }

You can stream it as well!

response = client.chat(
    parameters: {
        model: "gpt-3.5-turbo-1106",
        messages: [{ role: "user", content: "Can I have some JSON please?"}],
        response_format: { type: "json_object" },
        stream: proc do |chunk, _bytesize|
            print chunk.dig("choices", 0, "delta", "content")
        end
    })
# =>
# {
#   "message": "Sure, please let me know what specific JSON data you are looking for.",
#   "JSON_data": {
#     "example_1": {
#       "key_1": "value_1",
#       "key_2": "value_2",
#       "key_3": "value_3"
#     },
#     "example_2": {
#       "key_4": "value_4",
#       "key_5": "value_5",
#       "key_6": "value_6"
#     }
#   }
# }

Functions

You can describe and pass in functions and the model will intelligently choose to output a JSON object containing arguments to call them. For example, if you want the model to use your method get_current_weather to get the current weather in a given location:

def get_current_weather(location:, unit: "fahrenheit")
  # use a weather api to fetch weather
end

response =
  client.chat(
    parameters: {
      model: "gpt-3.5-turbo-0613",
      messages: [
        {
          "role": "user",
          "content": "What is the weather like in San Francisco?",
        },
      ],
      functions: [
        {
          name: "get_current_weather",
          description: "Get the current weather in a given location",
          parameters: {
            type: :object,
            properties: {
              location: {
                type: :string,
                description: "The city and state, e.g. San Francisco, CA",
              },
              unit: {
                type: "string",
                enum: %w[celsius fahrenheit],
              },
            },
            required: ["location"],
          },
        },
      ],
    },
  )

message = response.dig("choices", 0, "message")

if message["role"] == "assistant" && message["function_call"]
  function_name = message.dig("function_call", "name")
  args =
    JSON.parse(
      message.dig("function_call", "arguments"),
      { symbolize_names: true },
    )

  case function_name
  when "get_current_weather"
    get_current_weather(**args)
  end
end
# => "The weather is nice 🌞"
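After calling your method, you may want to hand its result back to the model so it can phrase a final answer. The sketch below builds the follow-up message list with a hypothetical helper, `with_function_result`. The "function" role message shape is an assumption based on OpenAI's function-calling documentation rather than anything shown in this README, and the data is hand-written.

```ruby
# Builds the follow-up message list: the original conversation, the
# assistant's function_call message, and the function's result.
# Assumption: the chat endpoint accepts a message with role "function".
def with_function_result(messages, assistant_message, result)
  name = assistant_message.dig("function_call", "name")
  messages + [
    assistant_message,
    { "role" => "function", "name" => name, "content" => result.to_s }
  ]
end

# Hand-written sample data standing in for a real API response.
history = [
  { "role" => "user", "content" => "What is the weather like in San Francisco?" }
]
assistant_message = {
  "role" => "assistant",
  "content" => nil,
  "function_call" => {
    "name" => "get_current_weather",
    "arguments" => "{\"location\":\"San Francisco, CA\"}"
  }
}

follow_up = with_function_result(history, assistant_message, "The weather is nice 🌞")
puts follow_up.length
```

The `follow_up` array could then be sent as the messages parameter of a second client.chat call.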

Edits

Send a string and some instructions for what to do to the string:

response = client.edits(
    parameters: {
        model: "text-davinci-edit-001",
        input: "What day of the wek is it?",
        instruction: "Fix the spelling mistakes"
    }
)
puts response.dig("choices", 0, "text")
# => What day of the week is it?

Embeddings

You can use the embeddings endpoint to get a vector of numbers representing an input. You can then compare these vectors for different inputs to efficiently check how similar the inputs are.

response = client.embeddings(
    parameters: {
        model: "text-embedding-ada-002",
        input: "The food was delicious and the waiter..."
    }
)

puts response.dig("data", 0, "embedding")
# => Vector representation of your embedding
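One common way to compare two embedding vectors is cosine similarity. The helper below is a minimal sketch with a hypothetical name, using short hand-written vectors instead of real embeddings.

```ruby
# Cosine similarity between two equal-length numeric vectors.
# Returns a value between -1.0 and 1.0 (0.0 if either vector is all zeros).
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

puts cosine_similarity([1.0, 0.0], [1.0, 0.0]) # identical direction
puts cosine_similarity([1.0, 0.0], [0.0, 1.0]) # orthogonal
```

With real data you would pass the arrays read from response.dig("data", 0, "embedding") for each input.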

Files

Put your data in a .jsonl file like this:

{"prompt":"Overjoyed with my new phone! ->", "completion":" positive"}
{"prompt":"@lakers disappoint for a third straight night ->", "completion":" negative"}

and pass the path to client.files.upload to upload it to OpenAI, and then interact with it:

client.files.upload(parameters: { file: "path/to/sentiment.jsonl", purpose: "fine-tune" })
client.files.list
client.files.retrieve(id: "file-123")
client.files.content(id: "file-123")
client.files.delete(id: "file-123")
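If your training data starts life as Ruby Hashes, the .jsonl file can be generated with the standard library. `write_jsonl` is a hypothetical helper, and a temporary file is used here only so the sketch cleans up after itself.

```ruby
require "json"
require "tempfile"

# Writes one JSON object per line (the JSON Lines format) and returns the path.
def write_jsonl(path, rows)
  File.open(path, "w") do |file|
    rows.each { |row| file.puts(JSON.generate(row)) }
  end
  path
end

rows = [
  { prompt: "Overjoyed with my new phone! ->", completion: " positive" },
  { prompt: "@lakers disappoint for a third straight night ->", completion: " negative" }
]

Tempfile.create(["sentiment", ".jsonl"]) do |tmp|
  write_jsonl(tmp.path, rows)
  puts File.read(tmp.path)
end
```

In practice you would write to a permanent path and pass that path to client.files.upload.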

Finetunes

Upload your fine-tuning data in a .jsonl file as above and get its ID:

response = client.files.upload(parameters: { file: "path/to/sarcasm.jsonl", purpose: "fine-tune" })
file_id = response["id"] # The response is already parsed into a Hash

You can then use this file ID to create a fine tuning job:

response = client.finetunes.create(
    parameters: {
    training_file: file_id,
    model: "gpt-3.5-turbo-0613"
})
fine_tune_id = response["id"]

That will give you the fine-tune ID. If you made a mistake you can cancel the fine-tune job before it is processed:

client.finetunes.cancel(id: fine_tune_id)

You may need to wait a short time for processing to complete. Once processed, you can use list or retrieve to get the name of the fine-tuned model:

client.finetunes.list
response = client.finetunes.retrieve(id: fine_tune_id)
fine_tuned_model = response["fine_tuned_model"]

This fine-tuned model name can then be used in completions:

response = client.completions(
    parameters: {
        model: fine_tuned_model,
        prompt: "I love Mondays!"
    }
)
response.dig("choices", 0, "text")

You can also capture the events for a job:

client.finetunes.list_events(id: fine_tune_id)

Assistants

Assistants can call models to interact with threads and use tools to perform tasks (see Assistant Overview).

To create a new assistant (see API documentation):

response = client.assistants.create(
    parameters: {
        model: "gpt-3.5-turbo-1106",         # Retrieve via client.models.list. Assistants need 'gpt-3.5-turbo-1106' or later.
        name: "OpenAI-Ruby test assistant", 
        description: nil,
        instructions: "You are a helpful assistant for coding an OpenAI API client using the OpenAI-Ruby gem.",
        tools: [
            { type: 'retrieval' },           # Allow access to files attached using file_ids
            { type: 'code_interpreter' },    # Allow access to Python code interpreter 
        ],
        "file_ids": ["file-123"],            # See Files section above for how to upload files
        "metadata": { my_internal_version_id: '1.0.0' }
    })
assistant_id = response["id"]

Given an assistant_id you can retrieve the current field values:

client.assistants.retrieve(id: assistant_id)

You can get a list of all assistants currently available under the organization:

client.assistants.list

You can modify an existing assistant using the assistant's id (see API documentation):

response = client.assistants.modify(
        id: assistant_id,
        parameters: {
            name: "Modified Test Assistant for OpenAI-Ruby",
            metadata: { my_internal_version_id: '1.0.1' }
        })

You can delete assistants:

client.assistants.delete(id: assistant_id)

Threads and Messages

Once you have created an assistant as described above, you need to prepare a Thread of Messages for the assistant to work on (see introduction on Assistants). For example, as an initial setup you could do:

# Create thread
response = client.threads.create # Note: Once you create a thread, there is no way to list it
                                 # or recover it currently (as of 2023-12-10). So hold onto the `id` 
thread_id = response["id"]

# Add initial message from user (see https://platform.openai.com/docs/api-reference/messages/createMessage)
message_id = client.messages.create(
    thread_id: thread_id,
    parameters: {
        role: "user", # Required for manually created messages
        content: "Can you help me write an API library to interact with the OpenAI API please?"
    })["id"]

# Retrieve individual message
message = client.messages.retrieve(thread_id: thread_id, id: message_id)

# Review all messages on the thread
messages = client.messages.list(thread_id: thread_id)

To clean up after a thread is no longer needed:

# To delete the thread (and all associated messages):
client.threads.delete(id: thread_id)

client.messages.retrieve(thread_id: thread_id, id: message_id) # -> Fails after thread is deleted

Runs

To submit a thread to be evaluated with the model of an assistant, create a Run as follows (Note: This is one place where OpenAI will take your money):

# Create run (will use instruction/model/tools from Assistant's definition)
response = client.runs.create(thread_id: thread_id,
    parameters: {
        assistant_id: assistant_id
    })
run_id = response['id']

# Retrieve/poll Run to observe status
response = client.runs.retrieve(id: run_id, thread_id: thread_id)
status = response['status']

The status response can include the following strings: queued, in_progress, requires_action, cancelling, cancelled, failed, completed, or expired. You can handle them as follows:

while true do
    response = client.runs.retrieve(id: run_id, thread_id: thread_id)
    status = response['status']

    case status
    when 'queued', 'in_progress', 'cancelling'
        puts 'Sleeping'
        sleep 1 # Wait one second and poll again
    when 'completed'
        break # Exit loop and report result to user
    when 'requires_action'
        break # Exit loop and handle tool calls (see below)
    when 'cancelled', 'failed', 'expired'
        puts response['last_error'].inspect
        break # or `exit`
    else
        puts "Unknown status response: #{status}"
        break # Avoid polling forever on an unexpected status
    end
end
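To stop polling from running indefinitely, the loop can be bounded. The sketch below is one way to do that with a hypothetical helper, `wait_for_run`, which takes a block in place of the real client.runs.retrieve call. The attempt limit and interval are illustrative defaults, and the usage example feeds it canned statuses instead of calling the API.

```ruby
# Statuses at which polling should stop and the caller should act.
STOP_STATUSES = %w[completed requires_action cancelled failed expired].freeze

# Calls the block until it returns a stop status or the attempt limit is
# reached. The block stands in for something like
# client.runs.retrieve(id: run_id, thread_id: thread_id)["status"].
def wait_for_run(max_attempts: 60, interval: 1)
  max_attempts.times do
    status = yield
    return status if STOP_STATUSES.include?(status)

    sleep interval
  end
  raise "Run did not reach a stop status after #{max_attempts} attempts"
end

# Usage with a canned sequence of statuses instead of real API calls:
statuses = %w[queued in_progress in_progress completed].each
puts wait_for_run(interval: 0) { statuses.next }
```

Raising on timeout keeps the failure visible; returning nil instead would be an equally reasonable design if the caller prefers to check the result.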

If the status response indicates that the run is completed, the associated thread will have one or more new messages attached:

# Either retrieve all messages in bulk again, or...
messages = client.messages.list(thread_id: thread_id) # Note: as of 2023-12-11 adding limit or order options isn't working, yet

# Alternatively retrieve the `run steps` for the run which link to the messages:
run_steps = client.run_steps.list(thread_id: thread_id, run_id: run_id)
new_message_ids = run_steps['data'].filter_map { |step|
  if step['type'] == 'message_creation'
    step.dig('step_details', "message_creation", "message_id")
  end # Ignore tool calls, because they don't create new messages.
}

# Retrieve the individual messages
new_messages = new_message_ids.map { |msg_id|
  client.messages.retrieve(id: msg_id, thread_id: thread_id)
}

# Find the actual response text in the content array of the messages
new_messages.each { |msg|
    msg['content'].each { |content_item|
        case content_item['type']
        when 'text'
            puts content_item.dig('text', 'value')
            # Also handle annotations
        when 'image_file'
            # Use File endpoint to retrieve file contents via id
            id = content_item.dig('image_file', 'file_id')
        end
    }
}

At any time you can list all runs which have been performed on a particular thread or are currently running (in descending/newest first order):

client.runs.list(thread_id: thread_id)

Runs involving function tools

If you allow the assistant to access function tools (they are defined in the same way as functions during chat completion), you might get a status of requires_action when the assistant wants you to evaluate one or more function tools:

def get_current_weather(location:, unit: "celsius")
    # Your function code goes here
    if location =~ /San Francisco/i
        return unit == "celsius" ? "The weather is nice 🌞 at 27°C" : "The weather is nice 🌞 at 80°F"
    else
        return unit == "celsius" ? "The weather is icy 🥶 at -5°C" : "The weather is icy 🥶 at 23°F"
    end 
end

if status == 'requires_action'

    tools_to_call = response.dig('required_action', 'submit_tool_outputs', 'tool_calls')

    my_tool_outputs = tools_to_call.map { |tool|
        # Call the functions based on the tool's name
        function_name = tool.dig('function', 'name')
        arguments = JSON.parse(
              tool.dig("function", "arguments"),
              { symbolize_names: true },
        )
        
        tool_output = case function_name
        when "get_current_weather"
            get_current_weather(**arguments)
        end

        { tool_call_id: tool['id'], output: tool_output }
    }

    client.runs.submit_tool_outputs(thread_id: thread_id, run_id: run_id, parameters: { tool_outputs: my_tool_outputs })
end

Note that you have 10 minutes to submit your tool output before the run expires.

Image Generation

Generate an image using DALL·E! The size of any generated images must be one of 256x256, 512x512 or 1024x1024 - if not specified the image will default to 1024x1024.

response = client.images.generate(parameters: { prompt: "A baby sea otter cooking pasta wearing a hat of some sort", size: "256x256" })
puts response.dig("data", 0, "url")
# => "https://oaidalleapiprodscus.blob.core.windows.net/private/org-Rf437IxKhh..."

Image Edit

Fill in the transparent part of an image, or upload a mask with transparent sections to indicate the parts of an image that can be changed according to your prompt...

response = client.images.edit(parameters: { prompt: "A solid red Ruby on a blue background", image: "image.png", mask: "mask.png" })
puts response.dig("data", 0, "url")
# => "https://oaidalleapiprodscus.blob.core.windows.net/private/org-Rf437IxKhh..."

Image Variations

Create n variations of an image.

response = client.images.variations(parameters: { image: "image.png", n: 2 })
puts response.dig("data", 0, "url")
# => "https://oaidalleapiprodscus.blob.core.windows.net/private/org-Rf437IxKhh..."

Moderations

Pass a string to check if it violates OpenAI's Content Policy:

response = client.moderations(parameters: { input: "I'm worried about that." })
puts response.dig("results", 0, "category_scores", "hate")
# => 5.505014632944949e-05
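To act on the scores, you might list every category whose score crosses a cutoff. The helper name, the 0.5 threshold, and the "violence" entry in the sample below are assumptions for illustration; only the category_scores path comes from the example above.

```ruby
# Returns the names of categories whose score is at or above the threshold.
# The 0.5 default is an arbitrary illustrative cutoff, not an OpenAI value.
def categories_above(response, threshold: 0.5)
  scores = response.dig("results", 0, "category_scores") || {}
  scores.select { |_category, score| score >= threshold }.keys
end

# Hand-written sample in the shape read by the example above.
sample = {
  "results" => [
    { "category_scores" => { "hate" => 5.505014632944949e-05, "violence" => 0.91 } }
  ]
}
p categories_above(sample)
```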

Whisper

Whisper is a speech-to-text model that can be used to generate text based on audio files.

Translate

The translations API takes as input the audio file in any of the supported languages and transcribes the audio into English.

response = client.audio.translate(
    parameters: {
        model: "whisper-1",
        file: File.open("path_to_file", "rb"),
    })
puts response["text"]
# => "Translation of the text"

Transcribe

The transcriptions API takes as input the audio file you want to transcribe and returns the text in the desired output file format.

response = client.audio.transcribe(
    parameters: {
        model: "whisper-1",
        file: File.open("path_to_file", "rb"),
    })
puts response["text"]
# => "Transcription of the text"

Speech

The speech API takes as input the text and a voice and returns the content of an audio file you can listen to.

response = client.audio.speech(
  parameters: {
    model: "tts-1",
    input: "This is a speech test!",
    voice: "alloy"
  }
)
File.binwrite('demo.mp3', response)
# => mp3 file that plays: "This is a speech test!"

Errors

HTTP errors can be caught like this:

  begin
    OpenAI::Client.new.models.retrieve(id: "text-ada-001")
  rescue Faraday::Error => e
    raise "Got a Faraday error: #{e}"
  end
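For transient failures you may prefer to retry before giving up. The sketch below is a generic retry helper with exponential backoff; `with_retries` and its defaults are hypothetical. It takes the error class as a parameter so it runs without any gems loaded.

```ruby
# Runs the block, retrying on error_class up to `attempts` times in total.
# The delay doubles after each failure: base_delay, 2x, 4x, ...
def with_retries(error_class: StandardError, attempts: 3, base_delay: 0.5)
  tries = 0
  begin
    tries += 1
    yield
  rescue error_class
    raise if tries >= attempts # Out of attempts: re-raise the last error

    sleep(base_delay * (2**(tries - 1)))
    retry
  end
end

# Usage with a block that fails twice, then succeeds:
count = 0
result = with_retries(attempts: 3, base_delay: 0) do
  count += 1
  raise IOError, "temporary failure" if count < 3

  "ok after #{count} tries"
end
puts result
```

With this gem you would pass error_class: Faraday::Error and wrap the client call in the block. Retrying every Faraday error is a simplification, since some failures (such as an invalid request) will not succeed on a second attempt.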

Development

After checking out the repo, run bin/setup to install dependencies. You can run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install.

Warning

If you have an OPENAI_ACCESS_TOKEN in your ENV, running the specs will use this to run the specs against the actual API, which will be slow and cost you money - 2 cents or more! Remove it from your environment with unset or similar if you just want to run the specs against the stored VCR responses.

Release

First run the specs without VCR so they actually hit the API. This will cost 2 cents or more. Set OPENAI_ACCESS_TOKEN in your environment or pass it in like this:

OPENAI_ACCESS_TOKEN=123abc bundle exec rspec

Then update the version number in version.rb, update CHANGELOG.md, run bundle install to update Gemfile.lock, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/alexrudall/ruby-openai. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct.

License

The gem is available as open source under the terms of the MIT License.

Code of Conduct

Everyone interacting in the Ruby OpenAI project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.