
Add support to LogProbs retrieval by implementing the native Structured Output Parser from OpenAI #1223

Open
WittmannF opened this issue Nov 27, 2024 · 0 comments
Labels
documentation Improvements or additions to documentation enhancement New feature or request

Is your feature request related to a problem? Please describe.
I've been reading the previous pull requests and found that #916 was deprecated in favor of #938, which implements structured outputs via tool calls with strict set to True. In theory this is great for compatibility with older models; however, as reported in #742, tool/function calls don't return logprobs. Here's an example:

```python
import openai

openai_client = openai.OpenAI()

completion = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    tools=[
        {
            "type": "function",
            "function": {
                "name": "ExtractUser",
                "description": "Correctly extracted `ExtractUser` with all the required parameters with correct types",
                "parameters": {
                    "properties": {
                        "name": {"title": "Name", "type": "string"},
                        "age": {"title": "Age", "type": "integer"},
                    },
                    "required": ["age", "name"],
                    "type": "object",
                },
            },
        }
    ],
    tool_choice={"type": "function", "function": {"name": "ExtractUser"}},
    logprobs=True,
    messages=[
        {"role": "user", "content": "Extract Jason is 25 years old"},
    ],
)
print(completion.choices[0].logprobs)

# Output: ChoiceLogprobs(content=None, refusal=None)
```

By contrast, the native structured output parser from OpenAI does provide logprobs:

```python
import openai
from pydantic import BaseModel

client = openai.OpenAI()

class User(BaseModel):
    name: str
    age: int

response = client.beta.chat.completions.parse(
    model="gpt-4o-mini",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Extract John is 18 years old.",
        }
    ],
    response_format=User,
    logprobs=True,
)

response_logprobs = response.choices[0].logprobs.content
print(response_logprobs)

# output: [ChatCompletionTokenLogprob(token='{"', bytes=[123, 34], logprob=-1.9361265e-07, top_logprobs=[]), ChatCompletionTokenLogprob(token='name', bytes=[110, 97, 109, 101], logprob=0.0, top_logprobs=[]), ChatCompletionTokenLogprob(token='":"', bytes=[34, 58, 34], logprob=0.0, top_logprobs=[]), ChatCompletionTokenLogprob(token='John', bytes=[74, 111, 104, 110], logprob=0.0, top_logprobs=[]), ChatCompletionTokenLogprob(token='","', bytes=[34, 44, 34], logprob=0.0, top_logprobs=[]), ChatCompletionTokenLogprob(token='age', bytes=[97, 103, 101], logprob=0.0, top_logprobs=[]), ChatCompletionTokenLogprob(token='":', bytes=[34, 58], logprob=0.0, top_logprobs=[]), ChatCompletionTokenLogprob(token='18', bytes=[49, 56], logprob=0.0, top_logprobs=[]), ChatCompletionTokenLogprob(token='}', bytes=[125], logprob=0.0, top_logprobs=[])]
```
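For context on why this matters: once per-token logprobs are available, they can be combined into a rough confidence estimate for the parsed output. This is a minimal sketch, independent of the OpenAI client; the `joint_logprob_probability` helper and the sample values are hypothetical, shaped like the output above:

```python
import math

def joint_logprob_probability(logprobs):
    # Sum of log-probabilities is the log of the product of per-token
    # probabilities, so exp(sum) gives the probability of the whole sequence.
    return math.exp(sum(logprobs))

# Hypothetical per-token logprobs, shaped like the parse() output above
token_logprobs = [-1.9361265e-07, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
confidence = joint_logprob_probability(token_logprobs)
print(confidence)  # just under 1.0, since every token is near-certain
```

This kind of downstream use is the motivation for exposing logprobs in the first place, since the tool-call path returns `ChoiceLogprobs(content=None, refusal=None)` and makes it impossible.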

Describe the solution you'd like
Continue #916

Describe alternatives you've considered
Using the native structured outputs from the OpenAI client (client.beta.chat.completions.parse)

@github-actions github-actions bot added documentation Improvements or additions to documentation enhancement New feature or request labels Nov 27, 2024