feat (provider/openai): support reasoning_effort setting #4139

Merged 3 commits on Dec 18, 2024
5 changes: 5 additions & 0 deletions .changeset/new-apes-knock.md
@@ -0,0 +1,5 @@
---
'@ai-sdk/openai': patch
---

feat (provider/openai): support reasoning_effort setting
26 changes: 19 additions & 7 deletions content/providers/01-ai-sdk-providers/01-openai.mdx
@@ -212,6 +212,12 @@ The following optional settings are available for OpenAI chat models:
  Enable this if the model that you are using does not support streaming.
  Defaults to `false`.

- **reasoningEffort** _'low' | 'medium' | 'high'_

  Reasoning effort for reasoning models. Defaults to `medium`. If you use
  `experimental_providerMetadata` to set the `reasoningEffort` option, this
  model setting will be ignored.

#### Structured Outputs

You can enable [OpenAI structured outputs](https://openai.com/index/introducing-structured-outputs-in-the-api/) by setting the `structuredOutputs` option to `true`.
@@ -352,21 +358,27 @@ Currently, `o1`, `o1-mini`, and `o1-preview` are available.

Reasoning models currently only generate text, have several limitations, and are only supported using `generateText` and `streamText`.

-Reasoning models support two additional options:
+Reasoning models support additional settings and response metadata:

+- You can use `experimental_providerMetadata` to set
+
+  - the `maxCompletionTokens` option, which determines the maximum number of both reasoning and output tokens that the model generates.
+  - the `reasoningEffort` option (or alternatively the `reasoningEffort` model setting), which determines the amount of reasoning the model performs.
+
-- You can use request `experimental_providerMetadata` to set the `maxCompletionTokens` option, which determines the maximum number
-  of both reasoning and output tokens that the model generates.
- You can use response `experimental_providerMetadata` to access the number of reasoning tokens that the model generated.

-```ts highlight="4,7-9,15"
+```ts highlight="4,7-12,18"
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

const { text, usage, experimental_providerMetadata } = await generateText({
  model: openai('o1-mini'),
  prompt: 'Invent a new holiday and describe its traditions.',
  experimental_providerMetadata: {
-   openai: { maxCompletionTokens: 1000 },
+   openai: {
+     reasoningEffort: 'low',
+     maxCompletionTokens: 1000,
+   },
  },
});

@@ -380,8 +392,8 @@ console.log('Usage:', {
<Note type="warning">
  Reasoning models like `o1`, `o1-preview`, and `o1-mini` require additional
  runtime inference to complete their reasoning phase before generating a
-  response. This introduces longer latency compared to other models, with
-  `o1-preview` exhibiting significantly more inference time than `o1-mini`.
+  response. You can use the `reasoningEffort` model setting to influence the
+  inference time.
</Note>
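For orientation, here is a minimal sketch of how the two configuration paths documented above interact; the model id and prompt are illustrative, and the precedence follows the rule stated in the docs and exercised by the tests below (provider metadata wins over the model setting):

```ts
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

// Option 1: fix the effort once, as a model setting.
const model = openai('o1-mini', { reasoningEffort: 'high' });

// Option 2: override it per request via provider metadata.
// When both are present, the provider metadata value wins,
// so this call is sent with reasoning_effort: 'low'.
const { text } = await generateText({
  model,
  prompt: 'Invent a new holiday and describe its traditions.',
  experimental_providerMetadata: {
    openai: { reasoningEffort: 'low' },
  },
});
```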

#### Prompt Caching
60 changes: 60 additions & 0 deletions packages/openai/src/openai-chat-language-model.test.ts
@@ -411,6 +411,66 @@ describe('doGenerate', () => {
    });
  });

  it('should pass reasoningEffort setting from provider metadata', async () => {
    prepareJsonResponse({ content: '' });

    const model = provider.chat('o1-mini');

    await model.doGenerate({
      inputFormat: 'prompt',
      mode: { type: 'regular' },
      prompt: TEST_PROMPT,
      providerMetadata: {
        openai: { reasoningEffort: 'low' },
      },
    });

    expect(await server.getRequestBodyJson()).toStrictEqual({
      model: 'o1-mini',
      messages: [{ role: 'user', content: 'Hello' }],
      reasoning_effort: 'low',
    });
  });

  it('should pass reasoningEffort setting from settings', async () => {
    prepareJsonResponse({ content: '' });

    const model = provider.chat('o1-mini', { reasoningEffort: 'high' });

    await model.doGenerate({
      inputFormat: 'prompt',
      mode: { type: 'regular' },
      prompt: TEST_PROMPT,
    });

    expect(await server.getRequestBodyJson()).toStrictEqual({
      model: 'o1-mini',
      messages: [{ role: 'user', content: 'Hello' }],
      reasoning_effort: 'high',
    });
  });

  it('should prioritize reasoningEffort from provider metadata over settings', async () => {
    prepareJsonResponse({ content: '' });

    const model = provider.chat('o1-mini', { reasoningEffort: 'high' });

    await model.doGenerate({
      inputFormat: 'prompt',
      mode: { type: 'regular' },
      prompt: TEST_PROMPT,
      providerMetadata: {
        openai: { reasoningEffort: 'low' },
      },
    });

    expect(await server.getRequestBodyJson()).toStrictEqual({
      model: 'o1-mini',
      messages: [{ role: 'user', content: 'Hello' }],
      reasoning_effort: 'low',
    });
  });

  it('should pass tools and toolChoice', async () => {
    prepareJsonResponse({ content: '' });

12 changes: 7 additions & 5 deletions packages/openai/src/openai-chat-language-model.ts
@@ -176,11 +176,13 @@ export class OpenAIChatLanguageModel implements LanguageModelV1 {
      seed,

      // openai specific settings:
-     max_completion_tokens:
-       providerMetadata?.openai?.maxCompletionTokens ?? undefined,
-     store: providerMetadata?.openai?.store ?? undefined,
-     metadata: providerMetadata?.openai?.metadata ?? undefined,
-     prediction: providerMetadata?.openai?.prediction ?? undefined,
+     max_completion_tokens: providerMetadata?.openai?.maxCompletionTokens,
+     store: providerMetadata?.openai?.store,
+     metadata: providerMetadata?.openai?.metadata,
+     prediction: providerMetadata?.openai?.prediction,
+     reasoning_effort:
+       providerMetadata?.openai?.reasoningEffort ??
+       this.settings.reasoningEffort,

      // messages:
      messages: convertToOpenAIChatMessages({
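The `??` chain added above is the entire precedence rule that the tests exercise. A self-contained sketch of the same resolution logic, with hypothetical names that are not taken from the diff:

```ts
type ReasoningEffort = 'low' | 'medium' | 'high';

// Per-request provider metadata wins; the constructor-level model
// setting is the fallback; if both are undefined, the field is
// omitted and OpenAI applies its own server-side default ('medium').
function resolveReasoningEffort(
  fromProviderMetadata?: ReasoningEffort,
  fromSettings?: ReasoningEffort,
): ReasoningEffort | undefined {
  return fromProviderMetadata ?? fromSettings;
}

console.log(resolveReasoningEffort('low', 'high')); // 'low'
console.log(resolveReasoningEffort(undefined, 'high')); // 'high'
console.log(resolveReasoningEffort()); // undefined
```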
5 changes: 5 additions & 0 deletions packages/openai/src/openai-chat-settings.ts
@@ -104,4 +104,9 @@ Enable this if the model that you are using does not support streaming.
Defaults to `false`.
   */
  simulateStreaming?: boolean;

  /**
Reasoning effort for reasoning models. Defaults to `medium`.
   */
  reasoningEffort?: 'low' | 'medium' | 'high';
}
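For completeness, a usage sketch of the new setting via a custom provider instance. `createOpenAI` and `provider.chat()` are existing APIs of this package; the model id and API-key handling are illustrative assumptions:

```ts
import { createOpenAI } from '@ai-sdk/openai';

// Hypothetical setup; in practice the key is usually picked up
// from the OPENAI_API_KEY environment variable automatically.
const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });

// reasoningEffort is typed as 'low' | 'medium' | 'high'; leaving it
// unset falls back to OpenAI's server-side default of 'medium'.
const model = openai.chat('o1-mini', { reasoningEffort: 'low' });
```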