add max_completions_tokens for o1 series models #857

chococola · 2024-09-20T18:01:11Z

Describe the change
For new o1 series models attribute max_tokens not working. In documentation this field marks deprecated.
OpenAi introduced new attribute max_completion_tokens.

To manage costs with the o1 series models, you can limit the total number of tokens the model generates (including both reasoning and completion tokens) by using the [max_completion_tokens](https://platform.openai.com/docs/api-reference/chat/create#chat-create-max_completion_tokens) parameter.

In previous models, the max_tokens parameter controlled both the number of tokens generated and the number of tokens visible to the user, which were always equal. However, with the o1 series, the total tokens generated can exceed the number of visible tokens due to the internal reasoning tokens.

Because some applications might rely on max_tokens matching the number of tokens received from the API, the o1 series introduces max_completion_tokens to explicitly control the total number of tokens generated by the model, including both reasoning and visible completion tokens. This explicit opt-in ensures no existing applications break when using the new models. The max_tokens parameter continues to function as before for all previous models.

Provide OpenAI documentation link
https://platform.openai.com/docs/api-reference/chat/create#chat-create-max_tokens
https://platform.openai.com/docs/api-reference/chat/create#chat-create-max_completion_tokens
https://platform.openai.com/docs/guides/reasoning/controlling-costs
https://community.openai.com/t/why-was-max-tokens-changed-to-max-completion-tokens/938077

Describe your solution
Update ChatCompletionRequest model

codecov · 2024-09-20T18:06:59Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.03%. Comparing base (774fc9d) to head (be774ca).
Report is 51 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #857      +/-   ##
==========================================
+ Coverage   98.46%   99.03%   +0.57%     
==========================================
  Files          24       26       +2     
  Lines        1364     1452      +88     
==========================================
+ Hits         1343     1438      +95     
+ Misses         15        8       -7     
  Partials        6        6

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

chococola · 2024-09-20T18:08:39Z

@sashabaranov please take a look to the PR 🙃

sashabaranov · 2024-09-20T18:18:53Z

@chococola great catch, thank you for the PR!

Could we also panic when o1 is used and max_tokens≠0?

sashabaranov · 2024-09-20T18:20:32Z

Or just return an error, panic might be too much :)

chococola · 2024-09-20T18:34:27Z

@chococola great catch, thank you for the PR!

Could we also panic when o1 is used and max_tokens≠0?

panic would be to much, and propably break backwards capabilities for oldest models.

Maybe we should mark attribute as deprecated ? or leave the comment?

return an error

in all places ? and what about batch requests ?

sashabaranov · 2024-09-20T18:42:10Z

@chococola let's cover the most frequent chat completion + streaming case, all other cases would eventually be covered by other PRs :D

chococola · 2024-09-20T20:33:35Z

@sashabaranov i add validation check for o1 series models and add some validation for fields which are limited in beta
https://platform.openai.com/docs/guides/reasoning/beta-limitations

sashabaranov

Wow, o1 is a can of worms! Thank you so much for looking so deeply into this! 🙌🏻

add max_completions_tokens for o1 series models

0ffcd83

add validation for o1 series models validataion + beta limitations

be774ca

chococola force-pushed the max_completions_tokens branch from 4d313aa to be774ca Compare September 20, 2024 20:31

sashabaranov approved these changes Sep 20, 2024

View reviewed changes

sashabaranov merged commit 9add1c3 into sashabaranov:master Sep 20, 2024
3 checks passed

sashabaranov mentioned this pull request Sep 20, 2024

feat: add MaxCompletionTokens and CompletionTokensDetails #851

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add max_completions_tokens for o1 series models #857

add max_completions_tokens for o1 series models #857

chococola commented Sep 20, 2024

codecov bot commented Sep 20, 2024 •

edited

Loading

chococola commented Sep 20, 2024

sashabaranov commented Sep 20, 2024

sashabaranov commented Sep 20, 2024

chococola commented Sep 20, 2024

sashabaranov commented Sep 20, 2024

chococola commented Sep 20, 2024

sashabaranov left a comment

add max_completions_tokens for o1 series models #857

add max_completions_tokens for o1 series models #857

Conversation

chococola commented Sep 20, 2024

codecov bot commented Sep 20, 2024 • edited Loading

Codecov Report

chococola commented Sep 20, 2024

sashabaranov commented Sep 20, 2024

sashabaranov commented Sep 20, 2024

chococola commented Sep 20, 2024

sashabaranov commented Sep 20, 2024

chococola commented Sep 20, 2024

sashabaranov left a comment

Choose a reason for hiding this comment

codecov bot commented Sep 20, 2024 •

edited

Loading