
[Feature Request]: Generalize token accounting and budgeting #1680

Closed
afourney opened this issue Feb 14, 2024 · 1 comment
Assignees
Labels
enhancement New feature or request

Comments

@afourney
Member

afourney commented Feb 14, 2024

Is your feature request related to a problem? Please describe.

At present, model token limits are hardcoded here:

max_token_limit = {
    "gpt-3.5-turbo": 4096,
    "gpt-3.5-turbo-0301": 4096,
    "gpt-3.5-turbo-0613": 4096,
    "gpt-3.5-turbo-instruct": 4096,
    "gpt-3.5-turbo-16k": 16385,
    "gpt-3.5-turbo-16k-0613": 16385,
    "gpt-3.5-turbo-1106": 16385,
    "gpt-4": 8192,
    "gpt-4-32k": 32768,
    "gpt-4-32k-0314": 32768,  # deprecate in Sep
    "gpt-4-0314": 8192,  # deprecate in Sep
    "gpt-4-0613": 8192,
    "gpt-4-32k-0613": 32768,
    "gpt-4-1106-preview": 128000,
    "gpt-4-0125-preview": 128000,
    "gpt-4-turbo-preview": 128000,
    "gpt-4-vision-preview": 128000,
}

Token costs are hardcoded here:

OAI_PRICE1K = {
    "text-ada-001": 0.0004,
    "text-babbage-001": 0.0005,
    "text-curie-001": 0.002,
    "code-cushman-001": 0.024,
    "code-davinci-002": 0.1,
    "text-davinci-002": 0.02,
    "text-davinci-003": 0.02,
    "gpt-3.5-turbo-instruct": (0.0015, 0.002),
    "gpt-3.5-turbo-0301": (0.0015, 0.002),  # deprecate in Sep
    "gpt-3.5-turbo-0613": (0.0015, 0.002),
    "gpt-3.5-turbo-16k": (0.003, 0.004),
    "gpt-3.5-turbo-16k-0613": (0.003, 0.004),
    "gpt-35-turbo": (0.0015, 0.002),
    "gpt-35-turbo-16k": (0.003, 0.004),
    "gpt-35-turbo-instruct": (0.0015, 0.002),
    "gpt-4": (0.03, 0.06),
    "gpt-4-32k": (0.06, 0.12),
    "gpt-4-0314": (0.03, 0.06),  # deprecate in Sep
    "gpt-4-32k-0314": (0.06, 0.12),  # deprecate in Sep
    "gpt-4-0613": (0.03, 0.06),
    "gpt-4-32k-0613": (0.06, 0.12),
    # 11-06
    "gpt-3.5-turbo": (0.0015, 0.002),  # default is still 0613
    "gpt-3.5-turbo-1106": (0.001, 0.002),
    "gpt-35-turbo-1106": (0.001, 0.002),
    "gpt-4-1106-preview": (0.01, 0.03),
    "gpt-4-0125-preview": (0.01, 0.03),
    "gpt-4-turbo-preview": (0.01, 0.03),
    "gpt-4-1106-vision-preview": (0.01, 0.03),  # TODO: support vision pricing of images
}
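For context, a per-1K price table like this is typically applied to a usage record along the following lines. This is a minimal sketch using a local excerpt of the table; `completion_cost` is an illustrative helper, not the library's API:

```python
# Sketch: applying (input_price, output_price) per-1K-token pricing to a
# completion. OAI_PRICE1K entries here are a local excerpt for illustration.
OAI_PRICE1K = {
    "gpt-4": (0.03, 0.06),
    "gpt-3.5-turbo-1106": (0.001, 0.002),
}

def completion_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Dollar cost of one request, given token counts from the API's usage field."""
    price_in, price_out = OAI_PRICE1K[model]
    return (prompt_tokens * price_in + completion_tokens * price_out) / 1000

print(round(completion_cost("gpt-4", 1000, 500), 6))  # 0.06
```

Note that a `KeyError` is raised for any model name missing from the table, which is exactly the maintenance burden described below.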

However, there are a number of issues with this approach, including:

  • The name "max_token_limit" is confusing and can clash with "max_tokens", which is a parameter of the chat completion request, and refers exclusively to output tokens.
  • Adding new models requires editing both files. It's easy to forget one or the other.
  • It's keyed off model names, but those names may not be reliable. For example, "gpt-4" on Azure might point to an 8k, 32k, or 128k model; there is no requirement that deployment names match OpenAI's. Likewise, a deployment name may not match any known model at all (e.g., "my-gpt-4").
  • Adding new models or providers requires hand-editing core library files.
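The name-reliability problem in particular can be sketched like this, with a local excerpt of the table and a hypothetical `get_limit` helper:

```python
# Sketch: a table keyed on OpenAI model names silently misses Azure
# deployment names. This is a local excerpt for illustration only.
max_token_limit = {
    "gpt-4": 8192,
    "gpt-4-32k": 32768,
}

def get_limit(model: str, default: int = 4096) -> int:
    # An Azure deployment called "my-gpt-4" may actually serve a 32k model,
    # but a name-keyed lookup can only fall back to a guess.
    return max_token_limit.get(model, default)

print(get_limit("gpt-4"))     # 8192
print(get_limit("my-gpt-4"))  # 4096 -- wrong if the deployment is a 32k model
```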

Describe the solution you'd like

I would like to be able to provide token accounting information in the OAI_CONFIG_LIST.

Perhaps something like this:

[
    {
        "model": "gpt-4-turbo",
        "api_key": "blahblahblah",
        "base_url": "https://mymodel.openai.azure.com/",
        "api_type": "azure",
        "api_version": "2023-12-01-preview",
        "window_size": [128000, 4096],
        "1k_token_cost": [0.01, 0.03]
    }
]

In the case where input and output window sizes are not distinguished:

[
    {
        "model": "gpt-3.5-turbo",
        "api_key": "blahblahblah",
        "base_url": "https://mymodel.openai.azure.com/",
        "api_type": "azure",
        "api_version": "2023-12-01-preview",
        "window_size": 16385,
        "1k_token_cost": [0.0005, 0.0015]
    }
]

We would then use these values when present to compute costs, token limits, etc.
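A minimal sketch of that lookup, assuming the `window_size` and `1k_token_cost` keys proposed above; the fallback tables and the `window_size`/`cost` helpers are illustrative names, not existing APIs:

```python
# Sketch: prefer per-entry accounting info from the config, fall back to
# hardcoded tables (hypothetical local excerpts) only when absent.
FALLBACK_LIMITS = {"gpt-4-turbo": 128000}
FALLBACK_PRICES = {"gpt-4-turbo": (0.01, 0.03)}

def window_size(cfg: dict) -> int:
    ws = cfg.get("window_size")
    if ws is not None:
        # Either a single int, or an [input, output] pair whose first
        # element is the context window.
        return ws[0] if isinstance(ws, (list, tuple)) else ws
    return FALLBACK_LIMITS[cfg["model"]]

def cost(cfg: dict, prompt_tokens: int, completion_tokens: int) -> float:
    price_in, price_out = cfg.get("1k_token_cost") or FALLBACK_PRICES[cfg["model"]]
    return (prompt_tokens * price_in + completion_tokens * price_out) / 1000

cfg = {
    "model": "gpt-4-turbo",
    "window_size": [128000, 4096],
    "1k_token_cost": [0.01, 0.03],
}
print(window_size(cfg))                      # 128000
print(round(cost(cfg, 1000, 1000), 6))       # 0.04
```

One design note: keeping the hardcoded tables as a fallback preserves backward compatibility, while any entry in OAI_CONFIG_LIST can override them for Azure deployments or custom model names.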

Additional context

No response

@afourney
Member Author

#1682 was closed, so I'm closing this issue as well. Reopen as needed.
