We Need a Token Count #175

aiInvader · 2023-10-09T19:06:40Z

This is a super crucial feature.

When finished executing, we need the token count.

qingyun-wu · 2023-10-10T00:19:54Z

If you enable logging, you should be able to get the cost of the API calls in the logged info. Check code examples as follows:
Enable logging: https://github.com/microsoft/autogen/blob/main/test/agentchat/test_assistant_agent.py#L122

Check the logged info: https://github.com/microsoft/autogen/blob/main/test/agentchat/test_assistant_agent.py#L150

@sonichi @kevin666aa please add anything else you know relevant to this question, and correct me if I am wrong.

aiInvader · 2023-10-10T01:05:29Z

I just tried your suggested this and it logged this:

{
    "[{\"content\": \"You are a helpful AI assistant.\\nSolve tasks using your coding and language skills.\\nIn the following cases, suggest python code (in a python coding block) or shell script (in a sh coding block) for the user to execute.\\n    1. When you need to collect info, use the code to output the info you need, for example, browse or search the web, download/read a file, print the content of a webpage or a file, get the current date/time, check the operating system. After sufficient info is printed and the task is ready to be solved based on your language skill, you can solve the task by yourself.\\n    2. When you need to perform some task with code, use the code to perform the task and output the result. Finish the task smartly.\\nSolve the task step by step if you need to. If a plan is not provided, explain your plan first. Be clear which step uses code, and which step uses your language skill.\\nWhen using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest. The user can't modify your code. So do not suggest incomplete code which requires users to modify. Don't use a code block if it's not intended to be executed by the user.\\nIf you want the user to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user.\\nIf the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.\\nWhen you find an answer, verify the answer carefully. Include verifiable evidence in your response if possible.\\nReply \\\"TERMINATE\\\" in the end when everything is done.\\n    \", \"role\": \"system\"}, {\"content\": \"Update the existing Flask server to also serve the type in addition to the name.\", \"role\": \"user\"}, {\"content\": \"To respond to this request, we would have to modify the Flask server's code to include the 'type' attribute. Let's assume you already have a Flask web server that has endpoints serving 'name', we'll simply add a route to also serve 'type'. \\n\\nLet's assume your server.py file looks something like this and we will add the type functionality:\\n\\n```python\\n# filename: server.py\\nfrom flask import Flask, jsonify\\n\\napp = Flask(__name__)\\n\\n@app.route('/name/<string:name>')\\ndef return_name(name):\\n    return jsonify({'name': name})\\n\\n# Adding new route to return type\\n@app.route('/type/<string:type>')\\ndef return_type(type):\\n    return jsonify({'type': type})\\n\\nif __name__ == '__main__':\\n    app.run(debug=True)\\n```\\n\\nThis Flask server will serve 'type' when requested through the '/type/' route. Replace '<string:type>' with the type you want to fetch and the server will return a JSON object that includes the type.\\n\\nPlease save the changes and run your server again. You can run the server using the following command in the terminal:\\n\\n```sh\\npython server.py\\n```\\n\\nThis will start the server and you should be able to navigate to \\\"http://localhost:<port>/type/<typename>\\\" to see your changes.\", \"role\": \"assistant\"}]": {
        "created_at": [
            0
        ],
        "cost": [
            0.030479999999999997
        ]
    }
}

I assume that is cost in USD, and not token count, correct?
(although I understand Token Count can be inferred from this Token Count = Total Cost / Cost per Token = 0.030479999999999997 / (0.06 /1000) = 508)

sonichi · 2023-10-10T01:40:08Z

Let's add token count to the logging. Anyone interested in adding it?
@thinkall has added some utility functions in retrieve_utils.py. So maybe we can mark it a good first issue.

yiranwu0 · 2023-10-10T02:12:51Z

Let's add token count to the logging. Anyone interested in adding it? @thinkall has added some utility functions in retrieve_utils.py. So maybe we can mark it a good first issue.

We can utilize token_count returned from oai directly, If we set compact=False in the logging, there are already token count from each message. We just need to extract them and show them when compact=True.

I will add this quickly.

{
   'object': 'chat.completion',
   'model': 'gpt-4-0613',
   'choices': [{'index': 0,
     'message': {'role': 'assistant',
      'content': 'To create a plot such as a rocket, we can use ASCII art. ASCII (American Standard Code for Information Interchange) art is a graphic design format that uses standard printable characters. Let\'s create a simple ASCII art that represents a rocket.\n\nHere is a Python script which uses print statements to create a rocket.\n\n```python\n# filename: plot_rocket.py\n\nrocket = """\n   ^\n  [/](https://vscode-remote+ssh-002dremote-002bpsu-005flab.vscode-resource.vscode-cdn.net/)|\\\\\n [/](https://vscode-remote+ssh-002dremote-002bpsu-005flab.vscode-resource.vscode-cdn.net/) | \\\\\n/  |  \\\\\n|  |  |\n|  |  |\n+--+--+\n|  |  |\n+--+--+\n \'  \'  \'\n"""\n\nprint(rocket) \n```\n\nYou can run this script from the terminal by navigating to the directory where you saved this file and then run the following command:\n\n```sh\npython plot_rocket.py\n```'},
     'finish_reason': 'stop'}],
   'usage': {'prompt_tokens': 484,
    'completion_tokens': 163,
    'total_tokens': 647},
   'cost': 0.0243}}

thinkall · 2023-10-10T04:19:03Z

Do we need something to estimate the cost/tokens before actually call the APIs? @sonichi @kevin666aa @qingyun-wu

yiranwu0 · 2023-10-10T04:38:35Z

Do we need something to estimate the cost/tokens before actually call the APIs? @sonichi @kevin666aa @qingyun-wu

I am calculating the prompt tokens and compress if prompt tokens are too big before call the APIs here: https://github.com/microsoft/autogen/pull/131/files

It is hard to estimate completion tokens though.

TomExMachina · 2023-10-10T07:33:25Z

The token counts are a metric that need to always be available to both the developer and the agent.

qingyun-wu assigned yiranwu0 Oct 9, 2023

sonichi added the enhancement New feature or request label Oct 10, 2023

yiranwu0 mentioned this issue Oct 10, 2023

Improving logging in oai.completion to show token_count #179

Merged

3 tasks

qingyun-wu closed this as completed in #179 Oct 12, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

We Need a Token Count #175

We Need a Token Count #175

aiInvader commented Oct 9, 2023

qingyun-wu commented Oct 10, 2023 •

edited

Loading

aiInvader commented Oct 10, 2023

sonichi commented Oct 10, 2023 •

edited

Loading

yiranwu0 commented Oct 10, 2023 •

edited

Loading

thinkall commented Oct 10, 2023

yiranwu0 commented Oct 10, 2023

TomExMachina commented Oct 10, 2023

We Need a Token Count #175

We Need a Token Count #175

Comments

aiInvader commented Oct 9, 2023

qingyun-wu commented Oct 10, 2023 • edited Loading

aiInvader commented Oct 10, 2023

sonichi commented Oct 10, 2023 • edited Loading

yiranwu0 commented Oct 10, 2023 • edited Loading

thinkall commented Oct 10, 2023

yiranwu0 commented Oct 10, 2023

TomExMachina commented Oct 10, 2023

qingyun-wu commented Oct 10, 2023 •

edited

Loading

sonichi commented Oct 10, 2023 •

edited

Loading

yiranwu0 commented Oct 10, 2023 •

edited

Loading