
[Bug]: stream_chunk_builder fails to include tool_calls preceded by content #2716

Open · stephenfreund opened this issue Mar 27, 2024 · 6 comments
Labels: bug (Something isn't working)

@stephenfreund

What happened?

If the stream produced by a call to litellm.completion(..., stream=True) contains content deltas before tool-call deltas, the result of calling stream_chunk_builder on the list of all chunks from the stream omits the tool calls. The test case below demonstrates the issue on runs in which gpt-4 responds first with content and then with a call to get_current_weather. The output of one such run is attached.

import litellm

prompt = """
Tell me word that starts with "aar".
Then tell me the weather in San Francisco.
"""

messages = [{"role": "user", "content": prompt}]

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                }
                },
                "required": ["location"],
            },
        },
    }
]

stream = litellm.completion(
    model="gpt-4",
    messages=messages,
    tools=tools,
    stream=True
)

chunks = []
for chunk in stream:
    print(chunk.choices[0].delta)
    chunks.append(chunk)

response = litellm.stream_chunk_builder(chunks, messages=messages)

print("\n-----\n")
print("stream_chunk_builder result:\n", response)

Relevant log output

Delta(content='The', role='assistant', function_call=None, tool_calls=None)
Delta(content=' word', role=None, function_call=None, tool_calls=None)
Delta(content=' that', role=None, function_call=None, tool_calls=None)
Delta(content=' starts', role=None, function_call=None, tool_calls=None)
Delta(content=' with', role=None, function_call=None, tool_calls=None)
Delta(content=' "', role=None, function_call=None, tool_calls=None)
Delta(content='aar', role=None, function_call=None, tool_calls=None)
Delta(content='"', role=None, function_call=None, tool_calls=None)
Delta(content=' is', role=None, function_call=None, tool_calls=None)
Delta(content=' "', role=None, function_call=None, tool_calls=None)
Delta(content='a', role=None, function_call=None, tool_calls=None)
Delta(content='ard', role=None, function_call=None, tool_calls=None)
Delta(content='v', role=None, function_call=None, tool_calls=None)
Delta(content='ark', role=None, function_call=None, tool_calls=None)
Delta(content='".\n\n', role=None, function_call=None, tool_calls=None)
Delta(content='Let', role=None, function_call=None, tool_calls=None)
Delta(content=' me', role=None, function_call=None, tool_calls=None)
Delta(content=' check', role=None, function_call=None, tool_calls=None)
Delta(content=' the', role=None, function_call=None, tool_calls=None)
Delta(content=' weather', role=None, function_call=None, tool_calls=None)
Delta(content=' in', role=None, function_call=None, tool_calls=None)
Delta(content=' San', role=None, function_call=None, tool_calls=None)
Delta(content=' Francisco', role=None, function_call=None, tool_calls=None)
Delta(content=' for', role=None, function_call=None, tool_calls=None)
Delta(content=' you', role=None, function_call=None, tool_calls=None)
Delta(content='.', role=None, function_call=None, tool_calls=None)
Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id='call_kT6n9k2oyBaNtko9oqluSjvY', function=Function(arguments='', name='get_current_weather'), type='function', index=0)])
Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments='{\n', name=None), type=None, index=0)])
Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments=' ', name=None), type=None, index=0)])
Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments=' "', name=None), type=None, index=0)])
Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments='location', name=None), type=None, index=0)])
Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments='":', name=None), type=None, index=0)])
Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments=' "', name=None), type=None, index=0)])
Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments='San', name=None), type=None, index=0)])
Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments=' Francisco', name=None), type=None, index=0)])
Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments=',', name=None), type=None, index=0)])
Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments=' CA', name=None), type=None, index=0)])
Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments='"\n', name=None), type=None, index=0)])
Delta(content=None, role=None, function_call=None, tool_calls=[ChatCompletionDeltaToolCall(id=None, function=Function(arguments='}', name=None), type=None, index=0)])
Delta(content=None, role=None, function_call=None, tool_calls=None)

-----

stream_chunk_builder result:
 ModelResponse(id='chatcmpl-97QlEIdc7JfojkM4G6uwSc4GZT7Y8', choices=[Choices(finish_reason='tool_calls', index=0, message=Message(content='The word that starts with "aar" is "aardvark".\n\nLet me check the weather in San Francisco for you.', role='assistant'))], created=1711558192, model='gpt-4', object='chat.completion', system_fingerprint=None, usage=Usage(completion_tokens=26, prompt_tokens=26, total_tokens=52))
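
For reference, until the builder handles this case, the chunks can be aggregated by hand. The sketch below is not a litellm API; merge_chunks is a hypothetical helper that relies only on the Delta and ChatCompletionDeltaToolCall shapes visible in the log above:

def merge_chunks(chunks):
    """Merge streamed deltas into (content, tool_calls) by hand."""
    content_parts = []
    tool_calls = {}  # tool-call index -> accumulated id/name/arguments
    for chunk in chunks:
        delta = chunk.choices[0].delta
        if delta.content:
            content_parts.append(delta.content)
        for tc in delta.tool_calls or []:
            acc = tool_calls.setdefault(
                tc.index, {"id": None, "name": None, "arguments": ""}
            )
            if tc.id:
                acc["id"] = tc.id
            if tc.function.name:
                acc["name"] = tc.function.name
            if tc.function.arguments:
                acc["arguments"] += tc.function.arguments
    return "".join(content_parts), [tool_calls[i] for i in sorted(tool_calls)]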


@stephenfreund stephenfreund added the bug Something isn't working label Mar 27, 2024
@krrishdholakia
Contributor

Hey @stephenfreund, can you share what the expected response here should be?

I can fix it accordingly and add it to our CI/CD!

Thanks for this issue

@stephenfreund
Author

Hi,

Absolutely. Here's an example of what I would expect.

When I create a completion without streaming:

response = litellm.completion(
    model="gpt-4",
    messages=messages,
    tools=tools,
)

I get responses like this:

ModelResponse(
    id="chatcmpl-97lEc2asXEk3w7yHiVH6nUqqIx7FD",
    choices=[
        Choices(
            finish_reason="tool_calls",
            index=0,
            message=Message(
                content='The word "Aardvark" starts with "aar".\n\nLet me check the weather in San Francisco for you.',
                role="assistant",
                tool_calls=[
                    ChatCompletionMessageToolCall(
                        function=Function(
                            arguments='{\n  "location": "San Francisco, CA"\n}',
                            name="get_current_weather",
                        ),
                        id="call_SMaGOY7OUR47n4phOeAyjHhB",
                        type="function",
                    )
                ],
            ),
        )
    ],
    created=1711636894,
    model="gpt-4-0613",
    object="chat.completion",
    system_fingerprint=None,
    usage=Usage(completion_tokens=44, prompt_tokens=82, total_tokens=126),
)

where the response Message has both the content and the tool calls. I would expect the response built with stream_chunk_builder to have this same form whenever a model responds with both content and a tool call in the streamed version.
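
Concretely, a regression check along these lines (reusing the messages and tools from the repro above) would capture that expectation; the tool_calls assertions are the ones that currently fail:

chunks = list(litellm.completion(
    model="gpt-4", messages=messages, tools=tools, stream=True
))
rebuilt = litellm.stream_chunk_builder(chunks, messages=messages)
message = rebuilt.choices[0].message

assert message.content  # the text produced before the tool call
assert message.tool_calls  # currently missing -- this is the bug
assert message.tool_calls[0].function.name == "get_current_weather"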

Happy to provide any other details that would be useful. Thanks!

Steve.

@aantn

aantn commented Nov 10, 2024

Does this still occur?


github-actions bot commented Feb 9, 2025

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@github-actions github-actions bot added the stale label Feb 9, 2025
@krrishdholakia
Contributor

Not stale - still needs to be investigated. Adding to the Feb 2025 roadmap.

@github-actions github-actions bot removed the stale label Feb 10, 2025
@krrishdholakia
Contributor

cc: @vibhavbhat
