
[Issue]: Workflow terminates after 2 tokens when using AutogenStudio with LM Studio #2445

Open
Freffles opened this issue Apr 19, 2024 · 12 comments
Labels
0.2 (issues filed before the re-arch to 0.4) · needs-triage · proj-studio (related to AutoGen Studio)

Comments

@Freffles

Describe the issue

If I create a model in AutoGen Studio that points to the LM Studio endpoint, add the model to an agent, and then add the agent to a workflow, the workflow terminates after 2 tokens when I run it. The same setup works perfectly with Ollama, but with LM Studio the problem occurs every time, regardless of which model I use. This is the output from the LM Studio server log.

[2024-04-19 16:25:21.195] [INFO] [LM STUDIO SERVER] Processing queued request...
[2024-04-19 16:25:21.197] [INFO] Received POST request to /v1/chat/completions with body: {
  "messages": [
    {
      "content": "You are a helpful AI assistant",
      "role": "system"
    },
    {
      "content": "what is the federal capital of australia",
      "role": "user"
    }
  ],
  "model": "gemma",
  "max_tokens": null,
  "stream": false,
  "temperature": 0.1
}
[2024-04-19 16:25:21.198] [INFO] [LM STUDIO SERVER] Context Overflow Policy is: Rolling Window
[2024-04-19 16:25:21.201] [INFO] [LM STUDIO SERVER] Last message: { role: 'user', content: 'what is the federal capital of australia' } (total messages = 2)
[2024-04-19 16:25:21.696] [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
[2024-04-19 16:25:21.698] [INFO] [lmstudio-ai/gemma-2b-it-GGUF/gemma-2b-it-q8_0.gguf] Accumulated 1 tokens: The
[2024-04-19 16:25:21.799] [INFO] [lmstudio-ai/gemma-2b-it-GGUF/gemma-2b-it-q8_0.gguf] Accumulated 2 tokens: The Federal
[2024-04-19 16:25:21.911] [INFO] [LM STUDIO SERVER] [lmstudio-ai/gemma-2b-it-GGUF/gemma-2b-it-q8_0.gguf] Generated prediction: {
  "id": "chatcmpl-zbpbxj8o70qatvsd8fz9ri",
  "object": "chat.completion",
  "created": 1713507921,
  "model": "lmstudio-ai/gemma-2b-it-GGUF/gemma-2b-it-q8_0.gguf",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The Federal"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 2,
    "completion_tokens": 2,
    "total_tokens": 4
  }
}

Steps to reproduce

I can reproduce this consistently on my setup, which runs the latest LM Studio and Ollama. Any model served through the LM Studio server exhibits the problem; the same models served through Ollama do not.

Screenshots and logs

No response

Additional Information

autogenstudio v0.0.56
LM Studio 0.2.19 with lmstudio-ai/gemma-2b-it-GGUF/gemma-2b-it-q8_0.gguf
Ollama 0.1.32 with llama3:latest (8B)

@victordibia added the proj-studio label on Apr 19, 2024
@stigtk commented Apr 21, 2024

I have the exact same issue with LM Studio and AutoGen Studio.

@castlenthesky

I just encountered the same issue as well. My LM Studio logs are essentially identical to what you have shared.

@flynntmaverick commented Apr 23, 2024

Same issue here. I'm able to get agents communicating and writing Python, but AutoGen Studio gets stuck after 2 tokens. Rolling back to v0.0.53 works.

@brustulim

If anyone is interested in a work-around until the fix lands, you can update the file
autogenstudio\datamodel.py
by replacing line 95 from:
max_tokens: Optional[int] = None
to:
max_tokens: Optional[int] = 3000

Then create a new model (configured exactly like your previous one) and replace the old one in your agents, workflows, etc. A sketch of the patched field in context follows below.

@ekzhu (Collaborator) commented Apr 25, 2024

cc @victordibia for awareness

@victordibia (Collaborator) commented Apr 25, 2024

@ALL, thanks for flagging this.

This has been fixed in the current dev branch for AGS (pr here).
In addition to setting a default value for max_tokens, we now expose the max_tokens parameter in the UI so users can specify it themselves. You can try this branch out today by running pip install autogenstudio==0.0.56rc3.

Should be merged into main soon.

[Screenshot: the max_tokens field exposed in the AutoGen Studio model configuration UI]

@Freffles (Author)

@victordibia Thanks!

@Freffles (Author) commented Apr 25, 2024

"replacing line 95 from: max_tokens: Optional[int] = None to: max_tokens: Optional[int] = 3000"

@brustulim Thanks for the idea. I was going to try something like that, but I wasn't sure it was the right solution.

@victordibia (Collaborator)

@godblessme

You seem to have mentioned having issues with v0.0.56rc3 and LM Studio in a deleted comment ... can you verify this is the case?

@GodBlessingMe

"You seem to have mentioned having issues with v0.0.56rc3 and LM Studio in a deleted comment ... can you verify this is the case?"

@victordibia Thank you for providing the v0.0.56rc3 version. It fixes my problem with the two-token limit when using LM Studio. However, the Python code generated by the primary-assistant cannot be run. Could you kindly help me figure out what happened? I have attached the full conversation between the userproxy and the primary-assistant. Thank you very much.
[Screenshots: eight screenshots of the userproxy / primary-assistant conversation showing the generated code failing to run]

@GodBlessingMe

@victordibia Thank you. The issue was solved.

@MaxAkbar

I'm using .NET code with LM Studio and it always fails on the second call. It works as expected when maxTokens is specified during agent creation.
