
[Roadmap] Streaming support #217

Closed
2 of 4 tasks
sonichi opened this issue Oct 12, 2023 · 20 comments · Fixed by #1551
Labels: enhancement (New feature or request), help wanted (Extra attention is needed), in-progress (Roadmap is actively being worked on), roadmap (Issues related to roadmap of AutoGen)

Comments

@sonichi
Contributor

sonichi commented Oct 12, 2023

Stream messages from the agent's reply.

Tasks

  1. ui/deploy
  2. streaming
@sonichi sonichi added enhancement New feature or request roadmap Issues related to roadmap of AutoGen labels Oct 12, 2023
@sonichi sonichi added the help wanted Extra attention is needed label Oct 22, 2023
@sonichi
Contributor Author

sonichi commented Oct 22, 2023

No one is working on this issue as far as I know. Any volunteer to take the lead?

@maxim-saplin
Collaborator

maxim-saplin commented Oct 23, 2023

A bit of criticism... Is streaming needed?

AutoGen doesn't seem to be a user-facing tool where UI/UX is a top priority, while streaming mainly addresses user-experience concerns (interactivity and visible progress). After all, the performance (total time to complete the request) is the same whether or not streaming is enabled.

On the other hand, implementing streaming looks like a large piece of work that affects many parts and will be hard to maintain:

  1. Working with chunks asynchronously might require creating two flavours of every API that uses completions.
  2. Cost accounting will break right away: streaming APIs (at least OpenAI's) don't return token stats for streamed responses. OpenAI suggests using tiktoken to do your own accounting, but in my experience tiktoken always shows a ~1% discrepancy with what OpenAI returns, so costing will become less accurate (see the sketch below).

IMO, it is a Large piece of work for a Small value (speaking in terms of S-M-L sizing and prioritizing).
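
For concreteness, here is a minimal sketch of the kind of client-side accounting tiktoken makes possible, assuming the post-1.0 openai SDK; the model name is illustrative, and as noted above the count will drift ~1% from OpenAI's own numbers:

import tiktoken
from openai import OpenAI

client = OpenAI()
enc = tiktoken.encoding_for_model("gpt-4")  # encoding for the target model

chunks = []
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,  # streamed responses carry no usage stats
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        chunks.append(delta)

completion_text = "".join(chunks)
# Approximate; expect a small discrepancy vs. what OpenAI bills.
print(f"~{len(enc.encode(completion_text))} completion tokens")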

@ragyabraham
Collaborator

@sonichi I'm actually working on this already. Happy to pick this up.

@sonichi
Contributor Author

sonichi commented Oct 23, 2023

Thanks. One thing to pay attention to is #203. If your work uses the streaming feature from openai, it's better to target the newer version.
I'm currently working on #203 without the streaming part.

@ragyabraham
Collaborator

@sonichi I've also done some work to enable messages to be sent over sockets rather than printing in a terminal. Do you think that is something that I should open a PR for? Would that be useful?

@sonichi
Contributor Author

sonichi commented Oct 24, 2023

That sounds useful, because I've heard of many different people trying to do that. @victordibia @AaronWard may be interested; feel free to pull in others who are also interested.

@ragyabraham
Collaborator

ragyabraham commented Oct 24, 2023

OK, sounds good. I'll start a new issue and push the changes I have so far. @victordibia @AaronWard please refer to #394.

@victordibia
Collaborator

victordibia commented Oct 24, 2023

I replied on #394.
One thing to note here is that the work by @ragyabraham is more focused on streaming completed responses from each agent within an active conversation, not on directly streaming the tokens from each agent as they are generated by an LLM. The latter is more involved and has unclear use cases/benefits (as mentioned by @maxim-saplin above).
@ragyabraham, could you kindly confirm that this is your focus here?

@ragyabraham
Collaborator

Hi @victordibia, I intend to utilise streaming to chunk all responses from the LLM. The approach we are thinking of is:

  1. We chunk the response and utilise some sort of messaging framework to emit messages to a server (e.g. socketio, which sends messages to the FE).
  2. Chunks are aggregated in memory (e.g. string += chunk).
  3. Once all chunks have been consumed, the complete message can be sent to the intended team member/recipient.

Please let me know what you think.
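
A rough sketch of this flow, assuming a python-socketio server and OpenAI-style chunk objects; the sio instance and the llm_chunk/llm_message event names are illustrative, not part of AutoGen:

import socketio

sio = socketio.Server()  # assumed to be attached to a WSGI/ASGI app elsewhere

def relay_streamed_reply(stream):
    # Step 1: emit each chunk to the frontend as it arrives.
    # Step 2: aggregate the chunks in memory.
    # Step 3: return the complete message for the intended recipient.
    buffer = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            sio.emit("llm_chunk", {"content": delta})
            buffer.append(delta)
    full_message = "".join(buffer)
    sio.emit("llm_message", {"content": full_message})
    return full_message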

@Alvaromah
Collaborator

Hi!

I've just created PR #465 to introduce streaming support in a straightforward and non-intrusive manner.

Usage:

llm_config = {
    "config_list": config_list,
    # Enable/disable streaming (defaults to False)
    "stream": True,
}

assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
)

Please feel free to review the code and make suggestions.

@lianghsun

Hi @Alvaromah, thank you for your contribution, which has enabled autogen to stream in the terminal. However, is there a way to also support streaming to an external output? When autogen is integrated with other UI frameworks, it would be desirable to see a streaming effect there too. I've tried modifying some parts of the source code to use 'yield', but it doesn't seem to have made any difference. Thank you for your help. 😀

@ragyabraham
Collaborator

You'll need to use websockets.
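
For instance, a minimal sketch using the third-party websockets package; generate_chunks is a hypothetical stand-in for whatever yields the LLM deltas:

import asyncio
import json
import websockets

async def generate_chunks():
    # Placeholder for the real LLM delta stream.
    for token in ["Hello", ", ", "world", "!"]:
        await asyncio.sleep(0.1)
        yield token

async def stream_to_client(websocket):
    # Forward each delta to the connected UI client as it is produced.
    async for delta in generate_chunks():
        await websocket.send(json.dumps({"type": "chunk", "content": delta}))
    await websocket.send(json.dumps({"type": "done"}))

async def main():
    # Requires a recent websockets version (single-argument handler).
    async with websockets.serve(stream_to_client, "localhost", 8765):
        await asyncio.Future()  # serve until cancelled

asyncio.run(main())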

@Joaoprcf

Joaoprcf commented Dec 21, 2023

> Hi! I've just created PR #465 to introduce streaming support in a straightforward and non-intrusive manner. [...] Please feel free to review the code and make suggestions.

Instead of just stream: True, maybe also allow stream to be a callable, where True would point to sys.stdout.write by default. It is a small change that would make a big difference.
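
A sketch of how the consuming code might normalize that option; this is purely illustrative of the proposal, not AutoGen's actual API:

import sys
from typing import Callable, Optional, Union

def resolve_stream(stream: Union[bool, Callable[[str], None]]) -> Optional[Callable[[str], None]]:
    # True -> write chunks to stdout; False/None -> no streaming;
    # a callable -> use it directly as the chunk sink.
    if stream is True:
        return sys.stdout.write
    if not stream:
        return None
    return stream

# Each arriving chunk is passed to the resolved sink.
sink = resolve_stream(True)
for chunk in ["Hel", "lo", "!\n"]:
    if sink:
        sink(chunk)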

@sonichi
Contributor Author

sonichi commented Jan 1, 2024

@Joaoprcf Good suggestion! Please feel free to make a PR and add @Alvaromah @ragyabraham as a reviewer.

@bitnom
Contributor

bitnom commented Jan 2, 2024

> I've just created PR #465 to introduce streaming support in a straightforward and non-intrusive manner. [...]
>
> Instead of just stream: True, maybe also allow stream to be a callable, where True would point to sys.stdout.write by default. It is a small change that would make a big difference.

I think it's best to add a new parameter. Let's not introduce a weird boolean that isn't a boolean. What we should probably have is:

llm_config = {
    "config_list": config_list,
    "stream": True,
    "response_callback": my_cb_func,
}

which would fire for both stream: True (chunks) and stream: False (full message). I don't think there's a need to separate them, since multiple chunks give you the full message anyway.

Note: we must ensure that the finish reason (stop_reason, or whatever the response model calls it) is always passed to the callback.
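
Illustratively, the contract could look like the sketch below; the signature and the "stop" value are hypothetical, not an agreed API:

def my_cb_func(content, finish_reason=None):
    # While streaming, fires per chunk with finish_reason=None, then once
    # at the end carrying the finish reason; with stream: False it fires
    # a single time with the full message and the finish reason.
    if finish_reason is None:
        print(content, end="", flush=True)
    else:
        print(f"\n[done: {finish_reason}]")

my_cb_func("Hello, ")
my_cb_func("world!")
my_cb_func("", finish_reason="stop")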

Also, see #1118 about function streams.

@matbee-eth

Any idea when these PRs can land?

@sonichi
Contributor Author

sonichi commented Feb 11, 2024

@matbee-eth if you'd like to help accelerate it, please participate in #1551

@jackgerrits jackgerrits assigned ekzhu and davorrunje and unassigned ragyabraham Mar 18, 2024
@jackgerrits jackgerrits added the in-progress Roadmap is actively being worked on label Mar 18, 2024
@jackgerrits jackgerrits changed the title streaming support [Roadmap] Streaming support Mar 18, 2024
@vinodvarma24

It would be great to have streaming enabled so that, for end-user production applications, the UX will be better.

@vinodvarma24

> A bit of criticism... Is streaming needed? [...] IMO, it is a Large piece of work for a Small value (speaking in terms of S-M-L sizing and prioritizing).

It's a big value for end users of the agents.

@davorrunje
Collaborator

> It would be great to have streaming enabled so that, for end-user production applications, the UX will be better.

@vinodvarma24 Streaming via websockets is implemented in #1551; please take a look at it and let us know what you think.
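
For reference, a minimal sketch of the websocket IO that #1551 adds, based on the IOWebsockets interface in pyautogen 0.2; the port and handler body are illustrative, so check the PR and docs for the authoritative API:

from autogen.io.websockets import IOWebsockets

def on_connect(iostream: IOWebsockets) -> None:
    # Agent output produced during this connection is forwarded over the
    # websocket, so a UI client sees messages as they are generated.
    initial_msg = iostream.input()  # first message sent by the client
    print(f"Received: {initial_msg}")

with IOWebsockets.run_server_in_thread(on_connect=on_connect, port=8765) as uri:
    print(f"Websocket server listening on {uri}")
    # ...connect a frontend client to uri and run the chat...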
