
[Feature Request]: Adding OpenAI Structured Outputs (json_schema) for agents #3442

Open
AhmedOmarYounusShahhat opened this issue Aug 28, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@AhmedOmarYounusShahhat

Is your feature request related to a problem? Please describe.

With complex JSON output, you cannot always be sure of the output's structure.

Describe the solution you'd like

OpenAI has a new feature, available only for the newer models, called Structured Outputs, which uses JSON Schema: you pass the JSON schema of the output you want in the API call.
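
For reference, this is roughly how the feature is called directly through the OpenAI Python SDK (adapted from OpenAI's Structured Outputs guide; assumes openai>=1.40 and a supporting model such as gpt-4o-2024-08-06):

```python
from pydantic import BaseModel
from openai import OpenAI


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]


client = OpenAI()

# parse() sends the Pydantic model as a JSON Schema via response_format and
# returns the reply already validated against it.
completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

event = completion.choices[0].message.parsed  # a CalendarEvent instance
```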

Additional context

No response

@AhmedOmarYounusShahhat AhmedOmarYounusShahhat added the enhancement New feature or request label Aug 28, 2024
@colaso96

+1 this would be an awesome feature to add

@r4881t
Copy link
Collaborator

r4881t commented Sep 5, 2024

This needs to be added for Gemini as well, since it also supports structured output.

@r4881t
Collaborator

r4881t commented Sep 5, 2024

This is how one might implement it:

  1. Update initiate_chat, a_initiate_chat, generate_reply, and a_generate_reply to accept an additional parameter called output_schema. output_schema should be an instance of a Pydantic BaseModel.
  2. Inside initiate_chat, _summarize_chat will be called, which may call _reflection_with_llm_as_summary, so output_schema needs to be passed to _reflection_with_llm_as_summary as well. That in turn passes it on to _reflection_with_llm.
  3. Finally, _generate_oai_reply_from_client will need to receive this parameter.

However, there may be a lot of back-and-forth between the agents, so how do we ensure that only the last message has structured output?
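
A rough sketch of what the caller-facing side of this could look like (output_schema is the proposed parameter and does not exist in autogen today; the rest follows the current API):

```python
from pydantic import BaseModel
from autogen import AssistantAgent, UserProxyAgent


class TicketSummary(BaseModel):
    title: str
    severity: str
    next_steps: list[str]


llm_config = {"config_list": [{"model": "gpt-4o-2024-08-06"}]}
assistant = AssistantAgent("assistant", llm_config=llm_config)
user = UserProxyAgent("user", human_input_mode="NEVER", code_execution_config=False)

# Proposed: output_schema would be threaded through _summarize_chat ->
# _reflection_with_llm_as_summary -> _reflection_with_llm ->
# _generate_oai_reply_from_client.
result = user.initiate_chat(
    assistant,
    message="Summarise the open incidents from the last sprint.",
    output_schema=TicketSummary,  # hypothetical parameter, does not exist yet
)
```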

@marklysze
Collaborator

Hey @AhmedOmarYounusShahhat and @r4881t, thanks for requesting this feature and for the thoughts on implementation.

@r4881t, I think your implementation idea is a good start, here are some thoughts:

  • I think we could leave generate_reply/a_generate_reply untouched and focus on initiate_chat/a_initiate_chat. initiate_chat is more typically used as the main entry and exit point for a conversation.
  • I'm wondering whether we can leave the summarisation as is and instead provide a further, optional step, which is to format the output. This could be triggered if an output_format parameter is set, which could be a Pydantic BaseModel class (as you noted) or a callable if you wanted to customise the output to something else (such as just a number, etc.).
  • ChatResult, which initiate_chat returns, could contain another attribute, such as formatted_output, which holds the formatted result.
  • How to create the structured output? I'm thinking this should be client-specific (as different providers/LLMs will handle this differently, or not at all and we'll have to convert it ourselves), so for OpenAI the OpenAI client class (OpenAIClient in client.py) would be used and we can follow their guide on how to use it. Perhaps we could add another function to the client class protocol that's specifically for generating a structured output format (or pass in an additional parameter to the create function). For non-OpenAI client classes we could attempt to do it using prompting and new code in the helper class, client_utils.py.
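
To make the above concrete, a minimal sketch of such a formatting end step (the format_chat_output helper and the output_format handling are illustrative only, not an existing autogen API):

```python
from typing import Callable, Type, Union

from openai import OpenAI
from pydantic import BaseModel


def format_chat_output(
    summary: str,
    output_format: Union[Type[BaseModel], Callable[[str], object]],
    client: OpenAI,
):
    """Convert the plain-text chat summary into the requested format."""
    if isinstance(output_format, type) and issubclass(output_format, BaseModel):
        # Client-specific path: ask the model to re-emit the summary as JSON
        # matching the schema (here via OpenAI's structured output support).
        completion = client.beta.chat.completions.parse(
            model="gpt-4o-2024-08-06",
            messages=[
                {"role": "system", "content": "Restate the text in the requested structure."},
                {"role": "user", "content": summary},
            ],
            response_format=output_format,
        )
        return completion.choices[0].message.parsed
    # Callable path: the caller supplies their own converter
    # (e.g. lambda s: float(s) to get just a number).
    return output_format(summary)
```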

Thoughts?

cc @Hk669

@r4881t
Collaborator

r4881t commented Sep 7, 2024

@marklysze - Great thoughts and I agree with most of them.

  1. I also included generate_reply/a_generate_reply because we use them in my use case as well, and I'd be interested in having structured output from those calls too.
  2. My question about creating structured output was more about the multiple rounds of communication. In a typical two-agent chat, I've noticed the flow is: LLM call -> tools picked up and executed -> answer generated -> some back-and-forth -> final answer. If we apply the output_format to every exchange, that may not always be what we want. So some prompting technique may be required so that only the final answer is in the output_format, while intermediate conversations between agents remain regular strings. This is easier to see in the context of Group Chat: the conversation between the various agents can happen internally as str, but the final response from initiate_chat should be structured output (see the sketch after this list).
  3. Currently Gemini and OpenAI support structured output, so we need to add it in both places.
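
To illustrate point 2, a hedged sketch of the group-chat case (output_format and ChatResult.formatted_output are the hypothetical additions discussed above; the rest follows the current autogen group-chat API):

```python
from pydantic import BaseModel
from autogen import AssistantAgent, GroupChat, GroupChatManager, UserProxyAgent


class ReleaseSummary(BaseModel):
    version: str
    highlights: list[str]


llm_config = {"config_list": [{"model": "gpt-4o-2024-08-06"}]}
researcher = AssistantAgent("researcher", llm_config=llm_config)
writer = AssistantAgent("writer", llm_config=llm_config)
user = UserProxyAgent("user", human_input_mode="NEVER", code_execution_config=False)

groupchat = GroupChat(agents=[user, researcher, writer], messages=[], max_round=8)
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)

# Agents exchange ordinary strings internally; only the final result returned
# by initiate_chat would be coerced into the schema.
result = user.initiate_chat(
    manager,
    message="Draft a release summary for v0.3.",
    output_format=ReleaseSummary,  # hypothetical parameter, final summary only
)
print(result.formatted_output)    # hypothetical ChatResult attribute
```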

@marklysze
Collaborator

Thanks @r4881t, noted on #1.

For #2, I think we're on the same page - an end step with the formatted output makes sense, and changing the internal communications would require a lot more work. With the end step, my thought was to leave summarisation as is, take that summarisation output, and then format it, rather than trying to do it in there.

For #3, for the LLM-level structured output, agreed, both places.

@r4881t
Collaborator

r4881t commented Sep 9, 2024

For 2, I think it's wise to add an extra step to convert into the specific structure - it reduces the complexity. I will start working on a PR for this. @marklysze
