add math-class group chat test #309
Conversation
Codecov Report

```
@@            Coverage Diff             @@
##             main     #309      +/-   ##
==========================================
+ Coverage   28.70%   37.00%    +8.29%
==========================================
  Files          27       27
  Lines        3383     3389        +6
  Branches      760      762        +2
==========================================
+ Hits          971     1254      +283
+ Misses       2341     2018      -323
- Partials       71      117       +46
```

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Looks good structure-wise.
- speaker selection should work under a continuous q&a scenario among two agents and GPT 3.5 model.
- admin should end the class when teacher has created 3 questions.
"""
skip_if_openai_not_available()
Nice. This is better than my old way. @rickyloynd-microsoft @thinkall @kevin666aa FYI.
If you know which users could benefit from this PR, could you ask them to review it?
],
}

def terminate_group_chat(message):
@afourney This shows how to achieve a more robust termination strategy via function_call; could you review it?
@LittleLittleCloud I am having trouble understanding why this strategy is more robust.
Suggested strategy: one of the agents calls a group chat termination function, the user proxy executes it, and the manager detects the termination string.
Old strategy: one of the agents generates a termination string and the manager detects it.
If an agent is smart enough for the suggested strategy, shouldn't it also be able to handle the old strategy? Sorry if I missed the argument for the increased robustness.
"If an agent is smart enough for the suggested strategy, shouldn't it also be able to handle the old strategy?"

Yes! Unfortunately, that's not the case in the real world, especially for gpt-3.5-turbo. For some fairly obvious reasons I still use gpt-3.5-turbo most frequently, and gpt-3.5-turbo agents are not very robust at generating correct termination messages. Even gpt-4 agents might fail to produce the correct termination word once the conversation grows.

The new strategy (using a termination function_call) makes sure the group chat terminates correctly when that termination function_call gets triggered. It also works well with gpt-3.5-turbo, which is fine-tuned for function_call.
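For readers who want the concrete shape of this strategy, here is a minimal sketch of termination via function_call, assuming the autogen GroupChat API of this era; the agent names, marker string, and config values are illustrative assumptions rather than the exact code in this PR:

```python
from autogen import AssistantAgent, GroupChat, GroupChatManager, UserProxyAgent, config_list_from_json

TERMINATION_PREFIX = "[GROUPCHAT_TERMINATE]"  # assumed marker, not the PR's exact string


def terminate_group_chat(message):
    # An agent requests this via function_call; the user proxy executes it, and the
    # returned content carries the marker that is_termination_msg looks for.
    return f"{TERMINATION_PREFIX} {message}"


def is_termination(msg):
    return TERMINATION_PREFIX in (msg.get("content") or "")


config_list = config_list_from_json("OAI_CONFIG_LIST")  # assumed config file name
llm_config = {"config_list": config_list}
llm_config_with_function = {
    "config_list": config_list,
    "functions": [
        {
            "name": "terminate_group_chat",
            "description": "Terminate the group chat.",
            "parameters": {
                "type": "object",
                "properties": {"message": {"type": "string", "description": "Why the chat ends."}},
                "required": ["message"],
            },
        }
    ],
}

teacher = AssistantAgent(
    "teacher",
    system_message="Create 3 math questions, then call terminate_group_chat.",
    llm_config=llm_config_with_function,
)
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",
    code_execution_config=False,
    function_map={"terminate_group_chat": terminate_group_chat},
    is_termination_msg=is_termination,
)

groupchat = GroupChat(agents=[teacher, user_proxy], messages=[], max_round=20)
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config, is_termination_msg=is_termination)
```

Because the terminating content comes from executing the function rather than from free-form generation, the marker string is produced deterministically once the model decides to call the function.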
Got it -- the fact that function calls are prioritized makes it more robust. I added this comment to #525
Once we review and approve this PR, I think we can make progress on #525 and #517. (cc @victordibia @pcdeadeasy)
autogen/agentchat/groupchat.py (Outdated)

if (
    type(reply) == dict
    and self._is_termination_msg(reply)
    or type(reply) == str
    and self._is_termination_msg({"content": reply})
):
    break
@LittleLittleCloud, should these lines be after speaker.send below?
Also, I usually use isinstance() instead of type().
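For reference, here is the check from the diff above rewritten with isinstance(), as a sketch of that suggestion rather than the merged code:

```python
if (
    isinstance(reply, dict)
    and self._is_termination_msg(reply)
    or isinstance(reply, str)
    and self._is_termination_msg({"content": reply})
):
    break
```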
"@LittleLittleCloud, should these lines be after speaker.send below?"
Did this suggestion make sense?
nice catch
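A rough sketch of the reordering being discussed, with the loop body paraphrased rather than copied from groupchat.py: deliver the reply to the manager first, then run the termination check, so the terminating message still reaches the group.

```python
for i in range(groupchat.max_round):
    # pick the next speaker and let it generate a reply
    speaker = groupchat.select_speaker(last_speaker, self)
    reply = speaker.generate_reply(sender=self)
    if reply is None:
        break
    # send the reply back to the manager first ...
    speaker.send(reply, self, request_reply=False)
    # ... and only then check whether it asks to terminate the chat
    if self._is_termination_msg(reply if isinstance(reply, dict) else {"content": reply}):
        break
    last_speaker = speaker
```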
system_message="""You are a pre-school math teacher, you create 3 math questions for student to resolve.
Here's your workflow:
-workflow-
if question count > 3 say [COMPLETE].
Perhaps it would be better to make the if-else grammar consistent in your code.
Hi @LittleLittleCloud, do you have a plan to further update this PR? There are some conflicts in two files due to some recent updates in group chat.
@LittleLittleCloud bump so this PR doesn't get abandoned :D
Why are these changes needed?
This test is imported from #102.
After #294, it's possible to test the group chat result in a more robust way. By using function_call and adding a special prefix to messages, we can examine the conversation flow inside a group chat. This test also provides an example for #133, terminating a group chat or two-agent chat in a more controllable way.
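As a hedged illustration of that idea (building on the termination sketch earlier in this thread; the [QUESTION] prefix and the assertions are assumptions, not the test's exact strings):

```python
QUESTION_PREFIX = "[QUESTION]"  # assumed prefix the teacher agent puts on each question

user_proxy.initiate_chat(manager, message="Welcome to the math class, please start.")

# Every message stays in groupchat.messages, so the conversation flow can be inspected.
questions = [
    m for m in groupchat.messages
    if isinstance(m.get("content"), str) and m["content"].startswith(QUESTION_PREFIX)
]
assert len(questions) == 3, "the teacher should have created exactly 3 questions"
assert any(
    TERMINATION_PREFIX in (m.get("content") or "") for m in groupchat.messages
), "the group chat should have been terminated via terminate_group_chat"
```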
Related issue number
Checks