From 8bfcb2bff13384877ec6734f6d5ddcb1a028246d Mon Sep 17 00:00:00 2001
From: Mark Sze
Date: Thu, 9 May 2024 19:32:17 +0000
Subject: [PATCH] Removed mdx file.

---
 .../topics/groupchat/resuming_groupchat.mdx | 274 ------------------
 1 file changed, 274 deletions(-)
 delete mode 100644 website/docs/topics/groupchat/resuming_groupchat.mdx

diff --git a/website/docs/topics/groupchat/resuming_groupchat.mdx b/website/docs/topics/groupchat/resuming_groupchat.mdx
deleted file mode 100644
index 4ee258c3248..00000000000
--- a/website/docs/topics/groupchat/resuming_groupchat.mdx
+++ /dev/null
@@ -1,274 +0,0 @@
----
-custom_edit_url: https://github.com/microsoft/autogen/edit/main/website/docs/topics/groupchat/resuming_groupchat.ipynb
-description: Custom Speaker Selection Function
-source_notebook: /website/docs/topics/groupchat/resuming_groupchat.ipynb
-tags:
-- orchestration
-- group chat
-title: Resuming a GroupChat
----
-# Resuming a GroupChat
-[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/microsoft/autogen/blob/main/website/docs/topics/groupchat/resuming_groupchat.ipynb)
-[![Open on GitHub](https://img.shields.io/badge/Open%20on%20GitHub-grey?logo=github)](https://github.com/microsoft/autogen/blob/main/website/docs/topics/groupchat/resuming_groupchat.ipynb)
-
-
-In GroupChat, we can resume a previous group chat by passing the
-messages from that conversation to the GroupChatManager’s `resume_chat`
-function (or `a_resume_chat` for asynchronous workflows).
-
-To resume, the agents, GroupChat, and GroupChatManager objects must
-exist and match the original group chat.
-
-The messages can be passed in as a JSON string or a `List[Dict]`,
-typically from the original GroupChat’s `messages` value. Use the
-GroupChatManager’s `messages_to_string` function to retrieve a JSON
-string from a groupchat that can be used for resuming:
-
-```text
-# Save chat messages for resuming later on
-messages_json = mygroupchatmanager.messages_to_string()
-```
-
-An example of the JSON string:
-
-```json
-[{"content": "Find the latest paper about gpt-4 on arxiv and find its potential applications in software.", "role": "user", "name": "Admin"}, {"content": "Plan:\n1. **Engineer**: Search for the latest paper on GPT-4 on arXiv.\n2. **Scientist**: Read the paper and summarize the key findings and potential applications of GPT-4.\n3. **Engineer**: Identify potential software applications where GPT-4 can be utilized based on the scientist's summary.\n4. **Scientist**: Provide insights on the feasibility and impact of implementing GPT-4 in the identified software applications.\n5. **Engineer**: Develop a prototype or proof of concept to demonstrate how GPT-4 can be integrated into the selected software application.\n6. **Scientist**: Evaluate the prototype, provide feedback, and suggest any improvements or modifications.\n7. **Engineer**: Make necessary revisions based on the scientist's feedback and finalize the integration of GPT-4 into the software application.\n8. **Admin**: Review the final software application with GPT-4 integration and approve for further development or implementation.\n\nFeedback from admin and critic is needed for further refinement of the plan.", "role": "user", "name": "Planner"}, {"content": "Agree", "role": "user", "name": "Admin"}, {"content": "Great! Let's proceed with the plan outlined earlier. I will start by searching for the latest paper on GPT-4 on arXiv. Once I find the paper, the scientist will summarize the key findings and potential applications of GPT-4. We will then proceed with the rest of the steps as outlined. I will keep you updated on our progress.", "role": "user", "name": "Planner"}]
-```
-
-Under the hood, the `ConversableAgent.initiate_chat` method is called
-with many of the parameters of `resume_chat` being passed through. These
-include `silent`, `max_turns`, `summary_method`, and `summary_args`. In
-line with that, `resume_chat` returns a `ChatResult`.
-
-When resuming, the messages will be validated against the groupchat’s
-agents to make sure that the messages can be assigned to them. Messages
-will be allocated to the agents and then the last speaker and message
-will be used to resume the group chat.
-
-### Example of resuming a GroupChat
-
-Start with the LLM config. This can differ from the original group chat.
-
-```python
-import os
-
-import autogen
-
-# Put your api key in the environment variable OPENAI_API_KEY
-config_list = [
-    {
-        "model": "gpt-4-0125-preview",
-        "api_key": os.environ["OPENAI_API_KEY"],
-    }
-]
-
-gpt4_config = {
-    "cache_seed": 42,  # change the cache_seed for different trials
-    "temperature": 0,
-    "config_list": config_list,
-    "timeout": 120,
-}
-```
-
-Create the group chat objects, they should have the same `name` as the
-original group chat.
-
-```python
-# Create Agents, GroupChat, and GroupChatManager in line with the original group chat
-
-planner = autogen.AssistantAgent(
-    name="Planner",
-    system_message="""Planner. Suggest a plan. Revise the plan based on feedback from admin and critic, until admin approval.
-The plan may involve an engineer who can write code and a scientist who doesn't write code.
-Explain the plan first. Be clear which step is performed by an engineer, and which step is performed by a scientist.
-""",
-    llm_config=gpt4_config,
-)
-
-user_proxy = autogen.UserProxyAgent(
-    name="Admin",
-    system_message="A human admin. Interact with the planner to discuss the plan. Plan execution needs to be approved by this admin.",
-    code_execution_config=False,
-)
-
-engineer = autogen.AssistantAgent(
-    name="Engineer",
-    llm_config=gpt4_config,
-    system_message="""Engineer. You follow an approved plan. You write python/shell code to solve tasks. Wrap the code in a code block that specifies the script type. The user can't modify your code. So do not suggest incomplete code which requires others to modify. Don't use a code block if it's not intended to be executed by the executor.
-Don't include multiple code blocks in one response. Do not ask others to copy and paste the result. Check the execution result returned by the executor.
-If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.
-""",
-)
-scientist = autogen.AssistantAgent(
-    name="Scientist",
-    llm_config=gpt4_config,
-    system_message="""Scientist. You follow an approved plan. You are able to categorize papers after seeing their abstracts printed. You don't write code.""",
-)
-
-executor = autogen.UserProxyAgent(
-    name="Executor",
-    system_message="Executor. Execute the code written by the engineer and report the result.",
-    human_input_mode="NEVER",
-    code_execution_config={
-        "last_n_messages": 3,
-        "work_dir": "paper",
-        "use_docker": False,
-    },  # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
-)
-
-groupchat = autogen.GroupChat(
-    agents=[user_proxy, engineer, scientist, planner, executor],
-    messages=[],
-    max_round=10,
-)
-
-manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=gpt4_config)
-```
-
-Load the previous messages (from a JSON string or messages `List[Dict]`)
-
-```python
-# Messages in a JSON string
-previous_state = r"""[{"content": "Find the latest paper about gpt-4 on arxiv and find its potential applications in software.", "role": "user", "name": "Admin"}, {"content": "Plan:\n1. **Engineer**: Search for the latest paper on GPT-4 on arXiv.\n2. **Scientist**: Read the paper and summarize the key findings and potential applications of GPT-4.\n3. **Engineer**: Identify potential software applications where GPT-4 can be utilized based on the scientist's summary.\n4. **Scientist**: Provide insights on the feasibility and impact of implementing GPT-4 in the identified software applications.\n5. **Engineer**: Develop a prototype or proof of concept to demonstrate how GPT-4 can be integrated into the selected software application.\n6. **Scientist**: Evaluate the prototype, provide feedback, and suggest any improvements or modifications.\n7. **Engineer**: Make necessary revisions based on the scientist's feedback and finalize the integration of GPT-4 into the software application.\n8. **Admin**: Review the final software application with GPT-4 integration and approve for further development or implementation.\n\nFeedback from admin and critic is needed for further refinement of the plan.", "role": "user", "name": "Planner"}, {"content": "Agree", "role": "user", "name": "Admin"}, {"content": "Great! Let's proceed with the plan outlined earlier. I will start by searching for the latest paper on GPT-4 on arXiv. Once I find the paper, the scientist will summarize the key findings and potential applications of GPT-4. We will then proceed with the rest of the steps as outlined. I will keep you updated on our progress.", "role": "user", "name": "Planner"}]"""
-```
-
-Resume the group chat through the manager.
-
-```python
-result = manager.resume_chat(messages=previous_state)
-```
-
-```` text
-Resuming chat...
-Last speaker is Planner - conversation will resume with their last message.
-Planner (to chat_manager):
-
-Great! Let's proceed with the plan outlined earlier. I will start by searching for the latest paper on GPT-4 on arXiv. Once I find the paper, the scientist will summarize the key findings and potential applications of GPT-4. We will then proceed with the rest of the steps as outlined. I will keep you updated on our progress.
-
---------------------------------------------------------------------------------- 
-Engineer (to chat_manager):
-
-```python
-import requests
-from bs4 import BeautifulSoup
-
-# Define the URL for the arXiv search
-url = "https://arxiv.org/search/?query=GPT-4&searchtype=all&source=header"
-
-# Send a GET request to the URL
-response = requests.get(url)
-
-# Parse the HTML content of the page
-soup = BeautifulSoup(response.content, 'html.parser')
-
-# Find the first paper related to GPT-4
-paper = soup.find('li', class_='arxiv-result')
-if paper:
-    title = paper.find('p', class_='title').text.strip()
-    authors = paper.find('p', class_='authors').text.strip()
-    abstract = paper.find('p', class_='abstract').text.strip().replace('\n', ' ')
-    link = paper.find('p', class_='list-title').find('a')['href']
-    print(f"Title: {title}\nAuthors: {authors}\nAbstract: {abstract}\nLink: {link}")
-else:
-    print("No GPT-4 papers found on arXiv.")
-```
-This script searches for the latest paper on GPT-4 on arXiv, extracts the title, authors, abstract, and link to the paper, and prints this information.
-
---------------------------------------------------------------------------------- 
-
->>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
-Executor (to chat_manager):
-
-exitcode: 0 (execution succeeded)
-Code output:
-Title: NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts
-Authors: Authors:
-Shudan Zhang,
-
- Hanlin Zhao,
-
- Xiao Liu,
-
- Qinkai Zheng,
-
- Zehan Qi,
-
- Xiaotao Gu,
-
- Xiaohan Zhang,
-
- Yuxiao Dong,
-
- Jie Tang
-Abstract: Abstract: …we also introduce a semi-automated pipeline to enhance the efficiency of test case construction. Comparing with manual solutions, it achieves an efficiency increase of more than 4 times. Our systematic experiments on 39 LLMs find that performance gaps on NCB between models with close HumanEval scores could still be significant, indicating a lack of focus on… ▽ More Large language models (LLMs) have manifested strong ability to generate codes for productive activities. However, current benchmarks for code synthesis, such as HumanEval, MBPP, and DS-1000, are predominantly oriented towards introductory tasks on algorithm and data science, insufficiently satisfying challenging requirements prevalent in real-world coding. To fill this gap, we propose NaturalCodeBench (NCB), a challenging code benchmark designed to mirror the complexity and variety of scenarios in real coding tasks. NCB comprises 402 high-quality problems in Python and Java, meticulously selected from natural user queries from online coding services, covering 6 different domains. Noting the extraordinary difficulty in creating testing cases for real-world queries, we also introduce a semi-automated pipeline to enhance the efficiency of test case construction. Comparing with manual solutions, it achieves an efficiency increase of more than 4 times. Our systematic experiments on 39 LLMs find that performance gaps on NCB between models with close HumanEval scores could still be significant, indicating a lack of focus on practical code synthesis scenarios or over-specified optimization on HumanEval. On the other hand, even the best-performing GPT-4 is still far from satisfying on NCB. The evaluation toolkit and development set are available at https://github.com/THUDM/NaturalCodeBench. △ Less
-Link: https://arxiv.org/abs/2405.04520
-
-
---------------------------------------------------------------------------------- 
-Scientist (to chat_manager):
-
-The latest paper related to GPT-4 on arXiv is titled "NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts" by Shudan Zhang, Hanlin Zhao, Xiao Liu, Qinkai Zheng, Zehan Qi, Xiaotao Gu, Xiaohan Zhang, Yuxiao Dong, and Jie Tang. The abstract discusses the introduction of NaturalCodeBench (NCB), a challenging code benchmark designed to mirror the complexity and variety of scenarios in real coding tasks. It comprises 402 high-quality problems in Python and Java, selected from natural user queries from online coding services, covering 6 different domains. The paper highlights the performance gaps on NCB between models with close HumanEval scores, indicating a lack of focus on practical code synthesis scenarios or over-specified optimization on HumanEval. It also notes that even the best-performing GPT-4 is still far from satisfying on NCB, suggesting room for improvement in real-world coding applications.
-
-Potential applications in software based on this summary could include:
-1. **Automated Code Generation**: Enhancing IDEs (Integrated Development Environments) with GPT-4 to provide real-time coding assistance, code completion, and bug fixing suggestions based on real-world coding scenarios.
-2. **Code Review and Optimization**: Integrating GPT-4 into code review tools to suggest optimizations and improvements by understanding the context and complexity of the code better.
-3. **Educational Tools**: Developing educational software that uses GPT-4 to create more complex and real-world relevant coding exercises and challenges for learners.
-4. **Automated Testing and Debugging**: Utilizing GPT-4 to generate test cases for software projects automatically, especially for complex real-world scenarios that are hard to cover manually.
-
-These applications could significantly impact software development by improving efficiency, code quality, and learning outcomes for developers.
-
---------------------------------------------------------------------------------- 
-Engineer (to chat_manager):
-
-Based on the scientist's summary and identified potential applications of GPT-4 in software, we can see that GPT-4's capabilities can be leveraged to enhance various aspects of software development and education. The key areas where GPT-4 can make a significant impact include automated code generation, code review and optimization, educational tools, and automated testing and debugging.
-
-Given the current state of GPT-4 as highlighted in the paper, where it still falls short in satisfying the complex requirements of real-world coding tasks as per the NaturalCodeBench (NCB) benchmark, it's clear that integrating GPT-4 into software applications requires careful consideration of its limitations and strengths. The potential for GPT-4 to improve efficiency, code quality, and learning outcomes is substantial, but it also necessitates ongoing evaluation and refinement to ensure that the applications remain relevant and effective in real-world scenarios.
-
-For instance, in automated code generation and assistance within IDEs, GPT-4 could be used to suggest code snippets, complete code blocks, or even generate entire functions based on brief descriptions or comments. However, developers would need to review these suggestions carefully for accuracy and efficiency, especially in complex or critical applications.
-
-In code review and optimization, GPT-4 could help identify potential improvements or optimizations in code, but human reviewers should verify these suggestions to ensure they align with project goals and coding standards.
-
-Educational tools powered by GPT-4 could offer advanced coding challenges and personalized learning experiences, but they should be designed to encourage understanding and not just mimicry of solutions.
-
-Finally, in automated testing and debugging, GPT-4's ability to generate test cases could significantly reduce manual effort, but the quality and coverage of these tests would need to be assessed to ensure they effectively catch bugs and issues.
-
-Overall, the integration of GPT-4 into software applications offers exciting possibilities, but it also requires a balanced approach that leverages the model's strengths while mitigating its limitations through careful design, human oversight, and continuous feedback and improvement.
-
---------------------------------------------------------------------------------- 
-Admin (to chat_manager):
-
-Approve
-
---------------------------------------------------------------------------------- 
-Engineer (to chat_manager):
-
-With the approval from the admin, the plan to explore and integrate GPT-4 into software applications, considering its potential and limitations, is set to move forward. This involves developing prototypes or proof of concepts for the identified applications, such as enhancing IDEs with real-time coding assistance, integrating GPT-4 into code review tools, creating educational software with complex coding exercises, and automating test case generation for software projects.
-
-The next steps include detailed planning and execution of these prototypes, ensuring they leverage GPT-4 effectively while addressing any challenges that arise during development. Continuous evaluation and refinement based on feedback will be crucial to achieving successful integration and realizing the benefits of GPT-4 in software development and education.
-
-This initiative represents an exciting opportunity to push the boundaries of what's possible with AI in software, aiming to improve efficiency, code quality, and learning outcomes for developers and students alike.
-
---------------------------------------------------------------------------------- 
-````
-
-```python
-# Output the final chat history showing the original 4 messages and resumed messages
-for i, message in enumerate(groupchat.messages):
-    print(
-        f"#{i + 1}, {message['name']}: {message['content'][:80]}".replace("\n", " "),
-        f"{'...' if len(message['content']) > 80 else ''}".replace("\n", " "),
-    )
-```
-
-``` text
-#1, Admin: Find the latest paper about gpt-4 on arxiv and find its potential applications i ...
-#2, Planner: Plan: 1. **Engineer**: Search for the latest paper on GPT-4 on arXiv. 2. **Scien ...
-#3, Admin: Agree
-#4, Planner: Great! Let's proceed with the plan outlined earlier. I will start by searching f ...
-#5, Engineer: ```python import requests from bs4 import BeautifulSoup # Define the URL for th ...
-#6, Executor: exitcode: 0 (execution succeeded) Code output: Title: NaturalCodeBench: Examini ...
-#7, Scientist: The latest paper related to GPT-4 on arXiv is titled "NaturalCodeBench: Examinin ...
-#8, Engineer: Based on the scientist's summary and identified potential applications of GPT-4 ...
-#9, Admin: Approve
-#10, Engineer: With the approval from the admin, the plan to explore and integrate GPT-4 into s ...
-```
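
Reviewer note on the removed page: it says that resumed messages are validated against the group chat's agents and that the last speaker and message are then used to resume. The sketch below illustrates that idea in plain Python, without AutoGen installed. It is not the library's implementation: `load_messages` and `check_resumable` are hypothetical helpers introduced here, and the only assumption taken from the page is that each message is a dict carrying a `name` key that should match an agent's `name`.

```python
import json
from typing import Dict, Iterable, List, Union


def load_messages(messages: Union[str, List[Dict]]) -> List[Dict]:
    """Accept either a JSON string or an already-parsed list of message dicts."""
    if isinstance(messages, str):
        messages = json.loads(messages)
    if not isinstance(messages, list):
        raise ValueError("messages must be a JSON array or a list of dicts")
    return messages


def check_resumable(messages: Union[str, List[Dict]], agent_names: Iterable[str]) -> str:
    """Hypothetical pre-flight check: return the last speaker's name.

    Raises ValueError if the history is empty or if any message names a
    speaker that is not among the given agent names.
    """
    parsed = load_messages(messages)
    if not parsed:
        raise ValueError("cannot resume from an empty message list")
    known = set(agent_names)
    for index, message in enumerate(parsed):
        name = message.get("name")
        if name not in known:
            raise ValueError(f"message #{index + 1} has unknown speaker {name!r}")
    return parsed[-1]["name"]


# Shortened stand-in for the saved state shown in the page.
saved_state = json.dumps(
    [
        {"content": "Find the latest paper about gpt-4 on arxiv.", "role": "user", "name": "Admin"},
        {"content": "Plan: ...", "role": "user", "name": "Planner"},
        {"content": "Agree", "role": "user", "name": "Admin"},
        {"content": "Great! Let's proceed with the plan.", "role": "user", "name": "Planner"},
    ]
)
agent_names = ["Admin", "Engineer", "Scientist", "Planner", "Executor"]
last_speaker = check_resumable(saved_state, agent_names)
print(f"Last speaker is {last_speaker}")
```

Running a check like this before calling `manager.resume_chat(messages=...)` would surface a renamed or missing agent early, which is the situation the page's requirement that agents keep the same `name` is meant to avoid.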