migration guide #477

Merged: 4 commits, Oct 30, 2023

Changes from 2 commits
20 changes: 11 additions & 9 deletions README.md
@@ -28,11 +28,11 @@ AutoGen is a framework that enables the development of LLM applications using multiple agents

![AutoGen Overview](https://github.com/microsoft/autogen/blob/main/website/static/img/autogen_agentchat.png)

- AutoGen enables building next-gen LLM applications based on **multi-agent conversations** with minimal effort. It simplifies the orchestration, automation, and optimization of a complex LLM workflow. It maximizes the performance of LLM models and overcomes their weaknesses.
- It supports **diverse conversation patterns** for complex workflows. With customizable and conversable agents, developers can use AutoGen to build a wide range of conversation patterns concerning conversation autonomy,
- AutoGen enables building next-gen LLM applications based on [multi-agent conversations](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat) with minimal effort. It simplifies the orchestration, automation, and optimization of a complex LLM workflow. It maximizes the performance of LLM models and overcomes their weaknesses.
- It supports [diverse conversation patterns](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat#supporting-diverse-conversation-patterns) for complex workflows. With customizable and conversable agents, developers can use AutoGen to build a wide range of conversation patterns concerning conversation autonomy,
the number of agents, and agent conversation topology.
- It provides a collection of working systems with different complexities. These systems span a **wide range of applications** from various domains and complexities. This demonstrates how AutoGen can easily support diverse conversation patterns.
- AutoGen provides **enhanced LLM inference**. It offers easy performance tuning, plus utilities like API unification and caching, and advanced usage patterns, such as error handling, multi-config inference, context programming, etc.
- It provides a collection of working systems with different complexities. These systems span a [wide range of applications](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat#diverse-applications-implemented-with-autogen) from various domains and complexities. This demonstrates how AutoGen can easily support diverse conversation patterns.
- AutoGen provides [enhanced LLM inference](https://microsoft.github.io/autogen/docs/Use-Cases/enhanced_inference#api-unification). It offers utilities like API unification and caching, and advanced usage patterns, such as error handling, multi-config inference, context programming, etc.

AutoGen is powered by collaborative [research studies](https://microsoft.github.io/autogen/docs/Research) from Microsoft, Penn State University, and the University of Washington.

@@ -42,14 +42,14 @@ The easiest way to start playing is

[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/microsoft/autogen?quickstart=1)

2. Copy OAI_CONFIG_LIST_sample to /notebook folder, name to OAI_CONFIG_LIST, and set the correct config.
2. Copy OAI_CONFIG_LIST_sample to the ./notebook folder, rename it to OAI_CONFIG_LIST, and set the correct config (see the sketch below).
3. Start playing with the notebooks!
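
For reference, `OAI_CONFIG_LIST` holds a JSON list of model configurations. A minimal sketch (the key and model are placeholders), together with the way the notebooks typically load it:

```python
# OAI_CONFIG_LIST contents (JSON), e.g.:
# [{"model": "gpt-4", "api_key": "<your OpenAI API key here>"}]
from autogen import config_list_from_json

# loads from the OAI_CONFIG_LIST environment variable or the file of that name
config_list = config_list_from_json(env_or_file="OAI_CONFIG_LIST")
```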



## Installation

AutoGen requires **Python version >= 3.8**. It can be installed from pip:
AutoGen requires **Python version >= 3.8, < 3.12**. It can be installed from pip:

```bash
pip install pyautogen
@@ -72,7 +72,7 @@ For LLM inference configurations, check the [FAQs](https://microsoft.github.io/a

## Multi-Agent Conversation Framework

Autogen enables the next-gen LLM applications with a generic multi-agent conversation framework. It offers customizable and conversable agents that integrate LLMs, tools, and humans.
Autogen enables the next-gen LLM applications with a generic [multi-agent conversation](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat) framework. It offers customizable and conversable agents that integrate LLMs, tools, and humans.
By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code.
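
As a minimal sketch of this workflow (mirroring the repo's quickstart; the task message and working directory are illustrative):

```python
from autogen import AssistantAgent, UserProxyAgent, config_list_from_json

config_list = config_list_from_json(env_or_file="OAI_CONFIG_LIST")
assistant = AssistantAgent("assistant", llm_config={"config_list": config_list})
# the user proxy executes code suggested by the assistant in ./coding
user_proxy = UserProxyAgent("user_proxy", code_execution_config={"work_dir": "coding"})
user_proxy.initiate_chat(assistant, message="Plot a chart of NVDA and TESLA stock price change YTD.")
```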

Features of this use case include:
@@ -110,7 +110,9 @@ Please find more [code examples](https://microsoft.github.io/autogen/docs/Exampl

## Enhanced LLM Inferences

Autogen also helps maximize the utility out of the expensive LLMs such as ChatGPT and GPT-4. It offers enhanced LLM inference with powerful functionalities like tuning, caching, error handling, and templating. For example, you can optimize generations by LLM with your own tuning data, success metrics, and budgets.
Autogen also helps maximize the utility out of the expensive LLMs such as ChatGPT and GPT-4. It offers [enhanced LLM inference](https://microsoft.github.io/autogen/docs/Use-Cases/enhanced_inference#api-unification) with powerful functionalities like caching, error handling, multi-config inference and templating.
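
As a quick illustration of templating, a hedged sketch (assuming the `context` and `allow_format_str_template` parameters of `OpenAIWrapper.create()`; the prompt is illustrative):

```python
from autogen import OpenAIWrapper, config_list_from_json

config_list = config_list_from_json(env_or_file="OAI_CONFIG_LIST")
client = OpenAIWrapper(config_list=config_list)
# the {problem} placeholder is filled in from `context` at request time
response = client.create(
    context={"problem": "How many positive integers, not exceeding 100, are multiples of 2 or 3?"},
    prompt="{problem} Solve the problem carefully.",
    allow_format_str_template=True,
)
print(client.extract_text_or_function_call(response))
```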

<!-- For example, you can optimize generations by LLM with your own tuning data, success metrics, and budgets.

```python
# perform tuning for openai<1
@@ -127,7 +129,7 @@ config, analysis = autogen.Completion.tune(
response = autogen.Completion.create(context=test_instance, **config)
```

Please find more [code examples](https://microsoft.github.io/autogen/docs/Examples/AutoGen-Inference) for this feature.
Please find more [code examples](https://microsoft.github.io/autogen/docs/Examples/AutoGen-Inference) for this feature. -->

## Documentation

2 changes: 1 addition & 1 deletion autogen/oai/client.py
@@ -17,7 +17,7 @@

ERROR = None
except ImportError:
ERROR = ImportError("please install openai>=1 and diskcache to use the autogen.oai subpackage.")
ERROR = ImportError("Please install openai>=1 and diskcache to use autogen.OpenAIWrapper.")
OpenAI = object
logger = logging.getLogger(__name__)
if not logger.handlers:
20 changes: 19 additions & 1 deletion autogen/oai/completion.py
@@ -26,7 +26,10 @@

ERROR = None
except ImportError:
ERROR = ImportError("please install openai and diskcache to use the autogen.oai subpackage.")
ERROR = ImportError(
"(Deprecated) The autogen.Completion class requires openai<1 and diskcache. "
"Please switch to autogen.OpenAIWrapper for openai>=1."
)
openai_Completion = object
logger = logging.getLogger(__name__)
if not logger.handlers:
@@ -567,6 +570,10 @@ def eval_func(responses, **data):
dict: The optimized hyperparameter setting.
tune.ExperimentAnalysis: The tuning results.
"""
logger.warning(
"tuning via Completion.tune is deprecated in pyautogen v0.2 and openai>=1. "
"flaml.tune supports tuning more generically."
)
if ERROR:
raise ERROR
space = cls.default_search_space.copy()
@@ -775,6 +782,11 @@ def yes_or_no_filter(context, config, response):
- `config_id`: the index of the config in the config_list that is used to generate the response.
- `pass_filter`: whether the response passes the filter function. None if no filter is provided.
"""
logger.warning(
"Completion.create is deprecated in pyautogen v0.2 and openai>=1. "
"The new openai requires initiating a client for inference. "
"Please refer to https://microsoft.github.io/autogen/docs/Use-Cases/enhanced_inference#api-unification"
)
if ERROR:
raise ERROR

@@ -1159,6 +1171,12 @@ def start_logging(
while the compact history dict has a linear size.
reset_counter (bool): whether to reset the counter of the number of API calls.
"""
logger.warning(
"logging via Completion.start_logging is deprecated in pyautogen v0.2. "
"logging via OpenAIWrapper will be added back in a future release."
)
if ERROR:
raise ERROR
cls._history_dict = {} if history_dict is None else history_dict
cls._history_compact = compact
cls._count_create = 0 if reset_counter or cls._count_create is None else cls._count_create
4 changes: 2 additions & 2 deletions notebook/agentchat_qdrant_RetrieveChat.ipynb
@@ -95,14 +95,14 @@
" {\n",
" 'model': 'gpt-4',\n",
" 'api_key': '<your Azure OpenAI API key here>',\n",
" 'api_base': '<your Azure OpenAI API base here>',\n",
" 'base_url': '<your Azure OpenAI API base here>',\n",
" 'api_type': 'azure',\n",
" 'api_version': '2023-06-01-preview',\n",
" },\n",
" {\n",
" 'model': 'gpt-3.5-turbo',\n",
" 'api_key': '<your Azure OpenAI API key here>',\n",
" 'api_base': '<your Azure OpenAI API base here>',\n",
" 'base_url': '<your Azure OpenAI API base here>',\n",
" 'api_type': 'azure',\n",
" 'api_version': '2023-06-01-preview',\n",
" },\n",
2 changes: 1 addition & 1 deletion notebook/oai_openai_utils.ipynb
@@ -38,7 +38,7 @@
"assistant = AssistantAgent(\n",
" name=\"assistant\",\n",
" llm_config={\n",
" \"request_timeout\": 600,\n",
" \"timeout\": 600,\n",
" \"seed\": 42,\n",
" \"config_list\": config_list,\n",
" \"temperature\": 0,\n",
2 changes: 1 addition & 1 deletion website/blog/2023-10-26-TeachableAgent/index.mdx
@@ -51,7 +51,7 @@ from autogen.agentchat.contrib.teachable_agent import TeachableAgent
# and OAI_CONFIG_LIST_sample
filter_dict = {"model": ["gpt-4"]} # GPT-3.5 is less reliable than GPT-4 at learning from user feedback.
config_list = config_list_from_json(env_or_file="OAI_CONFIG_LIST", filter_dict=filter_dict)
llm_config={"config_list": config_list, "request_timeout": 120}
llm_config={"config_list": config_list, "timeout": 120}
```

4. Create the agents
8 changes: 4 additions & 4 deletions website/docs/Getting-Started.md
@@ -8,11 +8,11 @@ AutoGen is a framework that enables development of LLM applications using multiple agents

### Main Features

* AutoGen enables building next-gen LLM applications based on **multi-agent conversations** with minimal effort. It simplifies the orchestration, automation and optimization of a complex LLM workflow. It maximizes the performance of LLM models and overcome their weaknesses.
* It supports **diverse conversation patterns** for complex workflows. With customizable and conversable agents, developers can use AutoGen to build a wide range of conversation patterns concerning conversation autonomy,
- AutoGen enables building next-gen LLM applications based on [multi-agent conversations](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat) with minimal effort. It simplifies the orchestration, automation, and optimization of a complex LLM workflow. It maximizes the performance of LLM models and overcomes their weaknesses.
- It supports [diverse conversation patterns](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat#supporting-diverse-conversation-patterns) for complex workflows. With customizable and conversable agents, developers can use AutoGen to build a wide range of conversation patterns concerning conversation autonomy,
the number of agents, and agent conversation topology.
* It provides a collection of working systems with different complexities. These systems span a **wide range of applications** from various domains and complexities. They demonstrate how AutoGen can easily support different conversation patterns.
* AutoGen provides **enhanced LLM inference**. It offers easy performance tuning, plus utilities like API unification & caching, and advanced usage patterns, such as error handling, multi-config inference, context programming etc.
- It provides a collection of working systems with different complexities. These systems span a [wide range of applications](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat#diverse-applications-implemented-with-autogen) from various domains and complexities. This demonstrates how AutoGen can easily support diverse conversation patterns.
- AutoGen provides [enhanced LLM inference](https://microsoft.github.io/autogen/docs/Use-Cases/enhanced_inference#api-unification). It offers utilities like API unification and caching, and advanced usage patterns, such as error handling, multi-config inference, context programming, etc.

AutoGen is powered by collaborative [research studies](/docs/Research) from Microsoft, Penn State University, and University of Washington.

32 changes: 25 additions & 7 deletions website/docs/Installation.md
@@ -35,7 +35,7 @@ Now, you're ready to install AutoGen in the virtual environment you've just created.

## Python

AutoGen requires **Python version >= 3.8**. It can be installed from pip:
AutoGen requires **Python version >= 3.8, < 3.12**. It can be installed from pip:

```bash
pip install pyautogen
@@ -49,6 +49,24 @@ or conda:
conda install pyautogen -c conda-forge
``` -->

### Migration guide to v0.2

openai v1 is a total rewrite of the library with many breaking changes. For example, inference now requires instantiating a client instead of calling a global class method.
Therefore, some changes are required for users of `pyautogen<0.2`.

- `api_base` -> `base_url`, `request_timeout` -> `timeout` in `llm_config` and `config_list` (see the sketch after this list). `max_retry_period` and `retry_wait_time` are deprecated; `max_retries` can be set for each client.
- MathChat, RetrieveChat, and TeachableAgent are unsupported until they are tested in a future release.
- `autogen.Completion` and `autogen.ChatCompletion` are deprecated. The essential functionalities are moved to `autogen.OpenAIWrapper`:
```python
from autogen import OpenAIWrapper
client = OpenAIWrapper(config_list=config_list)
response = client.create(messages=[{"role": "user", "content": "2+2="}])
print(client.extract_text_or_function_call(response))
```
- Inference parameter tuning and inference logging features are currently unavailable in `OpenAIWrapper`. Logging will be added in a future release.
Inference parameter tuning can be done via [`flaml.tune`](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function).
- `use_cache` is removed as a kwarg in `OpenAIWrapper.create()`; caching is now decided automatically by `seed: int | None`.
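
For illustration, a before/after sketch of the renamed keys (values are placeholders):

```python
# pyautogen<0.2
llm_config = {
    "config_list": [{"model": "gpt-4", "api_key": "...", "api_base": "..."}],
    "request_timeout": 600,
}

# pyautogen v0.2: api_base -> base_url, request_timeout -> timeout
llm_config = {
    "config_list": [{"model": "gpt-4", "api_key": "...", "base_url": "..."}],
    "timeout": 600,
}
```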

### Optional Dependencies
* docker

@@ -61,9 +79,9 @@ pip install docker

* blendsearch

AutoGen offers a cost-effective hyperparameter optimization technique [EcoOptiGen](https://arxiv.org/abs/2303.04673) for tuning Large Language Models. Please install with the [blendsearch] option to use it.
`pyautogen<0.2` offers a cost-effective hyperparameter optimization technique [EcoOptiGen](https://arxiv.org/abs/2303.04673) for tuning Large Language Models. Please install with the [blendsearch] option to use it.
```bash
pip install "pyautogen[blendsearch]"
pip install "pyautogen[blendsearch]<0.2"
```

Example notebooks:
@@ -72,9 +90,9 @@ Example notebooks:

* retrievechat

AutoGen supports retrieval-augmented generation tasks such as question answering and code generation with RAG agents. Please install with the [retrievechat] option to use it.
`pyautogen<0.2` supports retrieval-augmented generation tasks such as question answering and code generation with RAG agents. Please install with the [retrievechat] option to use it.
```bash
pip install "pyautogen[retrievechat]"
pip install "pyautogen[retrievechat]<0.2"
```

Example notebooks:
@@ -83,9 +101,9 @@ Example notebooks:

* mathchat

AutoGen offers an experimental agent for math problem solving. Please install with the [mathchat] option to use it.
`pyautogen<0.2` offers an experimental agent for math problem solving. Please install with the [mathchat] option to use it.
```bash
pip install "pyautogen[mathchat]"
pip install "pyautogen[mathchat]<0.2"
```

Example notebooks:
37 changes: 34 additions & 3 deletions website/docs/Use-Cases/enhanced_inference.md
@@ -114,6 +114,23 @@ When chat models are used and `prompt` is given as the input to `autogen.Completion`,

`autogen.OpenAIWrapper.create()` can be used to create completions for both chat and non-chat models, and both OpenAI API and Azure OpenAI API.

```python
from autogen import OpenAIWrapper
# OpenAI endpoint
client = OpenAIWrapper()
# ChatCompletion
response = client.create(messages=[{"role": "user", "content": "2+2="}], model="gpt-3.5-turbo")
# extract the response text
print(client.extract_text_or_function_call(response))
# Azure OpenAI endpoint
client = OpenAIWrapper(api_key=..., base_url=..., api_version=..., api_type="azure")
# Completion
response = client.create(prompt="2+2=", model="gpt-3.5-turbo-instruct")
# extract the response text
print(client.extract_text_or_function_call(response))
```

For local LLMs, one can spin up an endpoint using a package like [FastChat](https://github.com/lm-sys/FastChat), and then use the same API to send a request. See [here](/blog/2023/07/14/Local-LLMs) for examples on how to make inference with local LLMs.
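
As a sketch of that setup (the model name, port, and dummy key are assumptions; FastChat serves an OpenAI-compatible endpoint):

```python
from autogen import OpenAIWrapper

# point the client at a locally hosted OpenAI-compatible server
client = OpenAIWrapper(
    config_list=[{"model": "chatglm2-6b", "base_url": "http://localhost:8000/v1", "api_key": "NULL"}]
)
response = client.create(messages=[{"role": "user", "content": "2+2="}])
print(client.extract_text_or_function_call(response))
```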

<!-- When only working with the chat-based models, `autogen.ChatCompletion` can be used. It also does automatic conversion from prompt to messages, if prompt is provided instead of messages. -->
@@ -122,6 +139,18 @@ For local LLMs, one can spin up an endpoint using a package like [FastChat](http

API call results are cached locally and reused when the same request is issued. This is useful when repeating or continuing experiments for reproducibility and cost saving. It still allows controlled randomness by setting the "seed" specified in `OpenAIWrapper.create()` or the constructor of `OpenAIWrapper`.

```python
client = OpenAIWrapper(seed=...)
client.create(...)
```

```python
client = OpenAIWrapper()
client.create(seed=..., ...)
```

Caching is enabled by default with seed 41. To disable it, set `seed` to `None`.

## Error handling

### Runtime error
@@ -133,7 +162,7 @@ API call results are cached locally and reused when the same request is issued.
- `retry_wait_time` (int): the time interval to wait (in seconds) before retrying a failed request.

Moreover, -->
One can pass a list of configurations of different models/endpoints to mitigate the rate limits. For example,
One can pass a list of configurations of different models/endpoints to mitigate rate limits and other runtime errors. For example,

```python
client = OpenAIWrapper(
@@ -158,7 +187,7 @@ client = OpenAIWrapper(
)
```

It will try querying Azure OpenAI gpt-4, OpenAI gpt-3.5-turbo, and a locally hosted llama2-chat-7B one by one,
`client.create()` will try querying Azure OpenAI gpt-4, OpenAI gpt-3.5-turbo, and a locally hosted llama2-chat-7B one by one,
until a valid result is returned. This can speed up the development process where the rate limit is a bottleneck. An error will be raised if the last choice fails, so make sure the last choice in the list has the best availability.

For convenience, we provide a number of utility functions to load config lists.
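
For example, `config_list_from_json` reads a list from an environment variable or a JSON file, optionally filtered (a sketch; the filter values are illustrative):

```python
import autogen

config_list = autogen.config_list_from_json(
    env_or_file="OAI_CONFIG_LIST",
    filter_dict={"model": ["gpt-4", "gpt-3.5-turbo"]},
)
client = autogen.OpenAIWrapper(config_list=config_list)
```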
@@ -184,8 +213,10 @@ def valid_json_filter(response, **_):
pass
return False

response = client.create(
client = OpenAIWrapper(
config_list=[{"model": "text-ada-001"}, {"model": "gpt-3.5-turbo-instruct"}, {"model": "text-davinci-003"}],
)
response = client.create(
prompt="How to construct a json request to Bing API to search for 'latest AI news'? Return the JSON request.",
filter_func=valid_json_filter,
)