Multiline docstrings fix #2130

Merged
merged 33 commits into from
Apr 3, 2024
Commits
b441533
DOC FIX - Formatted Docstrings for the retrieve_user_proxy_agent.py a…
sharsha315 Mar 23, 2024
9a03544
DOC FIX - Formatted Docstrings for theinitiate_chats functiion of Ch…
sharsha315 Mar 23, 2024
a110a1f
Merge branch 'main' into multiline-docstrings-fix
thinkall Mar 24, 2024
ec68207
Add vision capability (#2025)
BeibinLi Mar 24, 2024
28c37f3
Native tool call support for Mistral AI API and topic notebook. (#2135)
ekzhu Mar 25, 2024
9ca9b11
New conversational chess notebook using nested chats and tool use (#2…
ekzhu Mar 25, 2024
cfed8ef
add webarena in samples (#2114)
olgavrou Mar 25, 2024
d9e1e6d
context to kwargs (#2064)
qingyun-wu Mar 26, 2024
91924f5
Bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /website (#2131)
dependabot[bot] Mar 26, 2024
4964cfe
Parse Any HTML-esh Style Tags (#2046)
WaelKarkoub Mar 26, 2024
cd7d91d
Integrate AgentOptimizer (#1767)
skzhang1 Mar 26, 2024
bdaa0a8
Introducing IOStream protocol and adding support for websockets (#1551)
davorrunje Mar 26, 2024
9d03b98
[CAP] [Feature] Get list of actors from directory service. (#2073)
rajan-chari Mar 27, 2024
033fc28
Mark cache as a protocol and update type hints to reflect (#2168)
jackgerrits Mar 27, 2024
fd10d9e
fix(): fix word spelling errors (#2171)
shouldnotappearcalm Mar 27, 2024
11f69a5
Implement User Defined Functions for Local CLI Executor (#2102)
jackgerrits Mar 27, 2024
812c67a
simplify getting-started; update news (#2175)
sonichi Mar 28, 2024
160d474
update (#2178)
skzhang1 Mar 28, 2024
751fc92
Fix formatting of admonitions in udf docs (#2188)
jackgerrits Mar 28, 2024
491903c
Fix iostream on new thread (#2181)
davorrunje Mar 28, 2024
7f1c547
Add link for rendering notebooks docs on website (#2191)
jackgerrits Mar 28, 2024
e363080
Transform Messages Capability (#1923)
WaelKarkoub Mar 28, 2024
1a09005
Bump express from 4.18.2 to 4.19.2 in /website (#2157)
dependabot[bot] Mar 28, 2024
9b15aa6
add clarity analytics (#2201)
ekzhu Mar 28, 2024
6626f37
Docstring formatting fix: Standardize docstrings to adhere to Google …
sharsha315 Mar 28, 2024
1bf1703
Docstring fix: Reformattted docstrings to adhere to Google style guid…
sharsha315 Mar 29, 2024
5c0095d
Merge branch 'main' into multiline-docstrings-fix
sharsha315 Mar 29, 2024
83a12b8
Fixed Pre-Commit Error, Trailing spaces on agentchat/chat.py
sharsha315 Mar 29, 2024
3a5e1cc
Merge remote-tracking branch 'origin/multiline-docstrings-fix' into m…
sharsha315 Mar 29, 2024
1776a57
Fixed Pre-Commit Error, Trailing spaces on agentchat/chat.py
sharsha315 Mar 29, 2024
ae042fb
Merge branch 'main' into multiline-docstrings-fix
thinkall Apr 2, 2024
e132fcb
Merge branch 'main' into multiline-docstrings-fix
ekzhu Apr 2, 2024
227b063
Merge branch 'main' into multiline-docstrings-fix
ekzhu Apr 3, 2024
51 changes: 30 additions & 21 deletions autogen/agentchat/chat.py
@@ -26,7 +26,9 @@ class ChatResult:
summary: str = None
"""A summary obtained from the chat."""
cost: tuple = None # (dict, dict) - (total_cost, actual_cost_with_cache)
"""The cost of the chat. a tuple of (total_cost, total_actual_cost), where total_cost is a dictionary of cost information, and total_actual_cost is a dictionary of information on the actual incurred cost with cache."""
"""The cost of the chat. a tuple of (total_cost, total_actual_cost), where total_cost is a
dictionary of cost information, and total_actual_cost is a dictionary of information on
the actual incurred cost with cache."""
ekzhu marked this conversation as resolved.
human_input: List[str] = None
"""A list of human input solicited during the chat."""

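The `cost` field documented above is a two-element tuple, so callers typically unpack it before reading either dictionary. A minimal sketch, assuming a stand-in dataclass that mirrors only the fields visible in this hunk; `ChatResultSketch`, `unpack_cost`, and the `"total_cost"` key inside each dictionary are hypothetical and not taken from the PR:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ChatResultSketch:
    """Stand-in mirroring only the fields shown in this hunk (not the real class)."""

    summary: Optional[str] = None
    cost: Optional[Tuple[dict, dict]] = None  # (total_cost, actual_cost_with_cache)
    human_input: Optional[List[str]] = None


def unpack_cost(result: ChatResultSketch) -> Tuple[dict, dict]:
    """Return (total_cost, actual_cost_with_cache), or two empty dicts if unset."""
    if result.cost is None:
        return {}, {}
    total_cost, actual_cost_with_cache = result.cost
    return total_cost, actual_cost_with_cache


# The inner "total_cost" key is an assumption made for illustration only.
result = ChatResultSketch(
    summary="done",
    cost=({"total_cost": 0.02}, {"total_cost": 0.0}),
    human_input=[],
)
total, actual = unpack_cost(result)
```

Guarding against `None` matters because every field shown in the hunk defaults to `None`.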
@@ -141,25 +143,32 @@ def __post_carryover_processing(chat_info: Dict[str, Any]) -> None:

def initiate_chats(chat_queue: List[Dict[str, Any]]) -> List[ChatResult]:
"""Initiate a list of chats.

Args:
chat_queue (List[Dict]): a list of dictionaries containing the information about the chats.

Each dictionary should contain the input arguments for [`ConversableAgent.initiate_chat`](/docs/reference/agentchat/conversable_agent#initiate_chat). For example:
- "sender": the sender agent.
- "recipient": the recipient agent.
- "clear_history" (bool): whether to clear the chat history with the agent. Default is True.
- "silent" (bool or None): (Experimental) whether to print the messages in this conversation. Default is False.
- "cache" (AbstractCache or None): the cache client to use for this conversation. Default is None.
- "max_turns" (int or None): maximum number of turns for the chat. If None, the chat will continue until a termination condition is met. Default is None.
- "summary_method" (str or callable): a string or callable specifying the method to get a summary from the chat. Default is DEFAULT_summary_method, i.e., "last_msg".
- "summary_args" (dict): a dictionary of arguments to be passed to the summary_method. Default is {}.
- "message" (str, callable or None): if None, input() will be called to get the initial message.
- **context: additional context information to be passed to the chat.
- "carryover": It can be used to specify the carryover information to be passed to this chat.
If provided, we will combine this carryover with the "message" content when generating the initial chat
message in `generate_init_message`.

chat_queue (List[Dict]): A list of dictionaries containing the information about the chats.

Each dictionary should contain the input arguments for
[`ConversableAgent.initiate_chat`](/docs/reference/agentchat/conversable_agent#initiate_chat).
For example:
- `"sender"` - the sender agent.
- `"recipient"` - the recipient agent.
- `"clear_history"` (bool) - whether to clear the chat history with the agent.
Default is True.
- `"silent"` (bool or None) - (Experimental) whether to print the messages in this
conversation. Default is False.
- `"cache"` (Cache or None) - the cache client to use for this conversation.
Default is None.
- `"max_turns"` (int or None) - maximum number of turns for the chat. If None, the chat
will continue until a termination condition is met. Default is None.
- `"summary_method"` (str or callable) - a string or callable specifying the method to get
a summary from the chat. Default is DEFAULT_summary_method, i.e., "last_msg".
- `"summary_args"` (dict) - a dictionary of arguments to be passed to the summary_method.
Default is {}.
- `"message"` (str, callable or None) - if None, input() will be called to get the
initial message.
- `**context` - additional context information to be passed to the chat.
- `"carryover"` - It can be used to specify the carryover information to be passed
to this chat. If provided, we will combine this carryover with the "message" content when
generating the initial chat message in `generate_init_message`.
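
The keys listed above can be pictured as plain dictionaries in a list. A minimal sketch, assuming string placeholders in place of real agent objects; `DOCUMENTED_DEFAULTS` and `with_documented_defaults` are hypothetical helpers that only restate the defaults from this docstring and do not reflect how the library fills them in:

```python
from typing import Any, Dict, List

# Defaults copied from the docstring above; the helper itself is hypothetical.
DOCUMENTED_DEFAULTS: Dict[str, Any] = {
    "clear_history": True,
    "silent": False,
    "cache": None,
    "max_turns": None,
    "summary_method": "last_msg",
    "summary_args": {},
}


def with_documented_defaults(chat_info: Dict[str, Any]) -> Dict[str, Any]:
    """Overlay one chat entry on top of the documented defaults."""
    merged = {**DOCUMENTED_DEFAULTS, **chat_info}
    # Copy the dict so entries never share one mutable summary_args object.
    merged["summary_args"] = dict(merged["summary_args"])
    return merged


# Strings stand in for agent instances here; real entries would hold agents.
chat_queue: List[Dict[str, Any]] = [
    {
        "sender": "user_proxy",
        "recipient": "researcher",
        "message": "Collect the facts.",
        "max_turns": 2,
    },
    {
        "sender": "user_proxy",
        "recipient": "writer",
        "message": "Draft the answer.",
        "carryover": "Keep the answer under 100 words.",
    },
]

filled = [with_documented_defaults(chat_info) for chat_info in chat_queue]
```
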
Returns:
(list): a list of ChatResult objects corresponding to the finished chats in the chat_queue.
"""
@@ -228,11 +237,11 @@ async def a_initiate_chats(chat_queue: List[Dict[str, Any]]) -> Dict[int, ChatRe
"""(async) Initiate a list of chats.

args:
Please refer to `initiate_chats`.
- Please refer to `initiate_chats`.


returns:
(Dict): a dict of ChatId: ChatResult corresponding to the finished chats in the chat_queue.
- (Dict): a dict of ChatId: ChatResult corresponding to the finished chats in the chat_queue.
"""
consolidate_chat_info(chat_queue)
_validate_recipients(chat_queue)
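
The async variant is documented as returning a dict keyed by chat id rather than a list. The shape of that return value can be sketched with plain `asyncio`, assuming each queue entry carries a `"chat_id"` key; that key, `fake_chat`, and `run_all` are hypothetical names for illustration and are not the PR's implementation:

```python
import asyncio
from typing import Any, Dict, List


async def fake_chat(chat_id: int, message: str) -> str:
    """Hypothetical stand-in for one chat; yields control once, then returns."""
    await asyncio.sleep(0)
    return f"summary of {message!r}"


async def run_all(queue: List[Dict[str, Any]]) -> Dict[int, str]:
    """Start every chat as a task, then collect results keyed by chat id."""
    tasks = {
        info["chat_id"]: asyncio.create_task(fake_chat(info["chat_id"], info["message"]))
        for info in queue
    }
    return {chat_id: await task for chat_id, task in tasks.items()}


results = asyncio.run(
    run_all(
        [
            {"chat_id": 1, "message": "a"},
            {"chat_id": 2, "message": "b"},
        ]
    )
)
```

Keying by id lets a caller look up one finished chat without depending on completion order.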
135 changes: 89 additions & 46 deletions autogen/agentchat/contrib/retrieve_user_proxy_agent.py
@@ -62,6 +62,10 @@


class RetrieveUserProxyAgent(UserProxyAgent):
"""(In preview) The Retrieval-Augmented User Proxy retrieves document chunks based on the embedding
similarity, and sends them along with the question to the Retrieval-Augmented Assistant
"""

def __init__(
self,
name="RetrieveChatAgent", # default set to RetrieveChatAgent
@@ -73,67 +77,106 @@ def __init__(
r"""
Args:
name (str): name of the agent.

human_input_mode (str): whether to ask for human inputs every time a message is received.
Possible values are "ALWAYS", "TERMINATE", "NEVER".
1. When "ALWAYS", the agent prompts for human input every time a message is received.
Under this mode, the conversation stops when the human input is "exit",
or when is_termination_msg is True and there is no human input.
2. When "TERMINATE", the agent only prompts for human input only when a termination message is received or
the number of auto reply reaches the max_consecutive_auto_reply.
3. When "NEVER", the agent will never prompt for human input. Under this mode, the conversation stops
when the number of auto reply reaches the max_consecutive_auto_reply or when is_termination_msg is True.
2. When "TERMINATE", the agent prompts for human input only when a termination
message is received or the number of auto replies reaches
max_consecutive_auto_reply.
3. When "NEVER", the agent will never prompt for human input. Under this mode, the
conversation stops when the number of auto replies reaches
max_consecutive_auto_reply or when is_termination_msg is True.

is_termination_msg (function): a function that takes a message in the form of a dictionary
and returns a boolean value indicating if this received message is a termination message.
The dict can contain the following keys: "content", "role", "name", "function_call".

retrieve_config (dict or None): config for the retrieve agent.
To use default config, set to None. Otherwise, set to a dictionary with the following keys:
- task (Optional, str): the task of the retrieve chat. Possible values are "code", "qa" and "default". System
prompt will be different for different tasks. The default value is `default`, which supports both code and qa.
- client (Optional, chromadb.Client): the chromadb client. If key not provided, a default client `chromadb.Client()`
will be used. If you want to use other vector db, extend this class and override the `retrieve_docs` function.
- docs_path (Optional, Union[str, List[str]]): the path to the docs directory. It can also be the path to a single file,
the url to a single file or a list of directories, files and urls. Default is None, which works only if the collection is already created.
- extra_docs (Optional, bool): when true, allows adding documents with unique IDs without overwriting existing ones; when false, it replaces existing documents using default IDs, risking collection overwrite.,
when set to true it enables the system to assign unique IDs starting from "length+i" for new document chunks, preventing the replacement of existing documents and facilitating the addition of more content to the collection..
By default, "extra_docs" is set to false, starting document IDs from zero. This poses a risk as new documents might overwrite existing ones, potentially causing unintended loss or alteration of data in the collection.
- collection_name (Optional, str): the name of the collection.

To use default config, set to None. Otherwise, set to a dictionary with the
following keys:
- `task` (Optional, str) - the task of the retrieve chat. Possible values are
"code", "qa" and "default". System prompt will be different for different tasks.
The default value is `default`, which supports both code and qa.
- `client` (Optional, chromadb.Client) - the chromadb client. If key not provided, a
default client `chromadb.Client()` will be used. If you want to use other
vector db, extend this class and override the `retrieve_docs` function.
- `docs_path` (Optional, Union[str, List[str]]) - the path to the docs directory. It
can also be the path to a single file, the url to a single file or a list
of directories, files and urls. Default is None, which works only if the
collection is already created.
- `extra_docs` (Optional, bool) - when True, new document chunks are assigned unique
IDs starting from "length+i", so they are added to the collection without
overwriting existing documents. When False, document IDs start from zero, so
new documents may overwrite existing ones, potentially causing unintended
loss or alteration of data in the collection. Default is False.
- `collection_name` (Optional, str) - the name of the collection.
If key not provided, a default name `autogen-docs` will be used.
- model (Optional, str): the model to use for the retrieve chat.
- `model` (Optional, str) - the model to use for the retrieve chat.
If key not provided, a default model `gpt-4` will be used.
- chunk_token_size (Optional, int): the chunk token size for the retrieve chat.
- `chunk_token_size` (Optional, int) - the chunk token size for the retrieve chat.
If key not provided, a default size `max_tokens * 0.4` will be used.
- context_max_tokens (Optional, int): the context max token size for the retrieve chat.
- `context_max_tokens` (Optional, int) - the context max token size for the
retrieve chat.
If key not provided, a default size `max_tokens * 0.8` will be used.
- chunk_mode (Optional, str): the chunk mode for the retrieve chat. Possible values are
"multi_lines" and "one_line". If key not provided, a default mode `multi_lines` will be used.
- must_break_at_empty_line (Optional, bool): chunk will only break at empty line if True. Default is True.
- `chunk_mode` (Optional, str) - the chunk mode for the retrieve chat. Possible values
are "multi_lines" and "one_line". If key not provided, a default mode
`multi_lines` will be used.
- `must_break_at_empty_line` (Optional, bool) - chunk will only break at empty line
if True. Default is True.
If chunk_mode is "one_line", this parameter will be ignored.
- embedding_model (Optional, str): the embedding model to use for the retrieve chat.
If key not provided, a default model `all-MiniLM-L6-v2` will be used. All available models
can be found at `https://www.sbert.net/docs/pretrained_models.html`. The default model is a
fast model. If you want to use a high performance model, `all-mpnet-base-v2` is recommended.
- embedding_function (Optional, Callable): the embedding function for creating the vector db. Default is None,
SentenceTransformer with the given `embedding_model` will be used. If you want to use OpenAI, Cohere, HuggingFace or
other embedding functions, you can pass it here, follow the examples in `https://docs.trychroma.com/embeddings`.
- customized_prompt (Optional, str): the customized prompt for the retrieve chat. Default is None.
- customized_answer_prefix (Optional, str): the customized answer prefix for the retrieve chat. Default is "".
If not "" and the customized_answer_prefix is not in the answer, `Update Context` will be triggered.
- update_context (Optional, bool): if False, will not apply `Update Context` for interactive retrieval. Default is True.
- get_or_create (Optional, bool): if True, will create/return a collection for the retrieve chat. This is the same as that used in chromadb.
Default is False. Will raise ValueError if the collection already exists and get_or_create is False. Will be set to True if docs_path is None.
- custom_token_count_function (Optional, Callable): a custom function to count the number of tokens in a string.
The function should take (text:str, model:str) as input and return the token_count(int). the retrieve_config["model"] will be passed in the function.
Default is autogen.token_count_utils.count_token that uses tiktoken, which may not be accurate for non-OpenAI models.
- custom_text_split_function (Optional, Callable): a custom function to split a string into a list of strings.
Default is None, will use the default function in `autogen.retrieve_utils.split_text_to_chunks`.
- custom_text_types (Optional, List[str]): a list of file types to be processed. Default is `autogen.retrieve_utils.TEXT_FORMATS`.
This only applies to files under the directories in `docs_path`. Explicitly included files and urls will be chunked regardless of their types.
- recursive (Optional, bool): whether to search documents recursively in the docs_path. Default is True.
- `embedding_model` (Optional, str) - the embedding model to use for the retrieve chat.
If key not provided, a default model `all-MiniLM-L6-v2` will be used. All available
models can be found at `https://www.sbert.net/docs/pretrained_models.html`.
The default model is a fast model. If you want to use a high performance model,
`all-mpnet-base-v2` is recommended.
- `embedding_function` (Optional, Callable) - the embedding function for creating the
vector db. Default is None, in which case SentenceTransformer with the given
`embedding_model` will be used. If you want to use OpenAI, Cohere, HuggingFace or
other embedding functions, you can pass one here; follow the examples in
`https://docs.trychroma.com/embeddings`.
- `customized_prompt` (Optional, str) - the customized prompt for the retrieve chat.
Default is None.
- `customized_answer_prefix` (Optional, str) - the customized answer prefix for the
retrieve chat. Default is "".
If not "" and the customized_answer_prefix is not in the answer,
`Update Context` will be triggered.
- `update_context` (Optional, bool) - if False, will not apply `Update Context` for
interactive retrieval. Default is True.
- `get_or_create` (Optional, bool) - if True, will create/return a collection for the
retrieve chat. This is the same as that used in chromadb.
Default is False. Will raise ValueError if the collection already exists and
get_or_create is False. Will be set to True if docs_path is None.
- `custom_token_count_function` (Optional, Callable) - a custom function to count the
number of tokens in a string.
The function should take (text:str, model:str) as input and return the
token_count(int). The retrieve_config["model"] will be passed to the function.
Default is autogen.token_count_utils.count_token that uses tiktoken, which may
not be accurate for non-OpenAI models.
- `custom_text_split_function` (Optional, Callable) - a custom function to split a
string into a list of strings.
Default is None, will use the default function in
`autogen.retrieve_utils.split_text_to_chunks`.
- `custom_text_types` (Optional, List[str]) - a list of file types to be processed.
Default is `autogen.retrieve_utils.TEXT_FORMATS`.
This only applies to files under the directories in `docs_path`. Explicitly
included files and urls will be chunked regardless of their types.
- `recursive` (Optional, bool) - whether to search documents recursively in the
docs_path. Default is True.

`**kwargs` (dict): other kwargs in [UserProxyAgent](../user_proxy_agent#__init__).
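
The arguments above can be illustrated with a plain dictionary and two small callables. A minimal sketch using only key names from this docstring; the `./docs` path, the whitespace-based token counter, and the "TERMINATE" keyword convention are assumptions for illustration, not values prescribed by the PR:

```python
from typing import Any, Dict


def count_tokens_by_whitespace(text: str, model: str) -> int:
    """Crude stand-in token counter with the documented (text, model) signature."""
    # The model argument is accepted to match the signature but is unused here.
    return len(text.split())


def is_termination_msg(message: Dict[str, Any]) -> bool:
    """Treat a message whose content ends with "TERMINATE" as terminal (assumed convention)."""
    content = message.get("content") or ""
    return content.rstrip().endswith("TERMINATE")


retrieve_config: Dict[str, Any] = {
    "task": "qa",
    "docs_path": ["./docs"],  # hypothetical path
    "collection_name": "autogen-docs",
    "chunk_mode": "multi_lines",
    "must_break_at_empty_line": True,
    "embedding_model": "all-MiniLM-L6-v2",
    "extra_docs": False,
    "get_or_create": True,
    "custom_token_count_function": count_tokens_by_whitespace,
}
```

Any key left out falls back to the default described in the list above.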

Example:

Example of overriding retrieve_docs - If you have set up a customized vector db, and it's not compatible with chromadb, you can easily plug in it with below code.
Example of overriding retrieve_docs - If you have set up a customized vector db that is
not compatible with chromadb, you can plug it in with the code below.
```python
class MyRetrieveUserProxyAgent(RetrieveUserProxyAgent):
def query_vector_db(
@@ -416,9 +459,9 @@ def message_generator(sender, recipient, context):
sender (Agent): the sender agent. It should be the instance of RetrieveUserProxyAgent.
recipient (Agent): the recipient agent. Usually it's the assistant agent.
context (dict): the context for the message generation. It should contain the following keys:
- problem (str): the problem to be solved.
- n_results (int): the number of results to be retrieved. Default is 20.
- search_string (str): only docs that contain an exact match of this string will be retrieved. Default is "".
- `problem` (str) - the problem to be solved.
- `n_results` (int) - the number of results to be retrieved. Default is 20.
- `search_string` (str) - only docs that contain an exact match of this string will be retrieved. Default is "".
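
The three context keys above can be read with their documented defaults as follows. A minimal sketch; `read_context` and `filter_docs` are hypothetical helpers that illustrate the documented defaults and the exact-match semantics, not the agent's actual retrieval code:

```python
from typing import Any, Dict, List, Tuple


def read_context(context: Dict[str, Any]) -> Tuple[str, int, str]:
    """Extract problem, n_results, and search_string using the documented defaults."""
    problem = context["problem"]  # the only key without a documented default
    n_results = context.get("n_results", 20)
    search_string = context.get("search_string", "")
    return problem, n_results, search_string


def filter_docs(docs: List[str], search_string: str) -> List[str]:
    """Keep docs containing an exact match; an empty string keeps every doc."""
    return [doc for doc in docs if search_string in doc]


problem, n_results, search_string = read_context({"problem": "What is AutoGen?"})
kept = filter_docs(["abc", "def"], "bc")
```
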
Returns:
str: the generated message ready to be sent to the recipient agent.
"""