From e3a7e8a37e87cbe3599eb6fd0efa20942c9b22c1 Mon Sep 17 00:00:00 2001 From: Shaokun Zhang Date: Mon, 6 May 2024 08:58:49 +0800 Subject: [PATCH 1/6] Update AgentOptimizer BibTeX (#2578) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * update paper * update --------- Co-authored-by: “skzhang1” <“shaokunzhang529@gmail.com”> Co-authored-by: Jieyu Zhang --- README.md | 11 +++++++++++ website/docs/Research.md | 6 ++---- 2 files changed, 13 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index dffa451db98..fabbff99b63 100644 --- a/README.md +++ b/README.md @@ -266,6 +266,17 @@ In addition, you can find: } ``` +[AgentOptimizer](https://arxiv.org/pdf/2402.11359) + +``` +@article{zhang2024training, + title={Training Language Model Agents without Modifying Language Models}, + author={Zhang, Shaokun and Zhang, Jieyu and Liu, Jiale and Song, Linxin and Wang, Chi and Krishna, Ranjay and Wu, Qingyun}, + journal={ICML'24}, + year={2024} +} +``` +

↑ Back to Top ↑ diff --git a/website/docs/Research.md b/website/docs/Research.md index 25c885f1d06..c8ba1d9c865 100644 --- a/website/docs/Research.md +++ b/website/docs/Research.md @@ -61,16 +61,14 @@ For technical details, please check our technical report and research publicatio } ``` -* [Training Language Model Agents without Modifying Language Models](https://arxiv.org/abs/2402.11359). Shaokun Zhang, Jieyu Zhang, Jiale Liu, Linxin Song, Chi Wang, Ranjay Krishna, Qingyun Wu. ArXiv preprint arXiv:2402.09015 (2024). +* [Training Language Model Agents without Modifying Language Models](https://arxiv.org/abs/2402.11359). Shaokun Zhang, Jieyu Zhang, Jiale Liu, Linxin Song, Chi Wang, Ranjay Krishna, Qingyun Wu. ICML'24. ```bibtex @misc{zhang2024agentoptimizer, title={Training Language Model Agents without Modifying Language Models}, author={Shaokun Zhang and Jieyu Zhang and Jiale Liu and Linxin Song and Chi Wang and Ranjay Krishna and Qingyun Wu}, year={2024}, - eprint={2402.11359}, - archivePrefix={arXiv}, - primaryClass={cs.AI} + booktitle={ICML'24}, } ``` From 5a3a8a5541d1270aa5fe4d1b85f415e110ded675 Mon Sep 17 00:00:00 2001 From: "Erez A. Korn" Date: Mon, 6 May 2024 06:19:33 +0300 Subject: [PATCH 2/6] Correct link to Jupyter Code Executor in code-executors.ipynb (#2589) * Correct link to Jupyter Code Executor in code-executors.ipynb The link to the code executor was referencing the wrong folder. * Update website/docs/tutorial/code-executors.ipynb Co-authored-by: Eric Zhu * Update website/docs/tutorial/code-executors.ipynb Co-authored-by: Eric Zhu --------- Co-authored-by: Eric Zhu --- website/docs/tutorial/code-executors.ipynb | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/website/docs/tutorial/code-executors.ipynb b/website/docs/tutorial/code-executors.ipynb index 72feabf2a94..f1e0c26510e 100644 --- a/website/docs/tutorial/code-executors.ipynb +++ b/website/docs/tutorial/code-executors.ipynb @@ -41,7 +41,7 @@ "\n", "In this chapter, we will focus on the command line code executors.\n", "For the Jupyter code executor, please refer to the topic page for \n", - "[Jupyter Code Executor](../topics/code-execution/jupyter-code-executor)." + "[Jupyter Code Executor](/docs/topics/code-execution/jupyter-code-executor)." 
] }, { @@ -674,7 +674,7 @@ "Contrast to the command line code executor, the Jupyter code executor\n", "runs all code blocks in the same Jupyter kernel, which keeps the state\n", "in memory between executions.\n", - "See the topic page for [Jupyter Code Executor](../topics/code-execution/jupyter-code-executor).\n", + "See the topic page for [Jupyter Code Executor](/docs/topics/code-execution/jupyter-code-executor).\n", "\n", "The choice between command line and Jupyter code executor depends on the\n", "nature of the code blocks in agents' conversation.\n", From 372ac1e794eda840efb26b26fa0c9ecaed5562a1 Mon Sep 17 00:00:00 2001 From: Wael Karkoub Date: Mon, 6 May 2024 15:16:49 +0100 Subject: [PATCH 3/6] Text Compression Transform (#2225) * adds implementation * handles optional import * cleanup * updates github workflows * skip test if dependencies not installed * skip test if dependencies not installed * use cpu * skip openai * unskip openai * adds protocol * better docstr * minor fixes * updates optional dependencies docs * wip * update docstrings * wip * adds back llmlingua requirement * finalized protocol * improve docstr * guide complete * improve docstr * fix FAQ * added cache support * improve cache key * cache key fix + faq fix * improve docs * improve guide * args -> params * spelling --- .github/workflows/contrib-tests.yml | 2 +- .../contrib/capabilities/text_compressors.py | 68 +++++++ .../contrib/capabilities/transforms.py | 178 ++++++++++++++++-- setup.py | 1 + .../contrib/capabilities/test_transforms.py | 86 ++++++++- website/docs/FAQ.mdx | 2 +- .../installation/Optional-Dependencies.md | 13 +- .../handling_long_contexts/_category_.json | 4 + .../compressing_text_w_llmligua.md | 171 +++++++++++++++++ .../intro_to_transform_messages.md} | 11 +- 10 files changed, 503 insertions(+), 33 deletions(-) create mode 100644 autogen/agentchat/contrib/capabilities/text_compressors.py create mode 100644 website/docs/topics/handling_long_contexts/_category_.json create mode 100644 website/docs/topics/handling_long_contexts/compressing_text_w_llmligua.md rename website/docs/topics/{long_contexts.md => handling_long_contexts/intro_to_transform_messages.md} (98%) diff --git a/.github/workflows/contrib-tests.yml b/.github/workflows/contrib-tests.yml index d36a9d52e69..f8dd1d46186 100644 --- a/.github/workflows/contrib-tests.yml +++ b/.github/workflows/contrib-tests.yml @@ -400,7 +400,7 @@ jobs: pip install pytest-cov>=5 - name: Install packages and dependencies for Transform Messages run: | - pip install -e . + pip install -e '.[long-context]' - name: Set AUTOGEN_USE_DOCKER based on OS shell: bash run: | diff --git a/autogen/agentchat/contrib/capabilities/text_compressors.py b/autogen/agentchat/contrib/capabilities/text_compressors.py new file mode 100644 index 00000000000..78554bdc935 --- /dev/null +++ b/autogen/agentchat/contrib/capabilities/text_compressors.py @@ -0,0 +1,68 @@ +from typing import Any, Dict, Optional, Protocol + +IMPORT_ERROR: Optional[Exception] = None +try: + import llmlingua +except ImportError: + IMPORT_ERROR = ImportError( + "LLMLingua is not installed. 
Please install it with `pip install pyautogen[long-context]`"
+    )
+    PromptCompressor = object
+else:
+    from llmlingua import PromptCompressor
+
+
+class TextCompressor(Protocol):
+    """Defines a protocol for text compression to optimize agent interactions."""
+
+    def compress_text(self, text: str, **compression_params) -> Dict[str, Any]:
+        """This method takes a string as input and returns a dictionary containing the compressed text and other
+        relevant information. The compressed text should be stored under the 'compressed_prompt' key in the dictionary.
+        To calculate the number of saved tokens, the dictionary should include 'origin_tokens' and 'compressed_tokens' keys.
+        """
+        ...
+
+
+class LLMLingua:
+    """Compresses text messages using LLMLingua for improved efficiency in processing and response generation.
+
+    NOTE: The effectiveness of compression and the resultant token savings can vary based on the content of the messages
+    and the specific configurations used for the PromptCompressor.
+    """
+
+    def __init__(
+        self,
+        prompt_compressor_kwargs: Dict = dict(
+            model_name="microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank",
+            use_llmlingua2=True,
+            device_map="cpu",
+        ),
+        structured_compression: bool = False,
+    ) -> None:
+        """
+        Args:
+            prompt_compressor_kwargs (dict): A dictionary of keyword arguments for the PromptCompressor. Defaults to a
+                dictionary with model_name set to "microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank",
+                use_llmlingua2 set to True, and device_map set to "cpu".
+            structured_compression (bool): A flag indicating whether to use structured compression. If True, the
+                structured_compress_prompt method of the PromptCompressor is used. Otherwise, the compress_prompt method
+                is used. Defaults to False.
+
+        Raises:
+            ImportError: If the llmlingua library is not installed.
+ """ + if IMPORT_ERROR: + raise IMPORT_ERROR + + self._prompt_compressor = PromptCompressor(**prompt_compressor_kwargs) + + assert isinstance(self._prompt_compressor, llmlingua.PromptCompressor) + self._compression_method = ( + self._prompt_compressor.structured_compress_prompt + if structured_compression + else self._prompt_compressor.compress_prompt + ) + + def compress_text(self, text: str, **compression_params) -> Dict[str, Any]: + return self._compression_method([text], **compression_params) diff --git a/autogen/agentchat/contrib/capabilities/transforms.py b/autogen/agentchat/contrib/capabilities/transforms.py index 279faed8c9d..8303843e881 100644 --- a/autogen/agentchat/contrib/capabilities/transforms.py +++ b/autogen/agentchat/contrib/capabilities/transforms.py @@ -1,4 +1,5 @@ import copy +import json import sys from typing import Any, Dict, List, Optional, Protocol, Tuple, Union @@ -6,6 +7,9 @@ from termcolor import colored from autogen import token_count_utils +from autogen.cache import AbstractCache, Cache + +from .text_compressors import LLMLingua, TextCompressor class MessageTransform(Protocol): @@ -156,7 +160,7 @@ def apply_transform(self, messages: List[Dict]) -> List[Dict]: assert self._min_tokens is not None # if the total number of tokens in the messages is less than the min_tokens, return the messages as is - if not self._are_min_tokens_reached(messages): + if not _min_tokens_reached(messages, self._min_tokens): return messages temp_messages = copy.deepcopy(messages) @@ -205,19 +209,6 @@ def get_logs(self, pre_transform_messages: List[Dict], post_transform_messages: return logs_str, True return "No tokens were truncated.", False - def _are_min_tokens_reached(self, messages: List[Dict]) -> bool: - """ - Returns True if no minimum tokens restrictions are applied. - - Either if the total number of tokens in the messages is greater than or equal to the `min_theshold_tokens`, - or no minimum tokens threshold is set. - """ - if not self._min_tokens: - return True - - messages_tokens = sum(_count_tokens(msg["content"]) for msg in messages if "content" in msg) - return messages_tokens >= self._min_tokens - def _truncate_str_to_tokens(self, contents: Union[str, List], n_tokens: int) -> Union[str, List]: if isinstance(contents, str): return self._truncate_tokens(contents, n_tokens) @@ -268,7 +259,7 @@ def _validate_max_tokens(self, max_tokens: Optional[int] = None) -> Optional[int return max_tokens if max_tokens is not None else sys.maxsize - def _validate_min_tokens(self, min_tokens: int, max_tokens: int) -> int: + def _validate_min_tokens(self, min_tokens: Optional[int], max_tokens: Optional[int]) -> int: if min_tokens is None: return 0 if min_tokens < 0: @@ -278,6 +269,154 @@ def _validate_min_tokens(self, min_tokens: int, max_tokens: int) -> int: return min_tokens +class TextMessageCompressor: + """A transform for compressing text messages in a conversation history. + + It uses a specified text compression method to reduce the token count of messages, which can lead to more efficient + processing and response generation by downstream models. + """ + + def __init__( + self, + text_compressor: Optional[TextCompressor] = None, + min_tokens: Optional[int] = None, + compression_params: Dict = dict(), + cache: Optional[AbstractCache] = Cache.disk(), + ): + """ + Args: + text_compressor (TextCompressor or None): An instance of a class that implements the TextCompressor + protocol. If None, it defaults to LLMLingua. 
+            min_tokens (int or None): Minimum number of tokens in messages to apply the transformation. Must be greater
+                than or equal to 0 if not None. If None, no threshold-based compression is applied.
+            compression_params (dict): A dictionary of parameters for the compression method. Defaults to an empty
+                dictionary.
+            cache (None or AbstractCache): The cache client to use to store and retrieve previously compressed messages.
+                If None, no caching will be used.
+        """
+
+        if text_compressor is None:
+            text_compressor = LLMLingua()
+
+        self._validate_min_tokens(min_tokens)
+
+        self._text_compressor = text_compressor
+        self._min_tokens = min_tokens
+        self._compression_args = compression_params
+        self._cache = cache
+
+        # Stores the most recent token savings so that get_logs can report them without recomputing
+        self._recent_tokens_savings = 0
+
+    def apply_transform(self, messages: List[Dict]) -> List[Dict]:
+        """Applies compression to messages in a conversation history based on the specified configuration.
+
+        The function processes each message according to the `compression_params` and `min_tokens` settings, applying
+        the specified compression configuration and returning a new list of messages with reduced token counts
+        where possible.
+
+        Args:
+            messages (List[Dict]): A list of message dictionaries to be compressed.
+
+        Returns:
+            List[Dict]: A list of dictionaries with the message content compressed according to the configured
+                method and scope.
+        """
+        # Make sure there is at least one message
+        if not messages:
+            return messages
+
+        # if the total number of tokens in the messages is less than the min_tokens, return the messages as is
+        if not _min_tokens_reached(messages, self._min_tokens):
+            return messages
+
+        total_savings = 0
+        processed_messages = messages.copy()
+        for message in processed_messages:
+            # Some messages may not have content.
+ if not isinstance(message.get("content"), (str, list)): + continue + + if _is_content_text_empty(message["content"]): + continue + + cached_content = self._cache_get(message["content"]) + if cached_content is not None: + savings, compressed_content = cached_content + else: + savings, compressed_content = self._compress(message["content"]) + + self._cache_set(message["content"], compressed_content, savings) + + message["content"] = compressed_content + total_savings += savings + + self._recent_tokens_savings = total_savings + return processed_messages + + def get_logs(self, pre_transform_messages: List[Dict], post_transform_messages: List[Dict]) -> Tuple[str, bool]: + if self._recent_tokens_savings > 0: + return f"{self._recent_tokens_savings} tokens saved with text compression.", True + else: + return "No tokens saved with text compression.", False + + def _compress(self, content: Union[str, List[Dict]]) -> Tuple[int, Union[str, List[Dict]]]: + """Compresses the given text or multimodal content using the specified compression method.""" + if isinstance(content, str): + return self._compress_text(content) + elif isinstance(content, list): + return self._compress_multimodal(content) + else: + return 0, content + + def _compress_multimodal(self, content: List[Dict]) -> Tuple[int, List[Dict]]: + tokens_saved = 0 + for msg in content: + if "text" in msg: + savings, msg["text"] = self._compress_text(msg["text"]) + tokens_saved += savings + return tokens_saved, content + + def _compress_text(self, text: str) -> Tuple[int, str]: + """Compresses the given text using the specified compression method.""" + compressed_text = self._text_compressor.compress_text(text, **self._compression_args) + + savings = 0 + if "origin_tokens" in compressed_text and "compressed_tokens" in compressed_text: + savings = compressed_text["origin_tokens"] - compressed_text["compressed_tokens"] + + return savings, compressed_text["compressed_prompt"] + + def _cache_get(self, content: Union[str, List[Dict]]) -> Optional[Tuple[int, Union[str, List[Dict]]]]: + if self._cache: + cached_value = self._cache.get(self._cache_key(content)) + if cached_value: + return cached_value + + def _cache_set( + self, content: Union[str, List[Dict]], compressed_content: Union[str, List[Dict]], tokens_saved: int + ): + if self._cache: + value = (tokens_saved, json.dumps(compressed_content)) + self._cache.set(self._cache_key(content), value) + + def _cache_key(self, content: Union[str, List[Dict]]) -> str: + return f"{json.dumps(content)}_{self._min_tokens}" + + def _validate_min_tokens(self, min_tokens: Optional[int]): + if min_tokens is not None and min_tokens <= 0: + raise ValueError("min_tokens must be greater than 0 or None") + + +def _min_tokens_reached(messages: List[Dict], min_tokens: Optional[int]) -> bool: + """Returns True if the total number of tokens in the messages is greater than or equal to the specified value.""" + if not min_tokens: + return True + + messages_tokens = sum(_count_tokens(msg["content"]) for msg in messages if "content" in msg) + return messages_tokens >= min_tokens + + def _count_tokens(content: Union[str, List[Dict[str, Any]]]) -> int: token_count = 0 if isinstance(content, str): @@ -286,3 +425,12 @@ def _count_tokens(content: Union[str, List[Dict[str, Any]]]) -> int: for item in content: token_count += _count_tokens(item.get("text", "")) return token_count + + +def _is_content_text_empty(content: Union[str, List[Dict[str, Any]]]) -> bool: + if isinstance(content, str): + return content == "" + elif 
isinstance(content, list): + return all(_is_content_text_empty(item.get("text", "")) for item in content) + else: + return False diff --git a/setup.py b/setup.py index a5481c90dfb..a4fa4f63aa5 100644 --- a/setup.py +++ b/setup.py @@ -79,6 +79,7 @@ "websockets": ["websockets>=12.0,<13"], "jupyter-executor": jupyter_executor, "types": ["mypy==1.9.0", "pytest>=6.1.1,<8"] + jupyter_executor, + "long-context": ["llmlingua<0.3"], } setuptools.setup( diff --git a/test/agentchat/contrib/capabilities/test_transforms.py b/test/agentchat/contrib/capabilities/test_transforms.py index 6d9441d53e6..c5ffc08f112 100644 --- a/test/agentchat/contrib/capabilities/test_transforms.py +++ b/test/agentchat/contrib/capabilities/test_transforms.py @@ -1,5 +1,6 @@ import copy from typing import Dict, List +from unittest.mock import MagicMock, patch import pytest @@ -118,13 +119,82 @@ def test_message_token_limiter_get_logs(message_token_limiter, messages, expecte assert logs_str == expected_logs +def test_text_compression(): + """Test the TextMessageCompressor transform.""" + try: + from autogen.agentchat.contrib.capabilities.transforms import TextMessageCompressor + + text_compressor = TextMessageCompressor() + except ImportError: + pytest.skip("LLM Lingua is not installed.") + + text = "Run this test with a long string. " + messages = [ + { + "role": "assistant", + "content": [{"type": "text", "text": "".join([text] * 3)}], + }, + { + "role": "assistant", + "content": [{"type": "text", "text": "".join([text] * 3)}], + }, + { + "role": "assistant", + "content": [{"type": "text", "text": "".join([text] * 3)}], + }, + ] + + transformed_messages = text_compressor.apply_transform([{"content": text}]) + + assert len(transformed_messages[0]["content"]) < len(text) + + # Test compressing all messages + text_compressor = TextMessageCompressor() + transformed_messages = text_compressor.apply_transform(copy.deepcopy(messages)) + for message in transformed_messages: + assert len(message["content"][0]["text"]) < len(messages[0]["content"][0]["text"]) + + +def test_text_compression_cache(): + try: + from autogen.agentchat.contrib.capabilities.transforms import TextMessageCompressor + + except ImportError: + pytest.skip("LLM Lingua is not installed.") + + messages = get_long_messages() + mock_compressed_content = (1, {"content": "mock"}) + + with patch( + "autogen.agentchat.contrib.capabilities.transforms.TextMessageCompressor._cache_get", + MagicMock(return_value=(1, {"content": "mock"})), + ) as mocked_get, patch( + "autogen.agentchat.contrib.capabilities.transforms.TextMessageCompressor._cache_set", MagicMock() + ) as mocked_set: + text_compressor = TextMessageCompressor() + + text_compressor.apply_transform(messages) + text_compressor.apply_transform(messages) + + assert mocked_get.call_count == len(messages) + assert mocked_set.call_count == len(messages) + + # We already populated the cache with the mock content + # We need to test if we retrieve the correct content + text_compressor = TextMessageCompressor() + compressed_messages = text_compressor.apply_transform(messages) + + for message in compressed_messages: + assert message["content"] == mock_compressed_content[1] + + if __name__ == "__main__": long_messages = get_long_messages() short_messages = get_short_messages() no_content_messages = get_no_content_messages() - message_history_limiter = MessageHistoryLimiter(max_messages=3) - message_token_limiter = MessageTokenLimiter(max_tokens_per_message=3) - message_token_limiter_with_threshold = 
MessageTokenLimiter(max_tokens_per_message=1, min_tokens=10) + msg_history_limiter = MessageHistoryLimiter(max_messages=3) + msg_token_limiter = MessageTokenLimiter(max_tokens_per_message=3) + msg_token_limiter_with_threshold = MessageTokenLimiter(max_tokens_per_message=1, min_tokens=10) # Test Parameters message_history_limiter_apply_transform_parameters = { @@ -170,14 +240,14 @@ def test_message_token_limiter_get_logs(message_token_limiter, messages, expecte message_history_limiter_apply_transform_parameters["messages"], message_history_limiter_apply_transform_parameters["expected_messages_len"], ): - test_message_history_limiter_apply_transform(message_history_limiter, messages, expected_messages_len) + test_message_history_limiter_apply_transform(msg_history_limiter, messages, expected_messages_len) for messages, expected_logs, expected_effect in zip( message_history_limiter_get_logs_parameters["messages"], message_history_limiter_get_logs_parameters["expected_logs"], message_history_limiter_get_logs_parameters["expected_effect"], ): - test_message_history_limiter_get_logs(message_history_limiter, messages, expected_logs, expected_effect) + test_message_history_limiter_get_logs(msg_history_limiter, messages, expected_logs, expected_effect) # Call the MessageTokenLimiter tests @@ -187,7 +257,7 @@ def test_message_token_limiter_get_logs(message_token_limiter, messages, expecte message_token_limiter_apply_transform_parameters["expected_messages_len"], ): test_message_token_limiter_apply_transform( - message_token_limiter, messages, expected_token_count, expected_messages_len + msg_token_limiter, messages, expected_token_count, expected_messages_len ) for messages, expected_token_count, expected_messages_len in zip( @@ -196,7 +266,7 @@ def test_message_token_limiter_get_logs(message_token_limiter, messages, expecte message_token_limiter_with_threshold_apply_transform_parameters["expected_messages_len"], ): test_message_token_limiter_with_threshold_apply_transform( - message_token_limiter_with_threshold, messages, expected_token_count, expected_messages_len + msg_token_limiter_with_threshold, messages, expected_token_count, expected_messages_len ) for messages, expected_logs, expected_effect in zip( @@ -204,4 +274,4 @@ def test_message_token_limiter_get_logs(message_token_limiter, messages, expecte message_token_limiter_get_logs_parameters["expected_logs"], message_token_limiter_get_logs_parameters["expected_effect"], ): - test_message_token_limiter_get_logs(message_token_limiter, messages, expected_logs, expected_effect) + test_message_token_limiter_get_logs(msg_token_limiter, messages, expected_logs, expected_effect) diff --git a/website/docs/FAQ.mdx b/website/docs/FAQ.mdx index 6baa09768a6..d2a4b3b2a32 100644 --- a/website/docs/FAQ.mdx +++ b/website/docs/FAQ.mdx @@ -267,7 +267,7 @@ Migrating enhances flexibility, modularity, and customization in handling chat m ### How to migrate? -To ensure a smooth migration process, simply follow the detailed guide provided in [Handling Long Context Conversations with Transform Messages](/docs/topics/long_contexts.md). +To ensure a smooth migration process, simply follow the detailed guide provided in [Introduction to TransformMessages](/docs/topics/handling_long_contexts/intro_to_transform_messages.md). ### What should I do if I get the error "TypeError: Assistants.create() got an unexpected keyword argument 'file_ids'"? 
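The migration answer above points to the `TransformMessages` capability; the following is a minimal sketch of the destination API, assuming an `llm_config` is already defined (the limiter parameters are illustrative, not recommendations):

```python
from autogen import ConversableAgent
from autogen.agentchat.contrib.capabilities import transform_messages, transforms

assistant = ConversableAgent("assistant", llm_config=llm_config)

# Compose one or more transforms; they are applied in order to the chat history
context_handling = transform_messages.TransformMessages(
    transforms=[
        transforms.MessageHistoryLimiter(max_messages=10),  # keep only the 10 most recent messages
        transforms.MessageTokenLimiter(max_tokens=1000),  # cap the total token count
    ]
)

# Once added, the transforms run automatically before each LLM call
context_handling.add_to_agent(assistant)
```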
diff --git a/website/docs/installation/Optional-Dependencies.md b/website/docs/installation/Optional-Dependencies.md index 617a90aabae..f0176ba8fdc 100644 --- a/website/docs/installation/Optional-Dependencies.md +++ b/website/docs/installation/Optional-Dependencies.md @@ -85,7 +85,7 @@ To use Teachability, please install AutoGen with the [teachable] option. pip install "pyautogen[teachable]" ``` -Example notebook: [Chatting with a teachable agent](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_teachability.ipynb) +Example notebook: [Chatting with a teachable agent](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_teachability.ipynb) ## Large Multimodal Model (LMM) Agents @@ -115,9 +115,16 @@ Example notebooks: To use a graph in `GroupChat`, particularly for graph visualization, please install AutoGen with the [graph] option. - ```bash pip install "pyautogen[graph]" ``` -Example notebook: [Graph Modeling Language with using select_speaker](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_graph_modelling_language_using_select_speaker.ipynb) +Example notebook: [Graph Modeling Language with using select_speaker](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_graph_modelling_language_using_select_speaker.ipynb) + +## Long Context Handling + +AutoGen includes support for handling long textual contexts by leveraging the LLMLingua library for text compression. To enable this functionality, please install AutoGen with the `[long-context]` option: + +```bash +pip install "pyautogen[long-context]" +``` diff --git a/website/docs/topics/handling_long_contexts/_category_.json b/website/docs/topics/handling_long_contexts/_category_.json new file mode 100644 index 00000000000..04c54583298 --- /dev/null +++ b/website/docs/topics/handling_long_contexts/_category_.json @@ -0,0 +1,4 @@ +{ + "label": "Handling Long Contexts", + "collapsible": true +} diff --git a/website/docs/topics/handling_long_contexts/compressing_text_w_llmligua.md b/website/docs/topics/handling_long_contexts/compressing_text_w_llmligua.md new file mode 100644 index 00000000000..e251786f555 --- /dev/null +++ b/website/docs/topics/handling_long_contexts/compressing_text_w_llmligua.md @@ -0,0 +1,171 @@ +# Compressing Text with LLMLingua + +Text compression is crucial for optimizing interactions with LLMs, especially when dealing with long prompts that can lead to higher costs and slower response times. LLMLingua is a tool designed to compress prompts effectively, enhancing the efficiency and cost-effectiveness of LLM operations. + +This guide introduces LLMLingua's integration with AutoGen, demonstrating how to use this tool to compress text, thereby optimizing the usage of LLMs for various applications. + +:::info Requirements +Install `pyautogen[long-context]` and `PyMuPDF`: + +```bash +pip install "pyautogen[long-context]" PyMuPDF +``` + +For more information, please refer to the [installation guide](/docs/installation/). +::: + +## Example 1: Compressing AutoGen Research Paper using LLMLingua + +We will look at how we can use `TextMessageCompressor` to compress an AutoGen research paper using `LLMLingua`. Here's how you can initialize `TextMessageCompressor` with LLMLingua, a text compressor that adheres to the `TextCompressor` protocol. 
+ +```python +import tempfile + +import fitz # PyMuPDF +import requests + +from autogen.agentchat.contrib.capabilities.text_compressors import LLMLingua +from autogen.agentchat.contrib.capabilities.transforms import TextMessageCompressor + +AUTOGEN_PAPER = "https://arxiv.org/pdf/2308.08155" + + +def extract_text_from_pdf(): + # Download the PDF + response = requests.get(AUTOGEN_PAPER) + response.raise_for_status() # Ensure the download was successful + + text = "" + # Save the PDF to a temporary file + with tempfile.TemporaryDirectory() as temp_dir: + with open(temp_dir + "temp.pdf", "wb") as f: + f.write(response.content) + + # Open the PDF + with fitz.open(temp_dir + "temp.pdf") as doc: + # Read and extract text from each page + for page in doc: + text += page.get_text() + + return text + + +# Example usage +pdf_text = extract_text_from_pdf() + +llm_lingua = LLMLingua() +text_compressor = TextMessageCompressor(text_compressor=llm_lingua) +compressed_text = text_compressor.apply_transform([{"content": pdf_text}]) + +print(text_compressor.get_logs([], [])) +``` + +```console +('19765 tokens saved with text compression.', True) +``` + +## Example 2: Integrating LLMLingua with `ConversableAgent` + +Now, let's integrate `LLMLingua` into a conversational agent within AutoGen. This allows dynamic compression of prompts before they are sent to the LLM. + +```python +import os + +import autogen +from autogen.agentchat.contrib.capabilities import transform_messages + +system_message = "You are a world class researcher." +config_list = [{"model": "gpt-4-turbo", "api_key": os.getenv("OPENAI_API_KEY")}] + +# Define your agent; the user proxy and an assistant +researcher = autogen.ConversableAgent( + "assistant", + llm_config={"config_list": config_list}, + max_consecutive_auto_reply=1, + system_message=system_message, + human_input_mode="NEVER", +) +user_proxy = autogen.UserProxyAgent( + "user_proxy", + human_input_mode="NEVER", + is_termination_msg=lambda x: "TERMINATE" in x.get("content", ""), + max_consecutive_auto_reply=1, +) +``` + +:::tip +Learn more about configuring LLMs for agents [here](/docs/topics/llm_configuration). +::: + +```python +context_handling = transform_messages.TransformMessages(transforms=[text_compressor]) +context_handling.add_to_agent(researcher) + +message = "Summarize this research paper for me, include the important information" + pdf_text +result = user_proxy.initiate_chat(recipient=researcher, clear_history=True, message=message, silent=True) + +print(result.chat_history[1]["content"]) +``` + +```console +19953 tokens saved with text compression. +The paper describes AutoGen, a framework designed to facilitate the development of diverse large language model (LLM) applications through conversational multi-agent systems. The framework emphasizes customization and flexibility, enabling developers to define agent interaction behaviors in natural language or computer code. + +Key components of AutoGen include: +1. **Conversable Agents**: These are customizable agents designed to operate autonomously or through human interaction. They are capable of initiating, maintaining, and responding within conversations, contributing effectively to multi-agent dialogues. + +2. **Conversation Programming**: AutoGen introduces a programming paradigm centered around conversational interactions among agents. 
This approach simplifies the development of complex applications by streamlining how agents communicate and interact, focusing on conversational logic rather than traditional coding formats.
+
+3. **Agent Customization and Flexibility**: Developers have the freedom to define the capabilities and behaviors of agents within the system, allowing for a wide range of applications across different domains.
+
+4. **Application Versatility**: The paper outlines various use cases from mathematics and coding to decision-making and entertainment, demonstrating AutoGen's ability to cope with a broad spectrum of complexities and requirements.
+
+5. **Hierarchical and Joint Chat Capabilities**: The system supports complex conversation patterns including hierarchical and multi-agent interactions, facilitating robust dialogues that can dynamically adjust based on the conversation context and the agents' roles.
+
+6. **Open-source and Community Engagement**: AutoGen is presented as an open-source framework, inviting contributions and adaptations from the global development community to expand its capabilities and applications.
+
+The framework's architecture is designed so that it can be seamlessly integrated into existing systems, providing a robust foundation for developing sophisticated multi-agent applications that leverage the capabilities of modern LLMs. The paper also discusses potential ethical considerations and future improvements, highlighting the importance of continual development in response to evolving tech landscapes and user needs.
+```
+
+## Example 3: Modifying LLMLingua's Compression Parameters
+
+LLMLingua's flexibility allows for various configurations, such as customizing instructions for the LLM or setting specific token counts for compression. This example demonstrates how to set a target token count, enabling the use of models with smaller context sizes like gpt-3.5.
+
+```python
+config_list = [{"model": "gpt-3.5-turbo", "api_key": os.getenv("OPENAI_API_KEY")}]
+researcher = autogen.ConversableAgent(
+    "assistant",
+    llm_config={"config_list": config_list},
+    max_consecutive_auto_reply=1,
+    system_message=system_message,
+    human_input_mode="NEVER",
+)
+
+text_compressor = TextMessageCompressor(
+    text_compressor=llm_lingua,
+    compression_params={"target_token": 13000},
+    cache=None,
+)
+context_handling = transform_messages.TransformMessages(transforms=[text_compressor])
+context_handling.add_to_agent(researcher)
+
+compressed_text = text_compressor.apply_transform([{"content": message}])
+
+result = user_proxy.initiate_chat(recipient=researcher, clear_history=True, message=message, silent=True)
+
+print(result.chat_history[1]["content"])
+```
+
+```console
+25308 tokens saved with text compression.
+Based on the extensive research paper information provided, it seems that the focus is on developing a framework called AutoGen for creating multi-agent conversations based on Large Language Models (LLMs) for a variety of applications such as math problem solving, coding, decision-making, and more.
+
+The paper discusses the importance of incorporating diverse roles of LLMs, human inputs, and tools to enhance the capabilities of the conversable agents within the AutoGen framework. It also delves into the effectiveness of different systems in various scenarios, showcases the implementation of AutoGen in pilot studies, and compares its performance with other systems in tasks like math problem-solving, coding, and decision-making.
+ +The paper also highlights the different features and components of AutoGen such as the AssistantAgent, UserProxyAgent, ExecutorAgent, and GroupChatManager, emphasizing its flexibility, ease of use, and modularity in managing multi-agent interactions. It presents case analyses to demonstrate the effectiveness of AutoGen in various applications and scenarios. + +Furthermore, the paper includes manual evaluations, scenario testing, code examples, and detailed comparisons with other systems like ChatGPT, OptiGuide, MetaGPT, and more, to showcase the performance and capabilities of the AutoGen framework. + +Overall, the research paper showcases the potential of AutoGen in facilitating dynamic multi-agent conversations, enhancing decision-making processes, and improving problem-solving tasks with the integration of LLMs, human inputs, and tools in a collaborative framework. +``` diff --git a/website/docs/topics/long_contexts.md b/website/docs/topics/handling_long_contexts/intro_to_transform_messages.md similarity index 98% rename from website/docs/topics/long_contexts.md rename to website/docs/topics/handling_long_contexts/intro_to_transform_messages.md index 51648c5c549..d0a53702c48 100644 --- a/website/docs/topics/long_contexts.md +++ b/website/docs/topics/handling_long_contexts/intro_to_transform_messages.md @@ -1,4 +1,4 @@ -# Handling Long Context Conversations with Transform Messages +# Introduction to Transform Messages Why do we need to handle long contexts? The problem arises from several constraints and requirements: @@ -14,6 +14,7 @@ The `TransformMessages` capability is designed to modify incoming messages befor :::info Requirements Install `pyautogen`: + ```bash pip install pyautogen ``` @@ -99,9 +100,9 @@ pprint.pprint(processed_short_messages) ```console [{'content': 'hello there, how are you?', 'role': 'user'}, {'content': [{'text': 'hello', 'type': 'text'}], 'role': 'assistant'}] - ``` +``` - We can see that no transformation was applied, because the threshold of 10 total tokens was not reached. +We can see that no transformation was applied, because the threshold of 10 total tokens was not reached. ### Apply Transformations Using Agents @@ -318,7 +319,7 @@ result = user_proxy.initiate_chat( ``` -````console +```console user_proxy (to assistant): What are the two API keys that I just provided @@ -340,4 +341,4 @@ user_proxy (to assistant): -------------------------------------------------------------------------------- Redacted 2 OpenAI API keys. 
-````
+```

From 498aa7f367e420df26a63dea27713372e475108b Mon Sep 17 00:00:00 2001
From: zbram101
Date: Mon, 6 May 2024 09:12:00 -0700
Subject: [PATCH 4/6] =?UTF-8?q?notebook=20showing=20assistant=20agents=20c?=
 =?UTF-8?q?onnecting=20azure=20ai=20search=20and=20azur=E2=80=A6=20(#2594)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* notebook showing assistant agents connecting azure ai search and azure identity

* formatting fix for notebook azr_ai_search

---------

Co-authored-by: Bharadwaj Ramachandran
---
 notebook/agentchat_azr_ai_search.ipynb | 413 +++++++++++++++++++++++++
 1 file changed, 413 insertions(+)
 create mode 100644 notebook/agentchat_azr_ai_search.ipynb

diff --git a/notebook/agentchat_azr_ai_search.ipynb b/notebook/agentchat_azr_ai_search.ipynb
new file mode 100644
index 00000000000..f4521f60d27
--- /dev/null
+++ b/notebook/agentchat_azr_ai_search.ipynb
@@ -0,0 +1,413 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Assistants with Azure Cognitive Search and Azure Identity\n",
+    "\n",
+    "This notebook demonstrates the use of Assistant Agents in conjunction with Azure Cognitive Search and Azure Identity. Assistant Agents use tools that interact with Azure Cognitive Search to extract pertinent data.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Prerequisites\n",
+    "\n",
+    "Before running this notebook, please ensure the following prerequisites are met:\n",
+    "\n",
+    "### Dependencies\n",
+    "1. **Autogen**\n",
+    "2. **Azure SDK**\n",
+    "3. **Cognitive Search**/**AI Search**\n",
+    "\n",
+    "If you have AI Search enabled in your Azure Portal, you can use the following code to create an assistant agent that can search Azure Cognitive Search.\n",
+    "\n",
+    "**AI Search setup details:**\n",
+    "- Documentation:\n",
+    "  - Create search service: https://learn.microsoft.com/en-us/azure/search/search-create-service-portal\n",
+    "  - Search index: https://learn.microsoft.com/en-us/azure/search/search-how-to-create-search-index?tabs=portal\n",
+    "  - Hybrid search: https://learn.microsoft.com/en-us/azure/search/hybrid-search-how-to-query\n",
+    "\n",
+    "- YouTube walkthrough: https://www.youtube.com/watch?v=6Zfuw-UJZ7k\n",
+    "\n",
+    "\n",
+    "### Install Azure CLI\n",
+    "This notebook requires the Azure CLI for authentication purposes. Follow these steps to install and configure it:\n",
+    "\n",
+    "1. **Download and Install Azure CLI**:\n",
+    "   - Visit the [Azure CLI installation page](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli) and follow the instructions for your operating system.\n",
+    "   - Mac users can install Azure CLI using Homebrew with the command `brew install azure-cli`\n",
+    "\n",
+    "2. **Verify Installation**:\n",
+    "   - In the cell below, execute `az --version` to check if Azure CLI is installed correctly.\n",
+    "\n",
+    "3. **Login to Azure**:\n",
+    "   - In the cell below, execute `az login` to log into your Azure account. This step is necessary as the notebook uses `AzureCliCredential`, which retrieves the token based on the Azure account currently logged in.\n",
+    "\n",
+    "### Check Azure CLI Installation\n",
+    "Run the cell below to check if Azure CLI is installed and properly configured on your system."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Check Azure CLI Installation and Login Status" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Check Azure CLI installation and login status\n", + "# !az --version\n", + "# !az login" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Install required packages\n", + "Run the cell below to install the required packages for this notebook.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!pip3 install pyautogen==0.2.16\n", + "!pip3 install python-dotenv==1.0.1\n", + "!pip3 install pyautogen[graph]>=0.2.11\n", + "!pip3 install azure-search-documents==11.4.0b8\n", + "!pip3 install azure-identity==1.12.0" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next you will import the required packages for this notebook.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import json\n", + "import os\n", + "\n", + "import requests\n", + "from azure.identity import DefaultAzureCredential\n", + "from azure.search.documents import SearchClient\n", + "from dotenv import load_dotenv\n", + "\n", + "import autogen\n", + "from autogen import AssistantAgent, UserProxyAgent, register_function\n", + "from autogen.cache import Cache\n", + "\n", + "load_dotenv()\n", + "\n", + "# Import Cognitive Search index ENV\n", + "AZURE_SEARCH_SERVICE = os.getenv(\"AZURE_SEARCH_SERVICE\")\n", + "AZURE_SEARCH_INDEX = os.getenv(\"AZURE_SEARCH_INDEX\")\n", + "AZURE_SEARCH_KEY = os.getenv(\"AZURE_SEARCH_KEY\")\n", + "AZURE_SEARCH_API_VERSION = os.getenv(\"AZURE_SEARCH_API_VERSION\")\n", + "AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG = os.getenv(\"AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG\")\n", + "AZURE_SEARCH_SERVICE_ENDPOINT = os.getenv(\"AZURE_SEARCH_SERVICE_ENDPOINT\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, you need to authenticate and create a `SearchClient` instance." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "credential = DefaultAzureCredential()\n", + "endpoint = AZURE_SEARCH_SERVICE_ENDPOINT\n", + "\n", + "from azure.identity import AzureCliCredential\n", + "\n", + "credential = AzureCliCredential()\n", + "token = credential.get_token(\"https://cognitiveservices.azure.com/.default\")\n", + "\n", + "print(\"TOKEN\", token.token)\n", + "\n", + "client = SearchClient(endpoint=endpoint, index_name=\"test-index\", credential=credential)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "Then, load the configuration list and define the configuration for the `AssistantAgent`." + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [], + "source": [ + "config_list = autogen.config_list_from_json(\n", + " env_or_file=\"OAI_CONFIG_LIST\",\n", + ")\n", + "\n", + "gpt4_config = {\n", + " \"cache_seed\": 42,\n", + " \"temperature\": 0,\n", + " \"config_list\": config_list,\n", + " \"timeout\": 120,\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "Define your tool function `search` that will interact with the Azure Cognitive Search service." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [], + "source": [ + "def search(query: str):\n", + " payload = json.dumps(\n", + " {\n", + " \"search\": query,\n", + " \"vectorQueries\": [{\"kind\": \"text\", \"text\": query, \"k\": 5, \"fields\": \"vector\"}],\n", + " \"queryType\": \"semantic\",\n", + " \"semanticConfiguration\": AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG,\n", + " \"captions\": \"extractive\",\n", + " \"answers\": \"extractive|count-3\",\n", + " \"queryLanguage\": \"en-US\",\n", + " }\n", + " )\n", + "\n", + " response = list(client.search(payload))\n", + "\n", + " output = []\n", + " for result in response:\n", + " result.pop(\"titleVector\")\n", + " result.pop(\"contentVector\")\n", + " output.append(result)\n", + "\n", + " return output" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "Define the `AssistantAgent` and `UserProxyAgent` instances, and register the `search` function to them." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "cog_search = AssistantAgent(\n", + " name=\"COGSearch\",\n", + " system_message=\"You are a helpful AI assistant. \"\n", + " \"You can help with Azure Cognitive Search.\"\n", + " \"Return 'TERMINATE' when the task is done.\",\n", + " llm_config=gpt4_config,\n", + ")\n", + "\n", + "user_proxy = UserProxyAgent(\n", + " name=\"User\",\n", + " llm_config=False,\n", + " is_termination_msg=lambda msg: msg.get(\"content\") is not None and \"TERMINATE\" in msg[\"content\"],\n", + " human_input_mode=\"NEVER\",\n", + ")\n", + "\n", + "register_function(\n", + " search,\n", + " caller=cog_search,\n", + " executor=user_proxy,\n", + " name=\"search\",\n", + " description=\"A tool for searching the Cognitive Search index\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, initiate a chat." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33mUser\u001b[0m (to COGSearch):\n", + "\n", + "Search for 'What is Azure?' in the 'test-index' index\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[33mCOGSearch\u001b[0m (to User):\n", + "\n", + "\u001b[32m***** Suggested tool Call (call_6Db6DFPNEp7J7Dz5dkAbbjDY): search *****\u001b[0m\n", + "Arguments: \n", + "{\"query\":\"What is Azure?\"}\n", + "\u001b[32m***********************************************************************\u001b[0m\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[35m\n", + ">>>>>>>> EXECUTING ASYNC FUNCTION search...\u001b[0m\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33mUser\u001b[0m (to COGSearch):\n", + "\n", + "\u001b[33mUser\u001b[0m (to COGSearch):\n", + "\n", + "\u001b[32m***** Response from calling tool \"call_6Db6DFPNEp7J7Dz5dkAbbjDY\" *****\u001b[0m\n", + "[{\"id\": \"40\", \"title\": \"Azure Cognitive Search\", \"category\": \"AI + Machine Learning\", \"content\": \"Azure Cognitive Search is a fully managed search-as-a-service that enables you to build rich search experiences for your applications. It provides features like full-text search, faceted navigation, and filters. Azure Cognitive Search supports various data sources, such as Azure SQL Database, Azure Blob Storage, and Azure Cosmos DB. 
You can use Azure Cognitive Search to index your data, create custom scoring profiles, and integrate with other Azure services. It also integrates with other Azure services, such as Azure Cognitive Services and Azure Machine Learning.\", \"@search.score\": 9.1308, \"@search.reranker_score\": null, \"@search.highlights\": null, \"@search.captions\": null}, {\"id\": \"90\", \"title\": \"Azure Cognitive Services\", \"category\": \"AI + Machine Learning\", \"content\": \"Azure Cognitive Services is a collection of AI services and APIs that enable you to build intelligent applications using pre-built models and algorithms. It provides features like computer vision, speech recognition, and natural language processing. Cognitive Services supports various platforms, such as .NET, Java, Node.js, and Python. You can use Azure Cognitive Services to build chatbots, analyze images and videos, and process and understand text. It also integrates with other Azure services, such as Azure Machine Learning and Azure Cognitive Search.\", \"@search.score\": 5.9858904, \"@search.reranker_score\": null, \"@search.highlights\": null, \"@search.captions\": null}, {\"id\": \"68\", \"title\": \"Azure Database for MariaDB\", \"category\": \"Databases\", \"content\": \"Azure Database for MariaDB is a fully managed, scalable, and secure relational database service that enables you to build and manage MariaDB applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MariaDB supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for MariaDB to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.\", \"@search.score\": 3.9424267, \"@search.reranker_score\": null, \"@search.highlights\": null, \"@search.captions\": null}, {\"id\": \"69\", \"title\": \"Azure SQL Managed Instance\", \"category\": \"Databases\", \"content\": \"Azure SQL Managed Instance is a fully managed, scalable, and secure SQL Server instance hosted in Azure. It provides features like automatic backups, monitoring, and high availability. SQL Managed Instance supports various data types, such as JSON, spatial, and full-text. You can use Azure SQL Managed Instance to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.\", \"@search.score\": 3.2041788, \"@search.reranker_score\": null, \"@search.highlights\": null, \"@search.captions\": null}, {\"id\": \"66\", \"title\": \"Azure Database for MySQL\", \"category\": \"Databases\", \"content\": \"Azure Database for MySQL is a fully managed, scalable, and secure relational database service that enables you to build and manage MySQL applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MySQL supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for MySQL to migrate your existing applications, build new applications, and ensure the performance and security of your data. 
It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.\", \"@search.score\": 3.1852448, \"@search.reranker_score\": null, \"@search.highlights\": null, \"@search.captions\": null}, {\"id\": \"67\", \"title\": \"Azure Database for PostgreSQL\", \"category\": \"Databases\", \"content\": \"Azure Database for PostgreSQL is a fully managed, scalable, and secure relational database service that enables you to build and manage PostgreSQL applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for PostgreSQL supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for PostgreSQL to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.\", \"@search.score\": 2.8028796, \"@search.reranker_score\": null, \"@search.highlights\": null, \"@search.captions\": null}, {\"id\": \"3\", \"title\": \"Azure Cognitive Services\", \"category\": \"AI + Machine Learning\", \"content\": \"Azure Cognitive Services are a set of AI services that enable you to build intelligent applications with powerful algorithms using just a few lines of code. These services cover a wide range of capabilities, including vision, speech, language, knowledge, and search. They are designed to be easy to use and integrate into your applications. Cognitive Services are fully managed, scalable, and continuously improved by Microsoft. It allows developers to create AI-powered solutions without deep expertise in machine learning.\", \"@search.score\": 1.9905571, \"@search.reranker_score\": null, \"@search.highlights\": null, \"@search.captions\": null}]\n", + "\u001b[32m**********************************************************************\u001b[0m\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[33mCOGSearch\u001b[0m (to User):\n", + "\n", + "Here are the search results for \"What is Azure?\" from the index:\n", + "\n", + "1. **Azure Cognitive Search**\n", + " - Category: AI + Machine Learning\n", + " - Content: Azure Cognitive Search is a fully managed search-as-a-service that enables you to build rich search experiences for your applications. It provides features like full-text search, faceted navigation, and filters. Azure Cognitive Search supports various data sources, such as Azure SQL Database, Azure Blob Storage, and Azure Cosmos DB. You can use Azure Cognitive Search to index your data, create custom scoring profiles, and integrate with other Azure services. It also integrates with Azure Cognitive Services and Azure Machine Learning.\n", + " - Search Score: 9.1308\n", + "\n", + "2. **Azure Cognitive Services**\n", + " - Category: AI + Machine Learning\n", + " - Content: Azure Cognitive Services is a collection of AI services and APIs that enable you to build intelligent applications using pre-built models and algorithms. It provides features like computer vision, speech recognition, and natural language processing. Cognitive Services supports various platforms, such as .NET, Java, Node.js, and Python. You can use Azure Cognitive Services to build chatbots, analyze images and videos, and process and understand text. It also integrates with other Azure services, such as Azure Machine Learning and Azure Cognitive Search.\n", + " - Search Score: 5.9858904\n", + "\n", + "3. 
**Azure Database for MariaDB**\n", + " - Category: Databases\n", + " - Content: Azure Database for MariaDB is a fully managed, scalable, and secure relational database service that enables you to build and manage MariaDB applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MariaDB supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for MariaDB to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.\n", + " - Search Score: 3.9424267\n", + "\n", + "4. **Azure SQL Managed Instance**\n", + " - Category: Databases\n", + " - Content: Azure SQL Managed Instance is a fully managed, scalable, and secure SQL Server instance hosted in Azure. It provides features like automatic backups, monitoring, and high availability. SQL Managed Instance supports various data types, such as JSON, spatial, and full-text. You can use Azure SQL Managed Instance to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.\n", + " - Search Score: 3.2041788\n", + "\n", + "5. **Azure Database for MySQL**\n", + " - Category: Databases\n", + " - Content: Azure Database for MySQL is a fully managed, scalable, and secure relational database service that enables you to build and manage MySQL applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for MySQL supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for MySQL to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.\n", + " - Search Score: 3.1852448\n", + "\n", + "6. **Azure Database for PostgreSQL**\n", + " - Category: Databases\n", + " - Content: Azure Database for PostgreSQL is a fully managed, scalable, and secure relational database service that enables you to build and manage PostgreSQL applications in Azure. It provides features like automatic backups, monitoring, and high availability. Database for PostgreSQL supports various data types, such as JSON, spatial, and full-text. You can use Azure Database for PostgreSQL to migrate your existing applications, build new applications, and ensure the performance and security of your data. It also integrates with other Azure services, such as Azure App Service and Azure Data Factory.\n", + " - Search Score: 2.8028796\n", + "\n", + "7. **Azure Cognitive Services**\n", + " - Category: AI + Machine Learning\n", + " - Content: Azure Cognitive Services are a set of AI services that enable you to build intelligent applications with powerful algorithms using just a few lines of code. These services cover a wide range of capabilities, including vision, speech, language, knowledge, and search. They are designed to be easy to use and integrate into your applications. Cognitive Services are fully managed, scalable, and continuously improved by Microsoft. 
It allows developers to create AI-powered solutions without deep expertise in machine learning.\n", + " - Search Score: 1.9905571\n", + "\n", + "The search scores indicate the relevance of each result to the query \"What is Azure?\" with higher scores representing greater relevance. The top result provides a detailed explanation of Azure Cognitive Search, which is a part of the Azure platform.\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[33mUser\u001b[0m (to COGSearch):\n", + "\n", + "\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[33mCOGSearch\u001b[0m (to User):\n", + "\n", + "TERMINATE\n", + "\n", + "--------------------------------------------------------------------------------\n" + ] + } + ], + "source": [ + "if __name__ == \"__main__\":\n", + " import asyncio\n", + "\n", + " async def main():\n", + " with Cache.disk() as cache:\n", + " await user_proxy.a_initiate_chat(\n", + " cog_search,\n", + " message=\"Search for 'What is Azure?' in the 'test-index' index\",\n", + " cache=cache,\n", + " )\n", + "\n", + " await main()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "front_matter": { + "description": "This notebook demonstrates the use of Assistant Agents in conjunction with Azure Cognitive Search and Azure Identity", + "tags": [ + "RAG", + "Azure Identity", + "Azure AI Search" + ] + }, + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.3" + }, + "skip_test": "This requires Azure AI Search to be enabled and creds for AI Search from Azure Portal" + }, + "nbformat": 4, + "nbformat_minor": 2 +} From 9418b179c2511cbdd42b4bc65e6e27d19eed1271 Mon Sep 17 00:00:00 2001 From: r48Bit <81687400+r4881t@users.noreply.github.com> Date: Mon, 6 May 2024 21:48:01 +0530 Subject: [PATCH 5/6] Update to correct pip install for litellm (#2602) The doc mentions `pip install litellm[proxy]` which won't work. The correct command is `pip install 'litellm[proxy]'`. --- website/docs/topics/non-openai-models/local-litellm-ollama.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/topics/non-openai-models/local-litellm-ollama.md b/website/docs/topics/non-openai-models/local-litellm-ollama.md index 98b326acdf4..e9c4b6ba345 100644 --- a/website/docs/topics/non-openai-models/local-litellm-ollama.md +++ b/website/docs/topics/non-openai-models/local-litellm-ollama.md @@ -18,7 +18,7 @@ Note: We recommend using a virtual environment for your stack, see [this article Install LiteLLM with the proxy server functionality: ```bash -pip install litellm[proxy] +pip install 'litellm[proxy]' ``` Note: If using Windows, run LiteLLM and Ollama within a [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install). 
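Once the proxy is running, it exposes an OpenAI-compatible endpoint (port 4000 by default), so an AutoGen agent only needs its `base_url` pointed at it. A minimal sketch, assuming LiteLLM was started with a local Ollama model; the placeholder values follow the convention used in this guide:

```python
from autogen import ConversableAgent

local_llm_config = {
    "config_list": [
        {
            "model": "NotRequired",  # LiteLLM routes to the model it was started with
            "api_key": "NotRequired",  # a local proxy does not check for a real key
            "base_url": "http://0.0.0.0:4000",  # default LiteLLM proxy address
        }
    ],
    "cache_seed": None,  # turn off caching while experimenting locally
}

agent = ConversableAgent("local_agent", llm_config=local_llm_config)
```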
From da52c0246ccb8464624308559aacc55a579bb19f Mon Sep 17 00:00:00 2001
From: Ikko Eltociear Ashimine
Date: Tue, 7 May 2024 01:26:49 +0900
Subject: [PATCH 6/6] docs: update tutorial.ipynb (#2606)

Creat -> Create
---
 samples/apps/autogen-studio/notebooks/tutorial.ipynb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/samples/apps/autogen-studio/notebooks/tutorial.ipynb b/samples/apps/autogen-studio/notebooks/tutorial.ipynb
index da758593757..7e80f17b7b5 100644
--- a/samples/apps/autogen-studio/notebooks/tutorial.ipynb
+++ b/samples/apps/autogen-studio/notebooks/tutorial.ipynb
@@ -52,7 +52,7 @@
     "# load an agent specification in JSON\n",
     "agent_spec = json.load(open(\"agent_spec.json\"))\n",
     "\n",
-    "# Creat a An AutoGen Workflow Configuration from the agent specification\n",
+    "# Create an AutoGen Workflow Configuration from the agent specification\n",
     "agent_work_flow_config = AgentWorkFlowConfig(**agent_spec)\n",
     "\n",
     "agent_work_flow = AutoGenWorkFlowManager(agent_work_flow_config)\n",
@@ -308,7 +308,7 @@
     "# load an agent specification in JSON\n",
     "agent_spec = json.load(open(\"groupchat_spec.json\"))\n",
     "\n",
-    "# Creat a An AutoGen Workflow Configuration from the agent specification\n",
+    "# Create an AutoGen Workflow Configuration from the agent specification\n",
     "agent_work_flow_config = AgentWorkFlowConfig(**agent_spec)\n",
    "\n",
     "# Create a Workflow from the configuration\n",
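For context, the comment being fixed sits in a tutorial cell that loads a JSON agent specification and then drives a task through the resulting workflow. A sketch of the full flow, assuming a valid `agent_spec.json` on disk and the `run` method used elsewhere in the same tutorial:

```python
import json

from autogenstudio import AgentWorkFlowConfig, AutoGenWorkFlowManager

# load an agent specification in JSON
agent_spec = json.load(open("agent_spec.json"))

# Create an AutoGen Workflow Configuration from the agent specification
agent_work_flow_config = AgentWorkFlowConfig(**agent_spec)

# Create a workflow manager and run a task through it
agent_work_flow = AutoGenWorkFlowManager(agent_work_flow_config)
agent_work_flow.run(message="What is the height of the Eiffel Tower?")
```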