add EcoAssistant blog (microsoft#647)
* add EcoAssistant blog

* add blog

* Rename blog.md to blog.mdx

* Rename blog.mdx to index.mdx

* Update index.mdx

* Update index.mdx
JieyuZ2 authored Nov 13, 2023
1 parent dc09386 commit 7a739c4
Showing 7 changed files with 108 additions and 0 deletions.
The five added image files (system.png, chat.png, template.png, template-demo.png, results.png) under website/blog/2023-11-09-EcoAssistant/img/ are binary and cannot be displayed in the diff view.
102 changes: 102 additions & 0 deletions website/blog/2023-11-09-EcoAssistant/index.mdx
@@ -0,0 +1,102 @@
---
title: EcoAssistant - Using LLM Assistants More Accurately and Affordably
authors: jieyuz2
tags: [LLM, RAG, cost-effectiveness]
---

![system](img/system.png)

**TL;DR:**
* Introducing **EcoAssistant**, a system designed to solve user queries more accurately and affordably.
* We show how to let the LLM assistant agent leverage external APIs to solve user queries.
* We show how to reduce the cost of using GPT models via **Assistant Hierarchy**.
* We show how to leverage the idea of Retrieval-Augmented Generation (RAG) to improve the success rate via **Solution Demonstration**.


## EcoAssistant

In this blog, we introduce the **EcoAssistant**, a system built upon AutoGen with the goal of solving user queries more accurately and affordably.

### Problem setup

Recently, users have been relying on conversational LLMs such as ChatGPT for various queries.
Reports indicate that 23% of ChatGPT user queries are for knowledge extraction purposes.
Many of these queries require knowledge that is external to the information stored within any pre-trained large language model (LLM).
Such queries can only be answered by generating code that fetches the necessary information via external APIs.
In the table below, we show the three types of user queries that we aim to address in this work.

| Dataset | API | Example query |
|-------------|----------|----------|
| Places| [Google Places](https://developers.google.com/maps/documentation/places/web-service/overview) | I’m looking for a 24-hour pharmacy in Montreal, can you find one for me? |
| Weather | [Weather API](https://www.weatherapi.com) | What is the current cloud coverage in Mumbai, India? |
| Stock | [Alpha Vantage Stock API](https://www.alphavantage.co/documentation/) | Can you give me the opening price of Microsoft for the month of January 2023? |
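
For instance, to answer the Weather query above, the assistant would be expected to generate code along these lines. This is an illustrative sketch: the endpoint and response field follow the WeatherAPI documentation but should be double-checked against it.

```python
import requests

# Hypothetical generated code for "What is the current cloud coverage in Mumbai, India?"
resp = requests.get(
    "https://api.weatherapi.com/v1/current.json",
    params={"key": "YOUR_WEATHER_API_KEY", "q": "Mumbai, India"},
)
resp.raise_for_status()
print("Cloud coverage:", resp.json()["current"]["cloud"], "%")
```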


### Leveraging external APIs

To address these queries, we first build a **two-agent system** based on AutoGen,
where the first agent is an **LLM assistant agent** (`AssistantAgent` in AutoGen) responsible for proposing and refining the code, and
the second agent is a **code executor agent** (`UserProxyAgent` in AutoGen) that extracts the generated code, executes it, and forwards the output back to the LLM assistant agent.
A visualization of the two-agent system is shown below.

![chat](img/chat.png)
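
The following is a minimal sketch of this two-agent system using AutoGen's Python API; the configuration values are illustrative and assume an OpenAI API key is available in the environment.

```python
import autogen

# Illustrative LLM configuration; assumes OPENAI_API_KEY is set in the environment.
llm_config = {"config_list": [{"model": "gpt-4"}]}

# The LLM assistant agent proposes and refines code.
assistant = autogen.AssistantAgent(name="assistant", llm_config=llm_config)

# The code executor agent extracts code blocks from the assistant's replies,
# runs them locally, and sends the output back.
user_proxy = autogen.UserProxyAgent(
    name="code_executor",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "coding", "use_docker": False},
)

user_proxy.initiate_chat(
    assistant,
    message="What is the current cloud coverage in Mumbai, India?",
)
```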

To instruct the assistant agent to leverage external APIs, we only need to add the API name/key dictionary at the beginning of the initial message.
The template is shown below, where the red part contains the API information and the black part is the user query.

![template](img/template.png)

Importantly, we don't want to reveal the real API key to the assistant agent for security reasons.
Therefore, we use a **fake API key** in place of the real one in the initial message.
In particular, we generate a random token (e.g., `181dbb37`) for each API key and substitute it for the real key in the initial message.
Then, when the code executor runs the code, the fake API key is automatically replaced by the real one.
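
A minimal sketch of this substitution scheme is shown below; the function names are hypothetical and do not correspond to the actual EcoAssistant codebase.

```python
import secrets

# Real keys stay on the executor side and are never shown to the assistant.
REAL_KEYS = {"WEATHER_API_KEY": "<your real key>"}

# One random token per key is shown to the assistant instead (e.g., '181dbb37').
FAKE_KEYS = {name: secrets.token_hex(4) for name in REAL_KEYS}

def build_initial_message(query: str) -> str:
    """Prepend the API name/fake-key dictionary to the user query."""
    api_info = "\n".join(f"{name}: {fake}" for name, fake in FAKE_KEYS.items())
    return f"API keys:\n{api_info}\n\nQuery: {query}"

def substitute_keys(code: str) -> str:
    """Swap each fake token back for the real key right before execution."""
    for name, fake in FAKE_KEYS.items():
        code = code.replace(fake, REAL_KEYS[name])
    return code
```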


### Solution Demonstration
In most practical scenarios, user queries arrive sequentially over time.
Our **EcoAssistant** leverages past successes to help the LLM assistants address future queries via **Solution Demonstration**.
Specifically, whenever a query is deemed successfully resolved based on user feedback, we capture and store the query together with the final generated code snippet.
These query-code pairs are saved in a specialized vector database. When a new query arrives, **EcoAssistant** retrieves the most similar query from the database and appends it, along with the associated code, to the initial prompt for the new query, serving as a demonstration.
The new template of the initial message is shown below, where the blue part corresponds to the solution demonstration.

![template](img/template-demo.png)

We found that this use of past successful query-code pairs resolves queries in fewer iterations and enhances the system's performance.
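
A minimal sketch of this mechanism is shown below, assuming a generic sentence-embedding function; the hashed bag-of-words embedding and the linear scan are stand-ins for a real sentence encoder and vector database.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in embedding (hashed bag of words); use a real sentence encoder in practice."""
    v = np.zeros(256)
    for word in text.lower().split():
        v[hash(word) % 256] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

# Each entry is (query, code, embedding).
db: list[tuple[str, str, np.ndarray]] = []

def save_solution(query: str, code: str) -> None:
    """Store a query-code pair once user feedback confirms success."""
    db.append((query, code, embed(query)))

def retrieve_demonstration(new_query: str) -> str:
    """Return the most similar past query and its code as a demonstration."""
    q = embed(new_query)
    query, code, _ = max(db, key=lambda row: float(row[2] @ q))
    return f"Example query:\n{query}\nExample solution:\n{code}"
```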


### Assistant Hierarchy
LLMs differ in price and performance; for example, GPT-3.5-turbo is much cheaper than GPT-4 but also less accurate.
Thus, we propose the **Assistant Hierarchy** to reduce the cost of using LLMs.
The core idea is to use cheaper LLMs first and resort to more expensive LLMs only when necessary.
In this way, we reduce the reliance on expensive LLMs and thus the overall cost.
In particular, given multiple LLMs, we instantiate one assistant agent for each and start the conversation with the most cost-effective LLM assistant.
If the conversation between the current LLM assistant and the code executor concludes without successfully resolving the query, **EcoAssistant** restarts the conversation with the next, more expensive LLM assistant in the hierarchy.
We found that this strategy significantly reduces costs while still effectively addressing queries.
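
A minimal sketch of the hierarchy is shown below, reusing the two-agent setup from above; the success check is illustrative, since EcoAssistant relies on user feedback at this point.

```python
import autogen

MODELS = ["gpt-3.5-turbo", "gpt-4"]  # ordered from cheapest to most expensive

def is_resolved(message: dict | None) -> bool:
    """Placeholder success check; EcoAssistant asks the user for feedback here."""
    return bool(message) and "TERMINATE" in (message.get("content") or "")

def solve(query: str, user_proxy: autogen.UserProxyAgent) -> bool:
    for model in MODELS:
        assistant = autogen.AssistantAgent(
            name=f"assistant-{model}",
            llm_config={"config_list": [{"model": model}]},
        )
        # Restart the conversation from scratch with the next LLM in the hierarchy.
        user_proxy.initiate_chat(assistant, message=query)
        if is_resolved(user_proxy.last_message(assistant)):
            return True  # stop before invoking a more expensive model
    return False
```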

### A Synergistic Effect
We found that the **Assistant Hierarchy** and **Solution Demonstration** of **EcoAssistant** have a synergistic effect.
Because the query-code database is shared by all LLM assistants, even without any specialized design,
a solution from a more powerful LLM assistant (e.g., GPT-4) can later be retrieved to guide a weaker LLM assistant (e.g., GPT-3.5-turbo).
Such a synergistic effect further improves the performance and reduces the cost of **EcoAssistant**.

### Experimental Results

We evaluate **EcoAssistant** on three datasets: Places, Weather, and Stock. Compared with a single GPT-4 assistant, **EcoAssistant** achieves a higher success rate at a lower cost, as shown in the figure below.
For more details about the experimental results and other experiments, please refer to our [paper](https://arxiv.org/abs/2310.03046).

![exp](img/results.png)

## Further reading

Please refer to our [paper](https://arxiv.org/abs/2310.03046) and [codebase](https://github.com/JieyuZ2/EcoAssistant) for more details about **EcoAssistant**.

If you find this blog useful, please consider citing:

```bibtex
@article{zhang2023ecoassistant,
  title={EcoAssistant: Using LLM Assistant More Affordably and Accurately},
  author={Zhang, Jieyu and Krishna, Ranjay and Awadallah, Ahmed H and Wang, Chi},
  journal={arXiv preprint arXiv:2310.03046},
  year={2023}
}
```
6 changes: 6 additions & 0 deletions website/blog/authors.yml
@@ -39,3 +39,9 @@ beibinli:
title: Senior Research Engineer at Microsoft
url: https://github.com/beibinli
image_url: https://github.com/beibinli.png

jieyuz2:
name: Jieyu Zhang
title: PhD student at University of Washington
url: https://jieyuz2.github.io/
image_url: https://github.com/jieyuz2.png
