Add example for using retriever with agents #1224

tom-leamon · 2024-05-02T05:07:40Z

tom-leamon
May 2, 2024

Currently, there are no examples in the documentation which illustrate how to use retrievers with agents in order to leverage expanded context through embeddings. It's not immediately clear if this is even possible, though the types suggest it is.

If it's not currently available, implementing this feature would be hugely beneficial in increasing the performance of agents.

marcusschiesser · 2024-05-02T08:40:36Z

marcusschiesser
May 2, 2024
Maintainer

@tom-leamon, you can use a QueryEngineTool in your agent, see https://github.com/run-llama/LlamaIndexTS/blob/main/examples/agent/query_openai_agent.ts

0 replies

tom-leamon · 2024-05-02T20:13:46Z

tom-leamon
May 2, 2024
Author

That does work, but it seems to take significantly longer to use the tool and respond compared to using a ContextChatEngine. Is this a limitation of function calling APIs? Or is there a way avoid tool use but still perform retrieval?

0 replies

tom-leamon · 2024-05-03T08:29:52Z

tom-leamon
May 3, 2024
Author

@himself65 @marcusschiesser is it possible to have agents use a retriever without a QueryEngineTool?

With ContextChatEngine the model responds very quickly, utilizing the retriever to return relevant data from context. However, when using an agent and QueryEngineTool, the agent only has access to that retriever when using the tool, which not only adds significant latency, but means multiple tool uses are needed, such as in a scenario where the agent needs to be aware of its context and perform a web search.

If the agent could use the retriever without an additional tool call, like ContextChatEngine, it could more than double the performance of the agent. In my use case all interactions with the AI are highly contextual (what space is active, what group is active, what channel is active, what thread is active, and all the data already in these organizational units). With ContextChatEngine every single prompt always takes into consideration this context. With a QueryEngineTool, the base model first needs to even decide if it needs to perform a retrieval, which introduces latency. Many times, it will decide a retrieval is not needed, even though it would have vastly improved the contextual relevance and therefore quality of the answer.

Without this capability I am forced to have users manually choose if they want an agent or not, depending on if they need it to perform tool use (like searching the web) or provide the highest quality answer in the shortest time. I would prefer to have a single paradigm that both always has full context and can use tools.

Is this something that is already possible, or perhaps could be added to the roadmap?

0 replies

marcusschiesser · 2024-05-07T02:45:55Z

marcusschiesser
May 7, 2024
Maintainer

@tom-leamon to reduce the latency, you can use a tool that directly calls the retriever.

I added an example to the examples folder, see https://github.com/run-llama/LlamaIndexTS/blob/main/examples/agent/retriever_openai_agent.ts (compare with the same example using the query engine tool: https://github.com/run-llama/LlamaIndexTS/blob/main/examples/agent/query_openai_agent.ts)

The differences are:

QueryEngineTool uses a query engine (this means an additional LLM call that increases your latency) to generate a result based on the retrieved context that is then used by the agent
The retriever tool just retrieves the context and sends it as tool output to the agent

You can see the difference at run-time by adding the verbose: true parameter.

In the simple use case, both approaches lead to the same result - Would be great to get some feedback from your use case!

0 replies

erik-balfe · 2024-09-15T19:57:58Z

erik-balfe
Sep 15, 2024

Hey, @marcusschiesser

I wanted to circle back on the low latency issue with agents, similar to what we see with ContextChatEngine. Right now, the agent can only take a retriever or tools in the constructor, disabling the retriever if tools are passed:

export class LLMAgent extends AgentRunner<LLM> {
  constructor(params: LLMAgentParams) {
    const llm = params.llm ?? (Settings.llm ? (Settings.llm as LLM) : null);
    if (!llm) throw new Error("llm must be provided for either in params or Settings.llm");
    super({
      llm,
      chatHistory: params.chatHistory ?? [],
      systemPrompt: params.systemPrompt ?? null,
      runner: new LLMAgentWorker(),
      tools: "tools" in params ? params.tools : params.toolRetriever.retrieve.bind(params.toolRetriever),
      verbose: params.verbose ?? false,
    });
  }
  createStore = AgentRunner.defaultCreateStore;
  taskHandler = AgentRunner.defaultTaskHandler;
}

This setup limits the agent’s ability to perform fast semantic searches and insert content from found nodes directly into the LLM query. Using a QueryEngineTool adds noticeable delay, which is a concern for real-time apps.

Also, making tool usage optional might confuse users who expect a straightforward experience where the agent can operate with full context without extra steps.

This mutually exclusive approach has been in the repo since its beginning. Are there any issues with allowing the retriever and tools to work together? If not, I’d be happy to help implement that change.

0 replies

marcusschiesser · 2024-09-18T04:24:24Z

marcusschiesser
Sep 18, 2024
Maintainer

@erik-balfe, the agent doesn't support a retriever as an argument, but a toolRetriever (which retrieves tools to use for the agent, which is different than retrieving nodes for the context).

If using a retriever tool (see https://github.com/run-llama/LlamaIndexTS/blob/main/examples/agent/retriever_openai_agent.ts) is not sufficient for your use case, you could try calling a context generator (see
https://github.com/run-llama/LlamaIndexTS/blob/feat/build-wasm-with-extism/packages/llamaindex/src/engines/chat/DefaultContextGenerator.ts) for each query and add the context it generates to your agent calls (that's basically what the ContextChatEngine is doing)

1 reply

erik-balfe Sep 24, 2024

Understood it. Thanks.

I created a PR here #1235 with possible solution for this case. I hope flexible enough not adding complexity to Agent class and not increasing efforts on maintaining the feature

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add example for using retriever with agents #1224

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 6 comments 1 reply

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

Add example for using retriever with agents #1224

tom-leamon May 2, 2024

Replies: 6 comments · 1 reply

marcusschiesser May 2, 2024 Maintainer

tom-leamon May 2, 2024 Author

tom-leamon May 3, 2024 Author

marcusschiesser May 7, 2024 Maintainer

erik-balfe Sep 15, 2024

marcusschiesser Sep 18, 2024 Maintainer

erik-balfe Sep 24, 2024

tom-leamon
May 2, 2024

Replies: 6 comments 1 reply

marcusschiesser
May 2, 2024
Maintainer

tom-leamon
May 2, 2024
Author

tom-leamon
May 3, 2024
Author

marcusschiesser
May 7, 2024
Maintainer

erik-balfe
Sep 15, 2024

marcusschiesser
Sep 18, 2024
Maintainer