This repository has been archived by the owner on May 28, 2024. It is now read-only.

Request for Comment: Aviary <-> LangChain Integration #4

Closed
waleedkadous opened this issue Jun 2, 2023 · 5 comments

Comments

@waleedkadous

waleedkadous commented Jun 2, 2023

Purpose

Aviary is an open source LLM management toolkit built on top of Ray Serve and Ray. LangChain is an incredibly popular open source toolkit for building LLM applications. The question is: how should the two fit together?

Possible integrations

The first integration focuses on plain (non-chat) LLM usage. Once we add streaming to Aviary, we will integrate that for chat applications as well.

There are 3 possible integration points with LangChain.

1. Aviary as an LLM provider for LangChain

Make Aviary a model backend for LangChain (the same way that OpenAI is done currently).

This would enable you to do things like:

import os

from langchain.llms import Aviary

# The Aviary backend URL is required.
aviary_url = os.environ["AVIARY_URL"]
# The token is optional; .get() returns None when it is unset
# instead of raising KeyError.
aviary_token = os.environ.get("AVIARY_TOKEN")

llm = Aviary(model_name='amazon/LightGPT')

# Single query
llm.predict('How do you make fried rice?')

# Uses Aviary's batch interface for greater efficiency
llm.generate(['How do you make fried rice?', 'What are the most influential punk bands?'])

The only real decision here is whether to use our SDK or to connect directly to our endpoints. Since our Web API is so simple right now, it might be easier to code against it in the short term and switch to the SDK when that is justified.
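One way to keep that decision open is to hide the backend behind a single batched call. The following is only a sketch under stated assumptions: `AviaryLikeLLM`, `BatchTransport`, and `echo_transport` are hypothetical names invented for illustration, not part of Aviary or LangChain. A real transport would wrap either the Web API or the SDK.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical transport signature: takes a model name and a list of prompts
# and returns one completion per prompt. A thin HTTP client or the Aviary SDK
# could both be adapted to this shape.
BatchTransport = Callable[[str, List[str]], List[str]]


@dataclass
class AviaryLikeLLM:
    """Illustrative wrapper; not the actual LangChain Aviary class."""

    model_name: str
    transport: BatchTransport

    def generate(self, prompts: List[str]) -> List[str]:
        # Send all prompts in a single transport call so the backend can batch them.
        completions = self.transport(self.model_name, list(prompts))
        if len(completions) != len(prompts):
            raise ValueError("transport returned the wrong number of completions")
        return completions

    def predict(self, prompt: str) -> str:
        # A single query is just a batch of one.
        return self.generate([prompt])[0]


def echo_transport(model_name: str, prompts: List[str]) -> List[str]:
    # Stand-in transport for local experimentation; it performs no network I/O.
    return [f"[{model_name}] {p}" for p in prompts]


llm = AviaryLikeLLM(model_name="amazon/LightGPT", transport=echo_transport)
single = llm.predict("How do you make fried rice?")
batch = llm.generate(["How do you make fried rice?", "What are the most influential punk bands?"])
```

With this shape, moving from direct endpoint calls to the SDK would mean swapping the transport callable, while `predict` and `generate` stay the same.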

2. Aviary “wraps” LLMs provided by LangChain

Allow any model supported by LangChain to be wired up through Aviary (the same way that Aviary currently “wires up” Hugging Face). This would give a way for centrally managed Aviaries to control access to models from OpenAI and to impose additional limits on length.

For every model you want to wrap, you would have to set up a config file in the https://github.com/ray-project/aviary/tree/master/models directory. We would expand that file format to support LangChain LLMs as well.

3. Integrate LangChain LLM support directly into Aviary Explorer and Aviary CLI

Allow users to query any model supported by LangChain directly. This would be useful for example to do cross OSS <-> commercial comparisons e.g. with GPT-3.5-turbo.

What we would do there is allow Aviary CLI to do something like this:

aviary query --model amazon/LightGPT --model model-configs/langchain-openai-gpt-35.yaml examples/qa-prompts.txt

In the aviary command, we would read openai://gpt-3.5-turbo and use the LangChain OpenAI LLM tool allowing for cross evaluation.
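As a rough illustration of how the CLI could tell an Aviary model from a LangChain-backed one, here is a minimal sketch of parsing a specifier such as openai://gpt-3.5-turbo. The `ModelSpec` type, the `parse_model_spec` function, and the "aviary" default provider are assumptions made for this example; the proposal does not define them.

```python
from typing import NamedTuple


class ModelSpec(NamedTuple):
    provider: str  # e.g. "openai", or the assumed default "aviary"
    model: str     # e.g. "gpt-3.5-turbo" or "amazon/LightGPT"


def parse_model_spec(spec: str, default_provider: str = "aviary") -> ModelSpec:
    """Split 'provider://model' into its parts (hypothetical helper).

    A specifier without '://' is treated as a model served by the default provider.
    """
    provider, sep, model = spec.partition("://")
    if not sep:
        return ModelSpec(default_provider, spec)
    if not provider or not model:
        raise ValueError(f"malformed model spec: {spec!r}")
    return ModelSpec(provider, model)


openai_spec = parse_model_spec("openai://gpt-3.5-turbo")
aviary_spec = parse_model_spec("amazon/LightGPT")
```

The CLI could then dispatch on `provider` to decide whether to call the Aviary backend or to construct a LangChain LLM.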

We would have to add new functionality to Aviary Explorer to support adding arbitrarily configured LangChain LLMs.

In essence the difference between proposals 2 and 3 is: where do the config files for specifying LLM properties live?

Decision

We are not limited to doing one of these.

The most immediate need and highest impact is perhaps #1.

#2 and #3 are similar in many ways. Perhaps #3 is more impactful. The Aviary Explorer changes, however, are more complicated. It’s slightly ugly in the sense that we now have yaml files both on the Aviary backend and in the Aviary CLI and Explorer.

waleedkadous changed the title from "REQUEST FOR COMMENTS: Aviary <-> LangChain Integration" to "Request for Comment: Aviary <-> LangChain Integration" on Jun 2, 2023
@lpfhs

lpfhs commented Jun 2, 2023

The approach FastChat has taken for LangChain integration is to provide an OpenAI-compatible API for models hosted on FastChat. I don't know if that's a good approach since it does introduce an extra layer of API, but it avoids having to add support for Aviary in LangChain. You also get ready support for Aviary models in any applications currently using the OpenAI API or LangChain.
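To make the idea concrete, here is a minimal sketch of what a client request in the OpenAI completions wire format looks like when pointed at a self-hosted server. The base URL, model name, and the `build_completion_request` helper are placeholders chosen for illustration. The sketch assumes the server exposes a `/completions` route under its base URL, which Aviary did not do at the time of this comment. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_completion_request(base_url: str, model: str, prompt: str,
                             api_key: str = "EMPTY") -> urllib.request.Request:
    """Build (but do not send) a POST request in the OpenAI completions format."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder endpoint and model; substitute whatever the server actually hosts.
req = build_completion_request(
    "http://localhost:8000/v1", "amazon/LightGPT", "How do you make fried rice?"
)
```

Any application that already speaks this format would only need a different base URL, which is the appeal of the approach.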

@waleedkadous

Great suggestion!

We looked at this. This turned out to be too limiting (e.g. Aviary has support for optimized batching, whereas OpenAI's GPT-3.5-Turbo interface does not).

But we are planning on supporting the OpenAI wire format with our endpoints. We just need some time.

@hwchase17

@waleedkadous agree that #1 seems the easiest/highest priority. happy to help with that in any way

@waleedkadous

PR for option 1 at langchain-ai/langchain#5661 (@hwchase17 jfyi).
Option 3 merged at 8e4e965

avnishn pushed a commit that referenced this issue Aug 4, 2023
1. Adds h2ogpt-oasst1-512-12b
2. Reworks the YAML config
3. Adds BetterTransformer & torch.compile support
4. (hopefully) fixes batching - we should upstream this to Ray

---------

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
@XBeg9

XBeg9 commented Nov 20, 2023

Just a sample like this

from langchain.llms import Aviary

llm = Aviary(model='TheBloke/Llama-2-70B-chat-AWQ', aviary_url="http://localhost:8000/v1", aviary_token="EMPTY")
output = llm('How do you make fried rice?')

This gives me the error "http://localhost:8000/v1 does not support model TheBloke/Llama-2-70B-chat-AWQ. (type=value_error)". Any ideas what I am doing wrong? @waleedkadous
