Request for Comment: Aviary <-> LangChain Integration #4
Comments
The approach FastChat has taken for LangChain integration is to provide an OpenAI-compatible API for models hosted on FastChat. I don't know if that's a good approach, since it introduces an extra layer of API, but it avoids having to add support for Aviary in LangChain. You also get ready support for Aviary models in any application currently using the OpenAI API or LangChain.
Great suggestion! We looked at this, but it turned out to be too limiting (e.g. Aviary supports optimized batching, whereas OpenAI's GPT-3.5-Turbo interface does not). We are planning to support the OpenAI wire format with our endpoints; we just need some time.
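For illustration, supporting the OpenAI wire format would let existing OpenAI-client code target an Aviary deployment just by changing the base URL. A minimal sketch, assuming an OpenAI-compatible endpoint; the URL, token, and model name are placeholders:

```python
import openai

# Point the standard OpenAI client at an OpenAI-compatible Aviary endpoint.
# The base URL, token, and model name below are placeholders, not real values.
openai.api_base = "http://localhost:8000/v1"
openai.api_key = "EMPTY"

resp = openai.Completion.create(
    model="h2ogpt-oasst1-512-12b",
    prompt="How do you make fried rice?",
    max_tokens=128,
)
print(resp["choices"][0]["text"])
```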
@waleedkadous Agreed that #1 seems the easiest and highest priority. Happy to help with that in any way.
PR for option 1 at langchain-ai/langchain#5661 (@hwchase17 jfyi). |
1. Adds h2ogpt-oasst1-512-12b
2. Reworks the YAML config
3. Adds BetterTransformer & torch.compile support
4. (Hopefully) fixes batching - we should upstream this to Ray

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
Just a sample like this:

```python
from langchain.llms import Aviary

llm = Aviary(model='TheBloke/Llama-2-70B-chat-AWQ', aviary_url="http://localhost:8000/v1", aviary_token="EMPTY")
output = llm('How do you make fried rice?')
```

gives me an error.
Purpose
Aviary is an open source LLM management toolkit built on top of Ray Serve and Ray. LangChain is an incredibly popular open source toolkit for building LLM applications. The question is: how do these things fit together?
Possible integrations
This first integration focuses on LLMs (non-chat) to begin with. When we add streaming to Aviary, we will integrate that into the chat application as well.
There are 3 possible integration points with LangChain.
1. Aviary as an LLM provider for LangChain
Make Aviary a model backend for LangChain (the same way OpenAI is supported today).
This would enable you to do things like:
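For example, a minimal sketch based on the snippet discussed in the comments above; the URL, token, and model name are illustrative and depend on your deployment:

```python
from langchain.llms import Aviary

# URL, token, and model name are illustrative; use whatever your Aviary
# deployment actually serves.
llm = Aviary(
    model="h2ogpt-oasst1-512-12b",
    aviary_url="http://localhost:8000",
    aviary_token="",
)

print(llm("How do you make fried rice?"))
```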
The only real decision here is whether we use our SDK or allow direct connection to our endpoints. Since our web API is currently so simple, it might be easier to code against it directly in the short term and move to the SDK when that is justified.
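As a rough illustration of the "code against the endpoints directly" option, something like the following would suffice; the `/query/<model>` route and payload shown here are assumptions, not the documented Aviary web API:

```python
import requests

AVIARY_URL = "http://localhost:8000"  # illustrative; point at your Aviary deployment

# Hypothetical route and payload -- the actual Aviary web API may differ.
resp = requests.post(
    f"{AVIARY_URL}/query/h2ogpt-oasst1-512-12b",
    json={"prompt": "How do you make fried rice?"},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```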
2. Aviary “wraps” LLMs provided by LangChain
Allow any model supported by LangChain to be wired up through Aviary (the same way that Aviary currently "wires up" Hugging Face models). This would give centrally managed Aviaries a way to control access to models from providers like OpenAI and to impose additional limits on length.
For every model you want to wrap, you would add a config file to the https://github.com/ray-project/aviary/tree/master/models directory. We would expand that file format to support LangChain LLMs as well.
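To make this concrete, here is a rough Python sketch of how Aviary might turn an expanded config entry into a LangChain-backed model at runtime; the config keys and the `build_langchain_llm` helper are hypothetical, not the current Aviary format:

```python
from langchain.llms import OpenAI

# Hypothetical config entry -- the real Aviary YAML schema would need an
# equivalent extension before this could work.
config = {
    "provider": "langchain.OpenAI",
    "params": {"temperature": 0.0, "max_tokens": 256},
}

def build_langchain_llm(entry: dict):
    """Instantiate a LangChain LLM from a (hypothetical) Aviary model config entry."""
    if entry["provider"] == "langchain.OpenAI":
        return OpenAI(**entry["params"])
    raise ValueError(f"Unsupported provider: {entry['provider']}")

llm = build_langchain_llm(config)
print(llm("How do you make fried rice?"))
```

Aviary could then serve the resulting object behind the same Ray Serve endpoints it uses for Hugging Face models, applying the same access controls and limits.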
3. Integrate LangChain LLM support directly into Aviary Explorer and Aviary CLI
Allow users to query any model supported by LangChain directly. This would be useful, for example, for cross OSS <-> commercial comparisons, e.g. against GPT-3.5-Turbo.
What we would do there is allow the Aviary CLI to accept model identifiers such as openai://gpt-3.5-turbo. When the aviary command sees that prefix, it would use the LangChain OpenAI LLM integration, allowing for cross-evaluation.
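A rough sketch of what that dispatch could look like inside the CLI; the openai:// prefix handling and the use of ChatOpenAI here are assumptions about how this might be wired up, not existing Aviary code:

```python
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

def query_model(model_id: str, prompt: str) -> str:
    """Route a model id either to a LangChain-backed provider or to an Aviary backend."""
    if model_id.startswith("openai://"):
        # Strip the prefix and hand the prompt to LangChain's OpenAI chat model.
        chat = ChatOpenAI(model_name=model_id.removeprefix("openai://"))
        return chat([HumanMessage(content=prompt)]).content
    # Aviary-hosted models would be handled by the existing backend path (not shown).
    raise NotImplementedError("Aviary-hosted models are handled elsewhere")

print(query_model("openai://gpt-3.5-turbo", "How do you make fried rice?"))
```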
We would have to add new functionality to Aviary Explorer to support adding arbitrarily configured LangChain LLMs.
In essence, the difference between proposals 2 and 3 is where the config files specifying LLM properties live.
Decision
We are not limited to doing only one of these.
The most immediate need and highest impact is perhaps #1.
#2 and #3 are similar in many ways. Perhaps #3 is more impactful. The Aviary Explorer changes, however, are more complicated. It's slightly ugly in the sense that we would then have YAML files both on the Aviary backend and in the Aviary CLI and Explorer.