Local models #105
Thanks! The model filtering comes from `chatgpt-web/src/lib/Types.svelte` (lines 2 to 9 at commit 1926f7d).
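As a rough illustration of the kind of allow-list being referenced (hypothetical code, not the actual contents of Types.svelte at that commit):

```typescript
// Hypothetical sketch of a hardcoded model allow-list; the real
// Types.svelte content may differ.
export const supportedModels = ['gpt-3.5-turbo', 'gpt-3.5-turbo-0301'] as const

// Only models in the allow-list survive; anything a local backend
// reports (e.g. "alpaca") is filtered out of the dropdown.
export type Model = (typeof supportedModels)[number]

export const isSupportedModel = (id: string): id is Model =>
  (supportedModels as readonly string[]).includes(id)
```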
This can be quite easily fixed, though. I guess we should support everything with

Are you planning on adding streaming support to the API as well (using EventSource/SSE)?
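For context on what SSE streaming looks like on the web side, here is a minimal sketch, assuming an OpenAI-style `/v1/chat/completions` endpoint that emits `data:` lines when `stream: true` is set (the function name and payload are illustrative, not chatgpt-web's actual code):

```typescript
// Minimal sketch of consuming an OpenAI-style streaming response.
// EventSource only supports GET, so chat clients typically use fetch()
// and parse the SSE "data:" lines out of the response body instead.
async function streamChat(
  apiBase: string,
  prompt: string,
  onToken: (t: string) => void,
) {
  const res = await fetch(`${apiBase}/v1/chat/completions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'gpt-3.5-turbo', // or a local model id such as "alpaca"
      messages: [{ role: 'user', content: prompt }],
      stream: true,
    }),
  })
  const reader = res.body!.getReader()
  const decoder = new TextDecoder()
  for (;;) {
    const { value, done } = await reader.read()
    if (done) break
    // NB: a robust parser would buffer partial lines across chunks.
    for (const line of decoder.decode(value).split('\n')) {
      if (!line.startsWith('data: ') || line.includes('[DONE]')) continue
      const delta = JSON.parse(line.slice(6)).choices?.[0]?.delta?.content
      if (delta) onToken(delta)
    }
  }
}
```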
Yup, I managed to find that bit, so I was wondering what direction to take (I don't like forking!), but that sounds good! I'd be more than happy then to provide a docker-compose file as well.
It needs a prompt to be injected in each call; I've just updated the docs on the API to show how to achieve that!
(For vicuna/chat models I think the template would be slightly different; a rough sketch follows below.)
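To make "a prompt injected in each call" concrete, here is a hypothetical sketch (the template text and helper name are illustrative, not llama-cli's actual defaults) of wrapping the user's message in an Alpaca-style instruction template before sending it:

```typescript
// Hypothetical Alpaca-style template; vicuna/chat models use a
// different conversational format, so the wrapper would differ.
function wrapAlpacaPrompt(userMessage: string): string {
  return [
    'Below is an instruction that describes a task.',
    'Write a response that appropriately completes the request.',
    '',
    '### Instruction:',
    userMessage,
    '',
    '### Response:',
  ].join('\n')
}
```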
This comes with a high computational cost, so I'm not really going in that direction for now. CGO calls are really expensive, and if we want to stream token-by-token by calling the underlying C functions directly from Go, that will likely bump response time by quite a lot.
Guys, I just want to say thanks! This is a beautiful collaboration between two amazing projects!
Regarding the models, I think we need to let the user add endpoints instead of a single 'openai' URL. If you want to use openai/gpt-4, you select the model from the dropdown and hit [+] to add a custom endpoint and a custom return object. Then just give enough info in the docs on how to POST/GET from the custom endpoints.
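As a rough sketch of that idea (every field name here is made up for illustration, not an existing chatgpt-web structure), each user-added endpoint could carry its base URL plus a path describing where the reply text lives in the custom return object:

```typescript
// Hypothetical shape for a user-defined endpoint entry.
interface CustomEndpoint {
  name: string      // label shown in the model dropdown
  baseUrl: string   // e.g. "https://api.openai.com/v1" or a local server
  model: string     // model id to send, e.g. "gpt-4" or "alpaca"
  // JSON path to the reply text inside the custom return object.
  responsePath: (string | number)[]
}

const endpoints: CustomEndpoint[] = [
  {
    name: 'OpenAI gpt-4',
    baseUrl: 'https://api.openai.com/v1',
    model: 'gpt-4',
    responsePath: ['choices', 0, 'message', 'content'],
  },
]
```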
Re: token streaming: JFYI, it is being tracked in go-skynet/go-llama.cpp#4. I still think it would incur a high computational cost and decrease overall performance, but I'll be glad to take a stab at it next.
Hey 👋!
Awesome project!
I'm trying to run chatgpt-web with llama.cpp. I've created a project using Go llama.cpp bindings, https://github.com/go-skynet/llama-cli, which mimics the OpenAI API to be 1:1 compatible, but serves multiple models that run locally instead.
It all seems to work so far, and I'd like to document how to use the two together with local models. However, I'm struggling: chatgpt-web filters the models returned by the API against the available OpenAI models. llama-cli returns its own list of models, but chatgpt-web's filtering prevents selecting them (e.g. alpaca can't be run unless I do some hardwiring on the API).
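For reference, the models endpoint in question returns the standard OpenAI-style list shape; a response from a local backend might look like this (the model ids are illustrative):

```json
{
  "object": "list",
  "data": [
    { "id": "alpaca", "object": "model" },
    { "id": "ggml-vicuna-7b", "object": "model" }
  ]
}
```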
If you want to test it, you need to run `llama-cli` from the latest image built from master, and set `VITE_API_BASE` accordingly in the `.env` file.

It would be super-cool if we could work together on the capability to load local models, maybe directly adding options to run them side by side with docker-compose (that's what I'm currently doing!). WDYT? A sketch of such a setup is below.