feat: running local models #269

Merged: 7 commits merged into k8sgpt-ai:main from the local_models branch on Apr 25, 2023

Conversation

@mudler (Contributor) commented Apr 13, 2023

This started as an untested draft, but it works! See my comment below. This should be enough for testing on an OpenAI-compatible endpoint just by letting the user change the base_url, so it should also work with https://github.com/go-skynet/LocalAI

I'll experiment a bit locally and refine this, adding docs too. cc: @arbreezy @AlexsJones

Closes #188

📑 Description

✅ Checks

  • My pull request adheres to the code style of this project
  • My code requires changes to the documentation
  • I have updated the documentation as required
  • All the tests have passed

ℹ Additional Information
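
For context on the base_url approach described above, here is a minimal sketch (not this PR's actual code) of what pointing an OpenAI-compatible Go client, such as github.com/sashabaranov/go-openai, at a custom endpoint looks like; the URL, dummy key, and model name are illustrative values:

```go
// Minimal sketch, not the PR's code: an OpenAI-compatible client talking to a
// local endpoint (e.g. LocalAI) simply by overriding the base URL.
package main

import (
	"context"
	"fmt"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	cfg := openai.DefaultConfig("not-needed-for-local") // local endpoints ignore the key
	cfg.BaseURL = "http://localhost:8080/v1"            // the configurable base_url

	client := openai.NewClientWithConfig(cfg)
	resp, err := client.CreateChatCompletion(context.Background(), openai.ChatCompletionRequest{
		Model: "ggml-koala-7b-model-q4_0-r2",
		Messages: []openai.ChatCompletionMessage{
			{Role: openai.ChatMessageRoleUser, Content: "Simplify this Kubernetes error: ..."},
		},
	})
	if err != nil {
		fmt.Println("completion error:", err)
		return
	}
	fmt.Println(resp.Choices[0].Message.Content)
}
```

Because only the base URL changes, the same request path can serve both api.openai.com and any OpenAI-compatible local server.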

@mudler changed the title from Allow to set a baseURL for providers to Running local models on Apr 13, 2023
@mudler changed the title from Running local models to feat: running local models on Apr 13, 2023
@mudler (Contributor, Author) commented Apr 14, 2023

works beautifully :)

To test it, I deployed something with a wrong tag so it would fail to pull the image and bring up the service. This was the response (it took some time, but hey, it's now all local!):

```
base ❯ ./k8sgpt analyze --explain
 100% |████████████████████████████████████████████████████████████| (2/2, 1 it/min)

0 llama/llama-69b7785db9-rlv9g(Deployment/llama)
- Error: Back-off pulling image "quay.io/go-skynet/llama-cli:fff"

The message is: Back-off pulling image "quay.io/go-skynet/llama-cli:fff"

The solution is:

The error message indicates that there is a conflict in the image name being used in the Kubernetes deployment. To resolve this issue, you can either rename the image or update the image tag in the Kubernetes deployment file.

For example, if the image name is "llama-cli:fff", you can rename it to something like "llama-cli:tag-1". Then, update the image tag in the Kubernetes deployment file to reflect the new name.

Alternatively, if the image name is not a critical part of the deployment, you can rename the image without affecting the deployment.

1 llama/llama(llama)
- Error: Service has not ready endpoints, pods: [Pod/llama-69b7785db9-rlv9g], expected 1

Service has not ready endpoints, pods: [Pod/llama-69b7785db9-rlv9g], expected 1

The error message is indicating that the service does not have any available endpoints, and there is only one pod that is running. This means that there is no endpoint available to handle the traffic.

To resolve this issue, you can either:

1. Deploy a new pod to the service, or
2. Update the endpoints of the existing pod to include the new endpoint.

If you are using the first option, you can use the kubectl apply command to deploy a new pod to the service. For example:

    kubectl apply -f https://raw.githubusercontent.com/Homebrew/kubeadm/main/examples/deploy-pod.yaml

If you are using the second option, you can use the kubectl apply command to update the endpoints of the existing pod. For example:

    kubectl apply -f https://raw.githubusercontent.com/Homebrew/kubeadm/main/examples/update-endpoints.yaml

Note that you need to replace the values in the examples file with your own values.
```

My setup:

```
base ❯ cat ~/.k8sgpt.yaml
ai:
    providers:
        - base_url: http://localhost:8080/v1
          model: ggml-koala-7b-model-q4_0-r2
          name: openai
kubeconfig: ....
kubecontext: ""
```

To bring up the API server (locally), I used the following prompt template for the model:

```
BEGINNING OF CONVERSATION: USER: {{.Input}} GPT:
```
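
The `{{.Input}}` placeholder above is a Go text/template action. As a self-contained illustration (not LocalAI's or k8sgpt's actual code), this is roughly how such a model prompt template gets rendered; the sample question is made up:

```go
// Illustration only: rendering a koala-style prompt template with Go's text/template.
// The Input field mirrors the {{.Input}} placeholder above.
package main

import (
	"os"
	"text/template"
)

func main() {
	const promptTemplate = "BEGINNING OF CONVERSATION: USER: {{.Input}} GPT:"

	tmpl := template.Must(template.New("prompt").Parse(promptTemplate))

	// The raw user question is wrapped in the model-specific conversation format.
	data := struct{ Input string }{Input: "Why is my pod stuck in ImagePullBackOff?"}
	if err := tmpl.Execute(os.Stdout, data); err != nil {
		panic(err)
	}
	// Prints: BEGINNING OF CONVERSATION: USER: Why is my pod stuck in ImagePullBackOff? GPT:
}
```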

Will polish and add docs alongside the PR, stay tuned!

@mudler marked this pull request as ready for review on April 14, 2023 22:35
@mudler requested review from a team as code owners on April 14, 2023 22:35
@mudler (Contributor, Author) commented Apr 14, 2023

Looks good here! Let me know if this is the right direction @AlexsJones. I've also updated the docs; feedback is welcome :)

@AlexsJones (Member) commented:
Looks very exciting @mudler. @matthisholleville @arbreezy, perhaps we use this as another reason to have a configuration object for AI rather than passing more strings. Also @mudler, I think having your own AI struct might be easier (yes, it's a little more work, but it eventually means less manipulation of openai.go).

@arbreezy (Member) commented:
This adds great flexibility to k8sgpt!
Just a small comment, @mudler: since there is no authentication happening with llama, we need to tweak the auth cmd, skipping the password/token/api-key user prompt if the llama provider is enabled in k8sgpt.
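
A hypothetical sketch of that tweak follows; the function name, prompt text, and backend check are illustrative, not k8sgpt's actual auth command:

```go
// Hypothetical sketch, not k8sgpt's actual code: skip the password/token/api-key
// prompt when the selected backend (e.g. a local llama endpoint) needs no authentication.
package main

import (
	"fmt"
	"os"
	"strings"

	"golang.org/x/term"
)

func promptForKey(backend string) (string, error) {
	if backend == "llama" {
		return "", nil // local backend: nothing to collect
	}
	fmt.Printf("Enter %s Key: ", backend)
	key, err := term.ReadPassword(int(os.Stdin.Fd()))
	fmt.Println()
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(key)), nil
}

func main() {
	key, err := promptForKey("llama")
	if err != nil {
		fmt.Fprintln(os.Stderr, "auth error:", err)
		os.Exit(1)
	}
	fmt.Printf("stored key: %q\n", key) // empty for the llama backend
}
```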

@mudler (Contributor, Author) commented Apr 16, 2023

Thanks for the feedback :) I'll work on it in separate commits and get back to you as soon as possible!

@mudler (Contributor, Author) commented Apr 16, 2023

@arbreezy @AlexsJones do you want me to keep the base_url option for the OpenAI backend, or shall I leave it untouched? I'm fine either way. I think it might be useful, e.g. when using proxies, but it's your call.

@AlexsJones (Member) commented:
> @arbreezy @AlexsJones do you want me to keep the base_url option for the OpenAI backend, or shall I leave it untouched? I'm fine either way. I think it might be useful, e.g. when using proxies, but it's your call.

I think we can leave it in there unused; as you say, it might be useful eventually!

@mudler force-pushed the local_models branch 2 times, most recently from 25a00e7 to d1c32f8, on April 17, 2023 16:27
@mudler (Contributor, Author) commented Apr 17, 2023

@AlexsJones @arbreezy I implemented your suggestions and added some comments inline with the changes introduced.

@mudler (Contributor, Author) commented Apr 17, 2023

This is how it works now:

1. Deploy the llama-cli API (with a model, and a template for it) following the instructions in the README, then register it with `k8sgpt auth`:

```
base ❯ ./k8sgpt auth --backend llama --model ggml-koala-7b-model-q4_0-r2
Using llama as backend AI provider
Enter llama API Base URL (e.g. `http://localhost:8080/v1`): http://localhost:8080/v1
Provider updated
key added
```

2. Enjoy local, free inference without any API cost :)

```
k8sgpt analyze --explain --backend llama
100% |██████████████████████████████████████████████████████████████████████████████████| (2/2, 46 it/hr)

0 llama/llama(llama)
- Error: Service has not ready endpoints, pods: [Pod/llama-69b7785db9-rlv9g], expected 1

The service in your Kubernetes cluster does not have any available endpoints. This means that the service is not running or listening on any available ports.

To resolve this issue, you need to make sure that the service is running and listening on the correct ports. You can check this by running the command kubectl get pod <pod-name> -n <namespace> -o yaml, where <pod-name> is the name of the pod that the service is running in, and <namespace> is the name of the namespace where the pod is running.

If the service is not running or listening on any available ports, you need to create or update the service resource file to include the correct ports and endpoints. You can also check that the service resource file is correctly configured and that the service is running in the correct namespace.

If the service is running and listening on the correct ports, you need to check that the pod that the service is running in is running and that the pod is listening on the correct ports. If the pod is not running or listening on the correct ports, you need to create or update the pod resource file to include the correct ports and endpoints.

If the pod is running and listening on the correct ports, you need to check that the service is running in the correct namespace and that the service is configured correctly. If the service is not running in the correct namespace or is not configured correctly, you need to create or update the service resource file to include the correct namespace and configuration.

If the service is running in the correct namespace and is configured correctly, you need to check that the service is listening on the correct ports. If the service is not listening on the correct ports, you need to check that the service resource file is correctly configured and that the service is listening on the correct ports.

If the service is listening on the correct ports, you need to check that the pod that the service is running in is listening on the correct ports. If the pod is not listening on the correct ports, you need to check that the pod resource file is correctly configured and that the pod is listening on the correct ports.

If the pod is listening on the correct ports, you need to check that the pod is running. If the pod is not running, you need to check that the pod resource file is correctly configured and that the pod is running.

If the pod is running, you need to check that the pod

1 llama/llama-69b7785db9-rlv9g(Deployment/llama)
- Error: Back-off pulling image "quay.io/go-skynet/llama-cli:fff"

This error message is indicating that there was a back-off from pulling the image "quay.io/go-skynet/llama-cli:fff" from the Docker image registry. This means that the image was previously attempted to be pulled but failed due to a lack of available resources.

To resolve this issue, you can try the following solutions:

1. Check the resource constraints of your Kubernetes cluster and ensure that there are sufficient resources available to pull the image.
2. Try pulling the image again at a later time when the resources are available.
3. If the issue persists, consider scaling down the resources of your cluster or upgrading to a more powerful cluster.
4. Check if the image is already available in the cluster, if it is then try to use it instead of pulling it again.
5. Check the version of the image, if it is an older version, try upgrading to the latest version.

It's also important to check if there are any network connectivity issues or if the image is not accessible from the Kubernetes cluster.
```

@matthisholleville (Contributor) commented:
Very useful feature! Good job @mudler!!

@mudler force-pushed the local_models branch 3 times, most recently from f6ec65c to dbb52d0, on April 20, 2023 20:51
@mudler (Contributor, Author) commented Apr 20, 2023

Rebased from main, docs updated, and gave it another shot locally!

```
~/_git/k8sgpt local_models*
base ❯ ./k8sgpt auth --backend localai --model ggml-koala-7b-model-q4_0-r2.bin --baseurl http://localhost:8080/v1
Using localai as backend AI provider
Enter localai Key: New provider added
key added

~/_git/k8sgpt local_models*
base ❯ ./k8sgpt analyze --explain --backend localai
100% |█████████████████████████████████████████████████████████████████████████████████████████████| (2/2, 2 it/min)

0 llama/llama(llama)
- Error: Service has not ready endpoints, pods: [Pod/llama-69b7785db9-rlv9g], expected 1

The error message is saying that the Kubernetes service for the pod "llama-69b7785db9-rlv9g" has not yet been created. To fix this, you can either wait for the service to be created or manually create the service.

If you want to wait for the service to be created, you can check the status of the service in the Kubernetes cluster using the kubectl get pods command and check if the pod is in a running state. Once the pod is in a running state, the service will be created automatically.

If you want to manually create the service, you can use the kubectl create service command to create the service and specify the name, type, selector, ports, and other details of the service.

It's also important to note that the "expected 1" part of the error message refers to the number of endpoints that the service is expected to have. If the pod is not running or if there is a problem with the pod, it will not have any endpoints, which is why the service is not ready.

1 llama/llama-69b7785db9-rlv9g(Deployment/llama)
- Error: Back-off pulling image "quay.io/go-skynet/llama-cli:fff"

The error message "Back-off pulling image "quay.io/go-skynet/llama-cli:fff" indicates that there was a failure to pull an image from a Docker registry. The back-off mechanism is used to avoid too many failed pulls and to avoid overloading the Docker registry.

To resolve the issue, you can try the following steps:

1. Check that the Docker image exists and is accessible in the Docker registry.
2. Check that the Docker image has the correct tag and that it matches the image you are trying to pull.
3. Try pulling the Docker image again after waiting for a few minutes.
4. If the problem persists, you can try using a different Docker registry or Docker image.

If none of the above steps work, you can also try checking the Docker logs for more information about the error.
```

@mudler (Contributor, Author) commented Apr 20, 2023

@arbreezy just noticed that you were already setting the baseURL in #309 🤦

I created #310 in case we want to merge that separately so our PRs don't conflict with each other. I'm fine either way!

@mudler (Contributor, Author) commented Apr 21, 2023

There was a small bug introduced in the #310 PR; sorry about that, I think it was an oversight between cherry-picks and rebases here. I've added a fix for it in this PR (see my comment above).

@AlexsJones @arbreezy @matthisholleville rebased, re-tested locally, good to go here! Please let me know if anything is missing!

Signed-off-by: mudler <mudler@mocaccino.org>
Signed-off-by: mudler <mudler@mocaccino.org>
@matthisholleville (Contributor) commented:
I'll test this tonight! Thank you for your contribution

@arbreezy (Member) commented:
I am back tomorrow; I will have a look as well.
Thanks again @mudler.

@matthisholleville (Contributor) commented:
Hi,

I just tested it and here's the error I encountered:

```
➜  k8sgpt git:(9b914fb) ./k8sgpt analyze --explain --backend localai --namespace k8sgpt --no-cache
Warning: Legacy config file at `/Users/matthisholleville/.k8sgpt.yaml` detected! This file will be ignored!
   0% |                                                             | (0/1, 0 it/hr) [0s:0s]
Error: failed while calling AI provider localai: error, json: cannot unmarshal string into Go struct field ErrorResponse.error of type map[string]json.RawMessage
```

Here's my setup:

* using the [chatgpt-openai/ggml-model-whisper](https://huggingface.co/chatgpt-openai/ggml-model-whisper/blob/main/ggml-model-gpt-2-345M.bin) model
* no error log on LocalAI container

@mudler (Contributor, Author) commented Apr 24, 2023

> Hi,
>
> I just tested it and here's the error I encountered:
>
> ```
> ➜  k8sgpt git:(9b914fb) ./k8sgpt analyze --explain --backend localai --namespace k8sgpt --no-cache
> Warning: Legacy config file at `/Users/matthisholleville/.k8sgpt.yaml` detected! This file will be ignored!
>    0% |                                                             | (0/1, 0 it/hr) [0s:0s]
> Error: failed while calling AI provider localai: error, json: cannot unmarshal string into Go struct field ErrorResponse.error of type map[string]json.RawMessage
> ```

Good catch! That was a miss on my side. I just traced it back, and the API now returns OpenAI error types: mudler/LocalAI#80
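
For illustration, the failure can be reproduced with a minimal struct that mirrors the field named in the error message (the real `ErrorResponse` lives in the OpenAI client library; the JSON bodies below are assumed examples): a bare-string `error` field cannot be decoded into a map-typed field, while an OpenAI-style error object can.

```go
// Hedged reproduction of the unmarshal error above; ErrorResponse only mirrors the
// field named in the message, it is not the library's exact definition.
package main

import (
	"encoding/json"
	"fmt"
)

type ErrorResponse struct {
	Error map[string]json.RawMessage `json:"error"`
}

func main() {
	var er ErrorResponse

	// Pre-fix style body: "error" is a bare string, so decoding fails with
	// "cannot unmarshal string into Go struct field ErrorResponse.error ...".
	bad := []byte(`{"error":"could not load model"}`)
	fmt.Println("string error:", json.Unmarshal(bad, &er))

	// OpenAI-compatible body: "error" is an object, so decoding succeeds.
	good := []byte(`{"error":{"message":"could not load model","type":"invalid_request_error"}}`)
	fmt.Println("object error:", json.Unmarshal(good, &er))
}
```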

> Here's my setup:
>
> * using the [chatgpt-openai/ggml-model-whisper](https://huggingface.co/chatgpt-openai/ggml-model-whisper/blob/main/ggml-model-gpt-2-345M.bin) model.

This model is for "whisper"; it is not for LLMs/GPTs. I've updated the docs with an e2e example for gpt4all-j: https://github.com/go-skynet/LocalAI#example-use-gpt4all-j-model

> * no error log on LocalAI container

My bad, the docs weren't aligned: you need to turn on debug mode (by either setting the DEBUG env var to true, or specifying it in the CLI). I've updated the docs as well! (https://github.com/go-skynet/LocalAI#api)

Signed-off-by: mudler <mudler@mocaccino.org>
@mudler requested a review from a team as a code owner on April 24, 2023 21:48
@arbreezy (Member) left a comment:

OK, LGTM.
It's working great with a local model.
I suggest we merge the localai backend.

@arbreezy requested a review from a team on April 25, 2023 17:20
@matthisholleville (Contributor) commented:
If @arbreezy tested it and it's OK, it's good for me too. Thank you very much for your contribution @mudler!

@matthisholleville (Contributor) left a comment:

LGTM

@arbreezy merged commit c365c53 into k8sgpt-ai:main on Apr 25, 2023
@mudler deleted the local_models branch on April 25, 2023 18:52
@panpan0000 (Contributor) commented:
Hey @mudler, can you contribute a step-by-step doc guide on downloading a pre-trained model from somewhere and using this feature?
Very excited for this.

Successfully merging this pull request may close these issues: Local model support

5 participants