feat: running local models #269

Merged: 7 commits merged into k8sgpt-ai:main from the local_models branch on Apr 25, 2023

Conversation

@mudler (Contributor) commented Apr 13, 2023

This started as an untested draft, but it works! See my comment below. This should be enough for testing on an OpenAI-compatible endpoint just by letting the user change the base_url, so it should also work with https://github.com/go-skynet/LocalAI

I'll experiment a bit locally and refine this, adding docs too. cc: @arbreezy @AlexsJones

Closes #188

📑 Description

✅ Checks

  • My pull request adheres to the code style of this project
  • My code requires changes to the documentation
  • I have updated the documentation as required
  • All the tests have passed

ℹ Additional Information
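
For context on the base_url approach described above, here is a minimal sketch (not this PR's actual code) of what pointing an OpenAI-compatible Go client, such as github.com/sashabaranov/go-openai, at a custom endpoint looks like; the URL, dummy key, and model name are illustrative values:

```go
// Minimal sketch, not the PR's code: an OpenAI-compatible client talking to a
// local endpoint (e.g. LocalAI) simply by overriding the base URL.
package main

import (
	"context"
	"fmt"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	cfg := openai.DefaultConfig("not-needed-for-local") // local endpoints ignore the key
	cfg.BaseURL = "http://localhost:8080/v1"            // the configurable base_url

	client := openai.NewClientWithConfig(cfg)
	resp, err := client.CreateChatCompletion(context.Background(), openai.ChatCompletionRequest{
		Model: "ggml-koala-7b-model-q4_0-r2",
		Messages: []openai.ChatCompletionMessage{
			{Role: openai.ChatMessageRoleUser, Content: "Simplify this Kubernetes error: ..."},
		},
	})
	if err != nil {
		fmt.Println("completion error:", err)
		return
	}
	fmt.Println(resp.Choices[0].Message.Content)
}
```

Because only the base URL changes, the same request path can serve both api.openai.com and any OpenAI-compatible local server.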

@mudler changed the title from Allow to set a baseURL for providers to Running local models on Apr 13, 2023
@mudler changed the title from Running local models to feat: running local models on Apr 13, 2023
@mudler (Contributor, Author) commented Apr 14, 2023

works beautifully :)

To test it, I deployed something with a wrong tag so it would fail to pull the image and bring up the service. This was the response (it took some time, but hey, it's now all local!):

```
base ❯ ./k8sgpt analyze --explain
 100% |████████████████████████████████████████████████████████████| (2/2, 1 it/min)

0 llama/llama-69b7785db9-rlv9g(Deployment/llama)
- Error: Back-off pulling image "quay.io/go-skynet/llama-cli:fff"

The message is: Back-off pulling image "quay.io/go-skynet/llama-cli:fff"

The solution is:

The error message indicates that there is a conflict in the image name being used in the Kubernetes deployment. To resolve this issue, you can either rename the image or update the image tag in the Kubernetes deployment file.

For example, if the image name is "llama-cli:fff", you can rename it to something like "llama-cli:tag-1". Then, update the image tag in the Kubernetes deployment file to reflect the new name.

Alternatively, if the image name is not a critical part of the deployment, you can rename the image without affecting the deployment.

1 llama/llama(llama)
- Error: Service has not ready endpoints, pods: [Pod/llama-69b7785db9-rlv9g], expected 1

Service has not ready endpoints, pods: [Pod/llama-69b7785db9-rlv9g], expected 1

The error message is indicating that the service does not have any available endpoints, and there is only one pod that is running. This means that there is no endpoint available to handle the traffic.

To resolve this issue, you can either:

1. Deploy a new pod to the service, or
2. Update the endpoints of the existing pod to include the new endpoint.

If you are using the first option, you can use the kubectl apply command to deploy a new pod to the service. For example:

    kubectl apply -f https://raw.githubusercontent.com/Homebrew/kubeadm/main/examples/deploy-pod.yaml

If you are using the second option, you can use the kubectl apply command to update the endpoints of the existing pod. For example:

    kubectl apply -f https://raw.githubusercontent.com/Homebrew/kubeadm/main/examples/update-endpoints.yaml

Note that you need to replace the values in the examples file with your own values.
```

My setup:

```
base ❯ cat ~/.k8sgpt.yaml
ai:
    providers:
        - base_url: http://localhost:8080/v1
          model: ggml-koala-7b-model-q4_0-r2
          name: openai
kubeconfig: ....
kubecontext: ""
```

To bring up the API server (locally), I used the following prompt template for the model:

```
BEGINNING OF CONVERSATION: USER: {{.Input}} GPT:
```
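
The `{{.Input}}` placeholder above is a Go text/template action. As a self-contained illustration (not LocalAI's or k8sgpt's actual code), this is roughly how such a model prompt template gets rendered; the sample question is made up:

```go
// Illustration only: rendering a koala-style prompt template with Go's text/template.
// The Input field mirrors the {{.Input}} placeholder above.
package main

import (
	"os"
	"text/template"
)

func main() {
	const promptTemplate = "BEGINNING OF CONVERSATION: USER: {{.Input}} GPT:"

	tmpl := template.Must(template.New("prompt").Parse(promptTemplate))

	// The raw user question is wrapped in the model-specific conversation format.
	data := struct{ Input string }{Input: "Why is my pod stuck in ImagePullBackOff?"}
	if err := tmpl.Execute(os.Stdout, data); err != nil {
		panic(err)
	}
	// Prints: BEGINNING OF CONVERSATION: USER: Why is my pod stuck in ImagePullBackOff? GPT:
}
```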

Will polish and add docs alongside the PR, stay tuned!

@mudler marked this pull request as ready for review on April 14, 2023 22:35
@mudler requested review from a team as code owners on April 14, 2023 22:35
@mudler (Contributor, Author) commented Apr 14, 2023

Looks good here! Let me know if this is the right direction @AlexsJones. I've also updated the docs; feedback is welcome :)

@AlexsJones (Member) commented:
Looks very exciting @mudler. @matthisholleville @arbreezy, perhaps we use this as another reason to have a configuration object for AI rather than passing more strings. Also @mudler, I think having your own AI struct might be easier (yes, it's a little more work, but it eventually means less manipulation of openai.go).

@arbreezy (Member) commented:
This adds great flexibility to k8sgpt!
Just a small comment, @mudler: since there is no authentication happening with llama, we need to tweak the auth cmd, skipping the password/token/api-key user prompt if the llama provider is enabled in k8sgpt.
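
A hypothetical sketch of that tweak follows; the function name, prompt text, and backend check are illustrative, not k8sgpt's actual auth command:

```go
// Hypothetical sketch, not k8sgpt's actual code: skip the password/token/api-key
// prompt when the selected backend (e.g. a local llama endpoint) needs no authentication.
package main

import (
	"fmt"
	"os"
	"strings"

	"golang.org/x/term"
)

func promptForKey(backend string) (string, error) {
	if backend == "llama" {
		return "", nil // local backend: nothing to collect
	}
	fmt.Printf("Enter %s Key: ", backend)
	key, err := term.ReadPassword(int(os.Stdin.Fd()))
	fmt.Println()
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(key)), nil
}

func main() {
	key, err := promptForKey("llama")
	if err != nil {
		fmt.Fprintln(os.Stderr, "auth error:", err)
		os.Exit(1)
	}
	fmt.Printf("stored key: %q\n", key) // empty for the llama backend
}
```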

@mudler (Contributor, Author) commented Apr 16, 2023

Thanks for the feedback :) I'll work on it in separate commits and get back to you as soon as possible!

@mudler (Contributor, Author) commented Apr 16, 2023

@arbreezy @AlexsJones do you want me to keep the base_url option for the OpenAI backend, or shall I leave it untouched? I'm fine either way. I think it might be useful, e.g. when using proxies, but it's your call.

@AlexsJones (Member) commented:
> @arbreezy @AlexsJones do you want me to keep the base_url option for the OpenAI backend, or shall I leave it untouched? I'm fine either way. I think it might be useful, e.g. when using proxies, but it's your call.

I think we can leave it in there unused; as you say, it might be useful eventually!

@mudler force-pushed the local_models branch 2 times, most recently from 25a00e7 to d1c32f8, on April 17, 2023 16:27
@mudler (Contributor, Author) commented Apr 17, 2023

@AlexsJones @arbreezy I implemented your suggestions and added some comments inline with the changes introduced.

@mudler (Contributor, Author) commented Apr 17, 2023

This is how it works now:

1. Deploy the llama-cli API (with a model, and a template for it) following the instructions in the README, then register it with `k8sgpt auth`:

```
base ❯ ./k8sgpt auth --backend llama --model ggml-koala-7b-model-q4_0-r2
Using llama as backend AI provider
Enter llama API Base URL (e.g. `http://localhost:8080/v1`): http://localhost:8080/v1
Provider updated
key added
```

2. Enjoy local, free inference without any API cost :)

```
k8sgpt analyze --explain --backend llama
100% |██████████████████████████████████████████████████████████████████████████████████| (2/2, 46 it/hr)

0 llama/llama(llama)
- Error: Service has not ready endpoints, pods: [Pod/llama-69b7785db9-rlv9g], expected 1

The service in your Kubernetes cluster does not have any available endpoints. This means that the service is not running or listening on any available ports.

To resolve this issue, you need to make sure that the service is running and listening on the correct ports. You can check this by running the command kubectl get pod <pod-name> -n <namespace> -o yaml, where <pod-name> is the name of the pod that the service is running in, and <namespace> is the name of the namespace where the pod is running.

If the service is not running or listening on any available ports, you need to create or update the service resource file to include the correct ports and endpoints. You can also check that the service resource file is correctly configured and that the service is running in the correct namespace.

If the service is running and listening on the correct ports, you need to check that the pod that the service is running in is running and that the pod is listening on the correct ports. If the pod is not running or listening on the correct ports, you need to create or update the pod resource file to include the correct ports and endpoints.

If the pod is running and listening on the correct ports, you need to check that the service is running in the correct namespace and that the service is configured correctly. If the service is not running in the correct namespace or is not configured correctly, you need to create or update the service resource file to include the correct namespace and configuration.

If the service is running in the correct namespace and is configured correctly, you need to check that the service is listening on the correct ports. If the service is not listening on the correct ports, you need to check that the service resource file is correctly configured and that the service is listening on the correct ports.

If the service is listening on the correct ports, you need to check that the pod that the service is running in is listening on the correct ports. If the pod is not listening on the correct ports, you need to check that the pod resource file is correctly configured and that the pod is listening on the correct ports.

If the pod is listening on the correct ports, you need to check that the pod is running. If the pod is not running, you need to check that the pod resource file is correctly configured and that the pod is running.

If the pod is running, you need to check that the pod

1 llama/llama-69b7785db9-rlv9g(Deployment/llama)
- Error: Back-off pulling image "quay.io/go-skynet/llama-cli:fff"

This error message is indicating that there was a back-off from pulling the image "quay.io/go-skynet/llama-cli:fff" from the Docker image registry. This means that the image was previously attempted to be pulled but failed due to a lack of available resources.

To resolve this issue, you can try the following solutions:

1. Check the resource constraints of your Kubernetes cluster and ensure that there are sufficient resources available to pull the image.
2. Try pulling the image again at a later time when the resources are available.
3. If the issue persists, consider scaling down the resources of your cluster or upgrading to a more powerful cluster.
4. Check if the image is already available in the cluster, if it is then try to use it instead of pulling it again.
5. Check the version of the image, if it is an older version, try upgrading to the latest version.

It's also important to check if there are any network connectivity issues or if the image is not accessible from the Kubernetes cluster.
```

@matthisholleville (Contributor) commented:
Very useful feature! Good job @mudler!!

@mudler force-pushed the local_models branch 3 times, most recently from f6ec65c to dbb52d0, on April 20, 2023 20:51
@mudler (Contributor, Author) commented Apr 20, 2023

Rebased from main, docs updated, and gave it another shot locally!

```
~/_git/k8sgpt local_models*
base ❯ ./k8sgpt auth --backend localai --model ggml-koala-7b-model-q4_0-r2.bin --baseurl http://localhost:8080/v1
Using localai as backend AI provider
Enter localai Key: New provider added
key added

~/_git/k8sgpt local_models*
base ❯ ./k8sgpt analyze --explain --backend localai
100% |█████████████████████████████████████████████████████████████████████████████████████████████| (2/2, 2 it/min)

0 llama/llama(llama)
- Error: Service has not ready endpoints, pods: [Pod/llama-69b7785db9-rlv9g], expected 1

The error message is saying that the Kubernetes service for the pod "llama-69b7785db9-rlv9g" has not yet been created. To fix this, you can either wait for the service to be created or manually create the service.

If you want to wait for the service to be created, you can check the status of the service in the Kubernetes cluster using the kubectl get pods command and check if the pod is in a running state. Once the pod is in a running state, the service will be created automatically.

If you want to manually create the service, you can use the kubectl create service command to create the service and specify the name, type, selector, ports, and other details of the service.

It's also important to note that the "expected 1" part of the error message refers to the number of endpoints that the service is expected to have. If the pod is not running or if there is a problem with the pod, it will not have any endpoints, which is why the service is not ready.

1 llama/llama-69b7785db9-rlv9g(Deployment/llama)
- Error: Back-off pulling image "quay.io/go-skynet/llama-cli:fff"

The error message "Back-off pulling image "quay.io/go-skynet/llama-cli:fff" indicates that there was a failure to pull an image from a Docker registry. The back-off mechanism is used to avoid too many failed pulls and to avoid overloading the Docker registry.

To resolve the issue, you can try the following steps:

1. Check that the Docker image exists and is accessible in the Docker registry.
2. Check that the Docker image has the correct tag and that it matches the image you are trying to pull.
3. Try pulling the Docker image again after waiting for a few minutes.
4. If the problem persists, you can try using a different Docker registry or Docker image.

If none of the above steps work, you can also try checking the Docker logs for more information about the error.
```

@mudler (Contributor, Author) commented Apr 20, 2023

@arbreezy just noticed that you were already setting the baseURL in #309 🤦

I created #310 in case we want to merge that separately so our PRs don't conflict with each other. I'm fine either way!

@mudler (Contributor, Author) commented Apr 21, 2023

There was a small bug introduced in the #310 PR; sorry about that, I think it was an oversight between cherry-picks and rebases here. I've added a fix for it in this PR (see my comment above).

@AlexsJones @arbreezy @matthisholleville rebased, re-tested locally, good to go here! Please let me know if anything is missing!

Signed-off-by: mudler <mudler@mocaccino.org>
Signed-off-by: mudler <mudler@mocaccino.org>
@matthisholleville (Contributor) commented:
I'll test this tonight! Thank you for your contribution

@arbreezy (Member) commented:
I am back tomorrow; I will have a look as well.
Thanks again @mudler.

@matthisholleville (Contributor) commented:
Hi,

I just tested it and here's the error I encountered:

```
➜  k8sgpt git:(9b914fb) ./k8sgpt analyze --explain --backend localai --namespace k8sgpt --no-cache
Warning: Legacy config file at `/Users/matthisholleville/.k8sgpt.yaml` detected! This file will be ignored!
   0% |                                                             | (0/1, 0 it/hr) [0s:0s]
Error: failed while calling AI provider localai: error, json: cannot unmarshal string into Go struct field ErrorResponse.error of type map[string]json.RawMessage
```

Here's my setup:

* using the [chatgpt-openai/ggml-model-whisper](https://huggingface.co/chatgpt-openai/ggml-model-whisper/blob/main/ggml-model-gpt-2-345M.bin) model
* no error log on LocalAI container

@mudler (Contributor, Author) commented Apr 24, 2023

> Hi,
>
> I just tested it and here's the error I encountered:
>
> ```
> ➜  k8sgpt git:(9b914fb) ./k8sgpt analyze --explain --backend localai --namespace k8sgpt --no-cache
> Warning: Legacy config file at `/Users/matthisholleville/.k8sgpt.yaml` detected! This file will be ignored!
>    0% |                                                             | (0/1, 0 it/hr) [0s:0s]
> Error: failed while calling AI provider localai: error, json: cannot unmarshal string into Go struct field ErrorResponse.error of type map[string]json.RawMessage
> ```

Good catch! That was a miss on my side. I just traced it back, and the API now returns OpenAI error types: mudler/LocalAI#80
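
For illustration, the failure can be reproduced with a minimal struct that mirrors the field named in the error message (the real `ErrorResponse` lives in the OpenAI client library; the JSON bodies below are assumed examples): a bare-string `error` field cannot be decoded into a map-typed field, while an OpenAI-style error object can.

```go
// Hedged reproduction of the unmarshal error above; ErrorResponse only mirrors the
// field named in the message, it is not the library's exact definition.
package main

import (
	"encoding/json"
	"fmt"
)

type ErrorResponse struct {
	Error map[string]json.RawMessage `json:"error"`
}

func main() {
	var er ErrorResponse

	// Pre-fix style body: "error" is a bare string, so decoding fails with
	// "cannot unmarshal string into Go struct field ErrorResponse.error ...".
	bad := []byte(`{"error":"could not load model"}`)
	fmt.Println("string error:", json.Unmarshal(bad, &er))

	// OpenAI-compatible body: "error" is an object, so decoding succeeds.
	good := []byte(`{"error":{"message":"could not load model","type":"invalid_request_error"}}`)
	fmt.Println("object error:", json.Unmarshal(good, &er))
}
```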

> Here's my setup:
>
> * using the [chatgpt-openai/ggml-model-whisper](https://huggingface.co/chatgpt-openai/ggml-model-whisper/blob/main/ggml-model-gpt-2-345M.bin) model.

This model is for "whisper"; it is not for LLMs/GPTs. I've updated the docs with an e2e example for gpt4all-j: https://github.com/go-skynet/LocalAI#example-use-gpt4all-j-model

> * no error log on LocalAI container

My bad, the docs weren't aligned: you need to turn on debug mode (by either setting the DEBUG env var to true, or specifying it in the CLI). I've updated the docs as well! (https://github.com/go-skynet/LocalAI#api)

Signed-off-by: mudler <mudler@mocaccino.org>
@mudler requested a review from a team as a code owner on April 24, 2023 21:48
@arbreezy (Member) left a comment:

OK, LGTM.
It's working great with a local model.
I suggest we merge the localai backend.

@arbreezy requested a review from a team on April 25, 2023 17:20
@matthisholleville (Contributor) commented:
If @arbreezy tested it and it's OK, it's good for me too. Thank you very much for your contribution @mudler!

@matthisholleville (Contributor) left a comment:

LGTM

@arbreezy merged commit c365c53 into k8sgpt-ai:main on Apr 25, 2023
@mudler deleted the local_models branch on April 25, 2023 18:52
@panpan0000 (Contributor) commented:
Hey @mudler, can you contribute a step-by-step doc guide on downloading a pre-trained model from somewhere and using this feature?
Very excited for this.

Successfully merging this pull request may close these issues: Local model support

5 participants