Bug: empty analyze results when running with operator on AKS + error is raised #98

Closed
3 of 4 tasks
eyalsofer opened this issue May 14, 2023 · 31 comments
Labels
bug (Something isn't working), feature (New functionality)

Comments

@eyalsofer

Checklist

  • I've searched for similar issues and couldn't find anything matching
  • I've included steps to reproduce the behavior

Affected Components

  • K8sGPT (CLI)
  • K8sGPT Operator

K8sGPT Version

v1alpha1

Kubernetes Version

v1.25.6

Host OS and its Version

Ubuntu

Steps to reproduce

  1. Install the K8sGPT operator on AKS
  2. Run kubectl get results -o json | jq .

Expected behaviour

A list with analyzed data is returned, and no error is raised in the k8sgpt deployment.

Actual behaviour

Returns an empty list:
{
  "apiVersion": "v1",
  "items": [],
  "kind": "List",
  "metadata": {
    "resourceVersion": ""
  }
}

  • An error is raised: Error: AI provider openai not specified in configuration. Please run k8sgpt auth
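The empty-list symptom above can be checked mechanically. The following is a minimal sketch, assuming the output of `kubectl get results -o json` has been captured as a string; the function name `summarize_results` is hypothetical and not part of k8sgpt.

```python
import json


def summarize_results(kubectl_json: str) -> str:
    """Summarize the JSON printed by `kubectl get results -o json`."""
    doc = json.loads(kubectl_json)
    items = doc.get("items", [])
    if not items:
        return "no Result objects found"
    # Each item is expected to carry its name under metadata.name.
    names = [item.get("metadata", {}).get("name", "<unnamed>") for item in items]
    return f"{len(items)} Result object(s): " + ", ".join(names)


# The empty list reported in this issue.
EMPTY_LIST = """
{
  "apiVersion": "v1",
  "items": [],
  "kind": "List",
  "metadata": {"resourceVersion": ""}
}
"""

if __name__ == "__main__":
    print(summarize_results(EMPTY_LIST))
```

An empty list only says that no Result objects exist in the queried namespace; the operator and deployment logs are still needed to tell why.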

Additional Information

Spoke with Aris on this issue (Aris: "we've recently refactored the way we interact with k8sgpt service and I think we need to tweak it a bit")

@matthisholleville
Contributor

Hello, thank you for your message. Are you trying to use a backend other than OpenAI?

Can you give more details about the K8sGPT manifest you are deploying?

apiVersion: core.k8sgpt.ai/v1alpha1
kind: K8sGPT
metadata:
  name: k8sgpt-sample
spec:
  model: gpt-3.5-turbo
...

@eyalsofer
Author

eyalsofer commented May 14, 2023

Yes, I'm trying to use an Azure OpenAI instance.
Here is my manifest:

apiVersion: core.k8sgpt.ai/v1alpha1
kind: K8sGPT
metadata:
  name: k8sgpt-sample
  namespace: kube-system
spec:
  model: gpt-3.5-turbo
  backend: azureopenai
  baseurl: https://mdc-generative-ai.openai.azure.com/
  engine: eyal-test
  noCache: false
  version: v0.3.0
  enableAI: true
  secret:
    name: k8sgpt-sample-secret
    key: openai-api-key

@AlexsJones added the bug label May 14, 2023
@AlexsJones
Member

Moving this to the operator repo

@AlexsJones transferred this issue from k8sgpt-ai/k8sgpt May 14, 2023
@AlexsJones
Member

@eyalsofer how does it behave if you run this from the CLI rather than the operator?

@matthisholleville
Contributor

@AlexsJones The choice of backend is not supported by the client API it seems.

@AlexsJones
Member

@AlexsJones The choice of backend is not supported by the client API it seems.

It should be as per K8sGPT code.

	clients = []IAI{
		&OpenAIClient{},
		&AzureAIClient{},
		&LocalAIClient{},
		&NoOpAIClient{},
	}
	Backends = []string{
		"openai",
		"localai",
		"azureopenai",
		"noopai",
	}

Unfortunately, I can't test this without access (I have applied).
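As a small illustration of the backend list quoted above, a configured backend name can be checked against it before anything is deployed. This is a sketch only: the names are copied from the Go snippet in this comment, the exact-match comparison is an assumption, and `check_backend` is a hypothetical helper, not a k8sgpt function.

```python
# Backend names copied from the Go snippet quoted in this thread.
KNOWN_BACKENDS = ("openai", "localai", "azureopenai", "noopai")


def check_backend(name: str) -> str:
    """Return the name if it is a known backend, otherwise raise ValueError."""
    # Assumption: names are matched exactly (case-sensitive).
    if name not in KNOWN_BACKENDS:
        expected = ", ".join(KNOWN_BACKENDS)
        raise ValueError(f"unknown backend {name!r}; expected one of: {expected}")
    return name


if __name__ == "__main__":
    print(check_backend("azureopenai"))
```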

@AlexsJones added the feature label May 15, 2023
@AlexsJones
Member

What do the operator logs say, @eyalsofer?

@eyalsofer
Author

This is the error I see in the pod's logs:
Error: AI provider openai not specified in configuration. Please run k8sgpt auth
{"level":"info","ts":1684097823.841231,"caller":"server/log.go:49","msg":"request failed","duration_ms":0,"method":"/schema.v1.Server/Analyze","request":"backend:\"openai\" explain:true max_concurrency:10 output:\"json\"","remote_addr":"10.244.0.75:44364","status_code":2}
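The log line above is useful because it shows which backend the k8sgpt deployment was actually asked to use ("openai"), even though the manifest requested azureopenai. A minimal sketch of pulling that field out follows; it assumes the inner quotes of the `request` field were backslash-escaped in the original log, and `requested_backend` is a hypothetical helper name.

```python
import json
import re

# Log line from this comment; the \" escapes inside "request" are an assumption
# about how the line looked before it was pasted into the issue.
LOG_LINE = r'{"level":"info","ts":1684097823.841231,"caller":"server/log.go:49","msg":"request failed","duration_ms":0,"method":"/schema.v1.Server/Analyze","request":"backend:\"openai\" explain:true max_concurrency:10 output:\"json\"","remote_addr":"10.244.0.75:44364","status_code":2}'


def requested_backend(log_line: str):
    """Return the backend named in the Analyze request, or None if absent."""
    entry = json.loads(log_line)
    match = re.search(r'backend:"([^"]+)"', entry.get("request", ""))
    return match.group(1) if match else None


if __name__ == "__main__":
    print(requested_backend(LOG_LINE))
```

Comparing this value with `spec.backend` in the K8sGPT resource is a quick way to see whether the backend setting reached the deployment.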

@AlexsJones
Member

AlexsJones commented May 15, 2023

Can you please try version 0.0.14 of the operator?

  1. Delete the K8sGPT CRD
  2. helm upgrade the operator
  3. Redeploy K8sGPT

@eyalsofer
Author

OK, I deleted the CRD and upgraded the operator's Helm chart to 0.0.14.
I'm still seeing errors. What do you mean by redeploying K8sGPT? Maybe I'm missing a step.
Here are the pod's logs:

2023-05-15T09:27:03Z ERROR Reconciler error {"controller": "k8sgpt", "controllerGroup": "core.k8sgpt.ai", "controllerKind": "K8sGPT", "K8sGPT": {"name":"k8sgpt-sample","namespace":"kube-system"}, "namespace": "kube-system", "name": "k8sgpt-sample", "reconcileID": "1dd2277a-c121-494a-a506-a93a3e012b1f", "error": "failed to call Analyze RPC: rpc error: code = Unknown desc = AI provider not specified in configuration"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274
Finished Reconciling K8sGPT with error: failed to call Analyze RPC: rpc error: code = Unknown desc = AI provider not specified in configuration
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235
Finished Reconciling K8sGPT
E0515 09:32:27.151914 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: Failed to watch *v1alpha1.K8sGPT: the server could not find the requested resource (get k8sgpts.core.k8sgpt.ai)
W0515 09:32:28.139892 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: failed to list *v1alpha1.K8sGPT: the server could not find the requested resource (get k8sgpts.core.k8sgpt.ai)
E0515 09:32:28.139932 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: Failed to watch *v1alpha1.K8sGPT: failed to list *v1alpha1.K8sGPT: the server could not find the requested resource (get k8sgpts.core.k8sgpt.ai)

@AlexsJones
Member

What version of K8sGPT is in the CR? It needs to be at least v0.3.0

@eyalsofer
Author

Yes, it's v0.3.0:
Image: ghcr.io/k8sgpt-ai/k8sgpt:v0.3.0

@eyalsofer
Author

I see these errors in the k8sgpt-operator-system namespace:
Finished Reconciling K8sGPT with error: failed to call Analyze RPC: rpc error: code = Unknown desc = failed while calling AI provider azureopenai: error, status code: 401, message: Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource.
2023-05-15T10:25:36Z ERROR Reconciler error {"controller": "k8sgpt", "controllerGroup": "core.k8sgpt.ai", "controllerKind": "K8sGPT", "K8sGPT": {"name":"k8sgpt-sample","namespace":"kube-system"}, "namespace": "kube-system", "name": "k8sgpt-sample", "reconcileID": "696c4957-62ae-47cd-8419-8cafc6d8e5f5", "error": "failed to call Analyze RPC: rpc error: code = Unknown desc = failed while calling AI provider azureopenai: error, status code: 401, message: Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource."}

@eyalsofer
Author

I will try to recreate the secret.

@eyalsofer
Author

I got the error again after recreating the secret.

@AlexsJones
Member

Unfortunately, I am limited in assisting until I have access to the API, which could be weeks.

It would be helpful if you could reproduce this without the operator, using just the K8sGPT CLI.

@eyalsofer
Author

Yesterday I tried k8sgpt with an OpenAI token and it worked fine.
I'll try now with azureopenai.

@eyalsofer
Author

eyalsofer commented May 15, 2023

By the way, I found a bug in the k8sgpt CLI auth command:
k8sgpt auth --backend azureopenai --baseurl https://<your Azure OpenAI endpoint> --engine <deployment_name> --model <model_name>
It throws: Error: unknown flag: --backend
So I combined the engine after the baseurl and it worked.
The command is from the Azure OpenAI section here.

@eyalsofer
Author

Reproduced the issue with the k8sgpt CLI:
image

@arbreezy
Member

The auth CLI arguments changed recently, and we should reflect this in the README file.

It looks like you haven't configured the provider properly; now you get a 404 instead of a 401.

I believe @AlexsJones will be able to help with testing it in the operator and CLI.

@AlexsJones
Member

If you update to k8sgpt version 0.3.1, you can use this command:

k8sgpt auth new --backend azureopenai --baseurl <> --engine <> --model <>

It is a bit clunky to work with analyze when you have to specify the backend, which is something I will look at improving.

@eyalsofer
Author

I tried to remove the azureopenai auth settings in order to set them up again:
image
However, I did manage to remove the openai auth settings.

@AlexsJones
Member

I have a PR to make this much easier. Once it's reviewed, I'll try to get it in to make your life easier: k8sgpt-ai/k8sgpt#427

@AlexsJones
Member

I have now tested this in the operator:

kubectl apply -f - << EOF
apiVersion: core.k8sgpt.ai/v1alpha1
kind: K8sGPT
metadata:
  name: k8sgpt-sample
  namespace: k8sgpt-operator-system
spec:
  model: gpt-35-turbo
  backend: azureopenai
  baseUrl: https://k8sgpt.openai.azure.com/
  engine: llm
  noCache: false
  version: v0.3.2
  enableAI: true
  secret:
    name: k8sgpt-sample-secret
    key: azure-api-key
EOF

I can confirm azureopenai as a backend is working cc @arbreezy
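For comparison, the spec reported earlier in this thread and the working spec above can be diffed field by field. The sketch below copies both specs from the thread into plain dictionaries (no YAML parser is used); the helper names are hypothetical, and the thread does not confirm which of the differences, if any, caused the original failure.

```python
# Spec from the original report in this thread.
REPORTED_SPEC = {
    "model": "gpt-3.5-turbo",
    "backend": "azureopenai",
    "baseurl": "https://mdc-generative-ai.openai.azure.com/",
    "engine": "eyal-test",
    "noCache": False,
    "version": "v0.3.0",
    "enableAI": True,
    "secret": {"name": "k8sgpt-sample-secret", "key": "openai-api-key"},
}

# Spec from the manifest confirmed as working above.
WORKING_SPEC = {
    "model": "gpt-35-turbo",
    "backend": "azureopenai",
    "baseUrl": "https://k8sgpt.openai.azure.com/",
    "engine": "llm",
    "noCache": False,
    "version": "v0.3.2",
    "enableAI": True,
    "secret": {"name": "k8sgpt-sample-secret", "key": "azure-api-key"},
}


def key_differences(reported: dict, working: dict):
    """Return (keys only in reported, keys only in working), both sorted."""
    return (
        sorted(set(reported) - set(working)),
        sorted(set(working) - set(reported)),
    )


def value_differences(reported: dict, working: dict):
    """Return the sorted shared keys whose values differ."""
    shared = set(reported) & set(working)
    return sorted(k for k in shared if reported[k] != working[k])


if __name__ == "__main__":
    print(key_differences(REPORTED_SPEC, WORKING_SPEC))
    print(value_differences(REPORTED_SPEC, WORKING_SPEC))
```

Endpoint, engine and secret values are expected to differ between deployments; the field-name spelling (`baseurl` versus `baseUrl`), the model name and the version are the differences worth a closer look.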

image

@eyalsofer
Author

Thanks @AlexsJones.
I upgraded the operator and now I get a quota issue with my Azure OpenAI resource.
I will solve it and update you.
Thanks again for the fast mitigation!

@eyalsofer
Author

By the way, this is the error I'm seeing:
Finished Reconciling K8sGPT with error: failed to call Analyze RPC: rpc error: code = Unknown desc = exhausted API quota for AI provider azureopenai: error, status code: 429, message: Requests to the Creates a completion for the chat message Operation under Azure OpenAI API version 2023-03-15-preview have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 59 seconds.

@arbreezy
Member

Hey @eyalsofer, as the error message states, you've reached your quota in the Azure OpenAI API. This is unrelated to the operator's functionality, so please check your Azure subscription :)
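The three provider errors seen in this thread (401, 404 and 429) each point to a different cause, so a rough triage can be sketched from the reconcile error text alone. The hints below are paraphrased from the comments in this issue and are not an official mapping; `triage` and `HINTS` are hypothetical names.

```python
import re

# Hints paraphrased from this thread; not an exhaustive or official mapping.
HINTS = {
    401: "invalid key or wrong API endpoint: check the secret and the base URL",
    404: "provider not configured properly: check the auth/provider settings",
    429: "rate limit or quota reached on the Azure OpenAI resource",
}

# Error text from the comment above.
SAMPLE = (
    "Finished Reconciling K8sGPT with error: failed to call Analyze RPC: "
    "rpc error: code = Unknown desc = exhausted API quota for AI provider "
    "azureopenai: error, status code: 429, message: Requests to the Creates a "
    "completion for the chat message Operation under Azure OpenAI API version "
    "2023-03-15-preview have exceeded call rate limit of your current OpenAI S0 "
    "pricing tier. Please retry after 59 seconds."
)


def triage(message: str) -> dict:
    """Extract the HTTP status code and retry delay from a reconcile error."""
    status = re.search(r"status code: (\d{3})", message)
    retry = re.search(r"retry after (\d+) seconds", message)
    code = int(status.group(1)) if status else None
    return {
        "status_code": code,
        "hint": HINTS.get(code, "no hint for this status code"),
        "retry_after_s": int(retry.group(1)) if retry else None,
    }


if __name__ == "__main__":
    print(triage(SAMPLE))
```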

@eyalsofer
Author

eyalsofer commented May 21, 2023

Hi @arbreezy,
Quick update: the logs of the k8sgpt-operator and k8sgpt-deployment look OK:
k8sgpt-deployment:
{"level":"info","ts":1684669226.123519,"caller":"server/log.go:50","msg":"request completed","duration_ms":219,"method":"/schema.v1.Server/Analyze","request":"backend:\"azureopenai\" explain:true max_concurrency:10 output:\"json\"","remote_addr":"10.244.0.239:52994"}
{"level":"info","ts":1684669256.259569,"caller":"server/log.go:50","msg":"request completed","duration_ms":95,"method":"/schema.v1.Server/Analyze","request":"backend:\"azureopenai\" explain:true max_concurrency:10 output:\"json\"","remote_addr":"10.244.0.239:52994"}
k8sgpt-operator:
Checking if defaultk8sapiapp is still relevant
Checking if defaultk8sapiappdeployment is still relevant
Checking if defaultk8sapiappdeployment745688cf74p6wx9 is still relevant
Checking if defaultk8sapiappdeployment56c75f487z4xfh is still relevant
Checking if defaultk8sapiappdeployment745688cf74d4tpf is still relevant
Finished Reconciling K8sGPT

But the output of kubectl get results -o json | jq . is still empty:
{
  "apiVersion": "v1",
  "items": [],
  "kind": "List",
  "metadata": {
    "resourceVersion": ""
  }
}
@AlexsJones, am I doing something wrong here?

@lioryantov

lioryantov commented Jun 8, 2023

Hi, I followed the instructions at https://github.com/k8sgpt-ai/k8sgpt and the YouTube video by @AlexsJones (https://www.youtube.com/watch?v=hb4du-oK0KY) to connect my AKS cluster to the Azure OpenAI service.
I installed the operator, but creating a K8sGPT resource with version: v0.3.6 caused the operator pod to print an error:

2023-06-08T11:28:15.604471896Z Finished Reconciling K8sGPT with error: failed to call Analyze RPC: rpc error: code = Unimplemented desc = unknown service schema.v1.Server
2023-06-08T11:28:15.604718611Z 2023-06-08T11:28:15Z	ERROR	Reconciler error	{"controller": "k8sgpt", "controllerGroup": "core.k8sgpt.ai", "controllerKind": "K8sGPT", "K8sGPT": {"name":"k8sgpt-azureopenai-new","namespace":"k8sgpt-operator-system"}, "namespace": "k8sgpt-operator-system", "name": "k8sgpt-azureopenai-new", "reconcileID": "efa43bfb-0739-4af3-94bc-be58c525458a", "error": "failed to call Analyze RPC: rpc error: code = Unimplemented desc = unknown service schema.v1.Server"}
2023-06-08T11:28:15.604738413Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
2023-06-08T11:28:15.604744713Z 	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:324
2023-06-08T11:28:15.604761514Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
2023-06-08T11:28:15.604766614Z 	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265
2023-06-08T11:28:15.604771915Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
2023-06-08T11:28:15.604778315Z 	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226

When I changed it to v0.3.2, the K8sGPT reconciliation completed successfully and a Result was created.

@arbreezy
Member

arbreezy commented Jun 8, 2023

The operator is currently compatible with k8sgpt <= v0.3.4, so the error is expected. Thanks for flagging it.

@lioryantov

@arbreezy Thank you for clarifying it.
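The version constraints mentioned in this thread can be turned into a simple pre-flight check on `spec.version`. This is a sketch under stated assumptions: the lower bound (v0.3.0) and upper bound (v0.3.4) come from two separate comments made at different times, they apply only to the operator releases discussed here, and `is_supported` is a hypothetical helper.

```python
def parse_version(tag: str) -> tuple:
    """Turn a tag such as 'v0.3.4' into a comparable tuple (0, 3, 4)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


# Bounds taken from comments in this thread; they are not a general rule.
MIN_SUPPORTED = parse_version("v0.3.0")
MAX_SUPPORTED = parse_version("v0.3.4")


def is_supported(tag: str) -> bool:
    """Check whether a k8sgpt version falls inside the range stated here."""
    return MIN_SUPPORTED <= parse_version(tag) <= MAX_SUPPORTED


if __name__ == "__main__":
    for tag in ("v0.3.2", "v0.3.6"):
        print(tag, is_supported(tag))
```

With these bounds, v0.3.2 passes and v0.3.6 does not, which matches the behaviour reported above.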
