Keda admission webhook gives error "Deployment.apps not found" when deployment should exist #5973

Closed
jdgeisler opened this issue Jul 22, 2024 · 5 comments · Fixed by #6186
Labels
bug Something isn't working

Comments

@jdgeisler

jdgeisler commented Jul 22, 2024

Report

We are installing a Helm chart that contains a KEDA ScaledObject along with the rest of the manifests (Deployment, Service, etc.). Every once in a while (roughly 1 in 10 installs), the admission webhook rejects the ScaledObject creation because the deployment is not found.

admission webhook "vscaledobject.kb.io" denied the request: Deployment.apps "istio-ingressgateway" not found

This seems like it shouldn't occur, since custom resources in a Helm chart are the last resources installed, meaning the deployment should always be created before the scaled object.

When I check in the cluster, the deployment is successfully created with the correct name, so I am not sure why the admission webhook would fail with this error.

Expected Behavior

The admission webhook should not fail with the above error when the scaled object is created right after the deployment in a helm release.

Actual Behavior

Checking the Kubernetes audit logs, I see that the deployment is correctly created before the scaled object, so I am not sure why the admission webhook cannot find it. Perhaps there is a race condition here?

Timeline of a helm release that failed:

  • Timestamp of deployment created: 2024-07-22T09:07:40.323Z
  • Timestamp of scaled object created: 2024-07-22T09:07:40.570Z
  • Total time between resource creation: 247ms

Timeline of a helm release that passed:

  • Timestamp of deployment created: 2024-07-22T09:07:40.320Z
  • Timestamp of scaled object created: 2024-07-22T09:07:41.072Z
  • Total time between resource creation: 752ms
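
The gaps above can be recomputed directly from the audit-log timestamps. The following is a minimal Go sketch for doing so; the helper name `gap` is made up for illustration and is not part of KEDA or Helm.

```go
package main

import (
	"fmt"
	"time"
)

// gap returns the time elapsed between two RFC 3339 timestamps.
// Hypothetical helper, used only to check the timeline arithmetic.
func gap(created, next string) (time.Duration, error) {
	a, err := time.Parse(time.RFC3339Nano, created)
	if err != nil {
		return 0, err
	}
	b, err := time.Parse(time.RFC3339Nano, next)
	if err != nil {
		return 0, err
	}
	return b.Sub(a), nil
}

func main() {
	// Timestamps copied from the two timelines in this report.
	failed, _ := gap("2024-07-22T09:07:40.323Z", "2024-07-22T09:07:40.570Z")
	passed, _ := gap("2024-07-22T09:07:40.320Z", "2024-07-22T09:07:41.072Z")
	fmt.Println("failed release gap:", failed) // 247ms
	fmt.Println("passed release gap:", passed) // 752ms
}
```

The failing release had the shorter gap between the two creations, which is at least consistent with a timing-related cause.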

Steps to Reproduce the Problem

  1. Deploy a Helm chart that contains the Deployment, Service, ScaledObject, etc.
  2. Intermittently (roughly 1 out of every 10 or more Helm installs), see that the KEDA admission webhook fails with admission webhook "vscaledobject.kb.io" denied the request: Deployment.apps "istio-ingressgateway" not found
  3. Check the resources and audit logs: the deployment is created before the scaled object, yet the request is still denied.

KEDA Version

2.13.1

Kubernetes Version

1.28

Platform

AWS and GCP

@jdgeisler jdgeisler added the bug Something isn't working label Jul 22, 2024

stale bot commented Sep 21, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale All issues that are marked as stale due to inactivity label Sep 21, 2024
@jdgeisler
Author

This issue is still occurring frequently.

I reached out in the KEDA Slack channel and got the following response from @wozniakjan:

this might be due to the fact that the admission control webhooks use an informer cache instead of a direct client. This is the default k8s client in the controller-runtime GetClient(). That means what the cached client sees depends on how quickly the controller's watch loop observes the changes from the kube-apiserver.

it's possible that just for the CREATE verb in the webhook implementation, KEDA might want to use the direct client for this specific reason. I probably would still keep the cached client for UPDATE operations to avoid accidental excessive usage of kube-apiserver.
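
To illustrate the race described in that response, here is a stdlib-only simulation. It is a sketch under stated assumptions, not KEDA or controller-runtime code: `apiServer`, `informerCache`, `Sync`, and `readerFor` are hypothetical names, and `Sync` stands in for the watch loop catching up with the kube-apiserver.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("not found")

// reader is a hypothetical stand-in for a Kubernetes read client.
type reader interface {
	Get(name string) error
}

// apiServer plays the role of kube-apiserver: the source of truth.
type apiServer struct{ objects map[string]bool }

func (s *apiServer) Create(name string) { s.objects[name] = true }

func (s *apiServer) Get(name string) error {
	if !s.objects[name] {
		return fmt.Errorf("Deployment.apps %q %w", name, errNotFound)
	}
	return nil
}

// informerCache is a local copy that only catches up when Sync runs,
// simulating the delay before a watch event is observed.
type informerCache struct {
	src  *apiServer
	seen map[string]bool
}

func (c *informerCache) Sync() {
	for name := range c.src.objects {
		c.seen[name] = true
	}
}

func (c *informerCache) Get(name string) error {
	if !c.seen[name] {
		return fmt.Errorf("Deployment.apps %q %w", name, errNotFound)
	}
	return nil
}

// readerFor sketches the suggestion above: direct reads for CREATE,
// cached reads for everything else to limit load on the API server.
func readerFor(op string, cached, direct reader) reader {
	if op == "CREATE" {
		return direct
	}
	return cached
}

func main() {
	api := &apiServer{objects: map[string]bool{}}
	cache := &informerCache{src: api, seen: map[string]bool{}}
	const target = "istio-ingressgateway"

	api.Create(target) // Helm creates the deployment; the cache has not caught up yet
	fmt.Println("cached read before sync:", cache.Get(target))
	fmt.Println("CREATE via readerFor:   ", readerFor("CREATE", cache, api).Get(target))
	cache.Sync() // the watch event arrives
	fmt.Println("cached read after sync: ", cache.Get(target))
}
```

In this model the cached read fails only in the window between the create and the sync, which would match the intermittent nature of the error.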

@stale stale bot removed the stale All issues that are marked as stale due to inactivity label Sep 23, 2024
@wozniakjan
Member

Today is the bi-weekly community call. I will add this to the agenda, and I have a sense that it has a good chance of being resolved soon. You are welcome to attend as well :)

@jdgeisler
Author

Sounds great, I appreciate it. I will be there 👍

@wozniakjan
Member

The consensus is that we will add a feature flag to the KEDA webhook deployment to configure whether it uses the cached client (default) or the direct client.
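
As a rough sketch of what such a switch could look like, the Go snippet below picks between a cached and a direct lookup based on a command-line flag. The flag name `-use-direct-client`, the `lookup` type, and `chooseLookup` are all assumptions made up for illustration; the flag that actually ships in KEDA may be named and behave differently.

```go
package main

import (
	"flag"
	"fmt"
)

// lookup reports whether a named object is visible to a client.
// Hypothetical type used only for this sketch.
type lookup func(name string) bool

// chooseLookup parses the webhook arguments and returns the cached lookup
// unless the hypothetical -use-direct-client flag is set.
func chooseLookup(args []string, cached, direct lookup) (lookup, error) {
	fs := flag.NewFlagSet("webhook", flag.ContinueOnError)
	useDirect := fs.Bool("use-direct-client", false,
		"read from the API server instead of the informer cache (hypothetical flag)")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if *useDirect {
		return direct, nil
	}
	return cached, nil
}

func main() {
	cached := func(string) bool { return false } // stale cache: object not observed yet
	direct := func(string) bool { return true }  // API server: object exists

	for _, args := range [][]string{{}, {"-use-direct-client"}} {
		get, err := chooseLookup(args, cached, direct)
		if err != nil {
			fmt.Println("flag error:", err)
			continue
		}
		fmt.Printf("args=%v found=%v\n", args, get("istio-ingressgateway"))
	}
}
```

Keeping the cached client as the default preserves current behaviour and API server load for everyone who does not hit the race.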
