
"failed calling webhook" error when creating namespace #147

Closed
yogeek opened this issue Feb 17, 2022 · 10 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@yogeek

yogeek commented Feb 17, 2022

When I create a K8s cluster, some addons are deployed with Jenkins (kubectl, kustomize, or helm...) and others with Argo CD (we are progressively migrating our addon deployments to Argo CD).

  • HNC has been added recently, so it is deployed with Argo CD
  • other legacy addons like external-dns are still deployed with a script in Jenkins

The script installing external-dns fails at the namespace-creation step with this error:

Error from server (InternalError): Internal error occurred: failed calling webhook "namespaces.hnc.x-k8s.io": Post "https://hnc-webhook-service.hnc-system.svc:443/validate-v1-namespace?timeout=10s": dial tcp 10.100.115.157:443: connect: connection refused

If I launch the pipeline again, everything is OK.
I guess this is because the HNC webhook was not ready yet when the namespace creation was executed...

Is it normal that commands like kubectl create ns external-dns fail even though I explicitly added --excluded-namespace=external-dns to exclude this namespace from the HNC scope?
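For illustration, a minimal retry for a pipeline step like this while the webhook is coming up (namespace name, retry count, and sleep interval are illustrative, not the actual Jenkins code):

  # Retry namespace creation until the HNC webhook endpoint starts answering.
  for i in $(seq 1 30); do
    if kubectl create namespace external-dns; then
      break
    fi
    echo "namespace creation rejected (HNC webhook not ready yet?), retry ${i}/30"
    sleep 10
  done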

@erikgb
Contributor

erikgb commented Feb 17, 2022

Thanks for registering this issue @yogeek! I am experiencing the same behavior, and it is kinda expected. The excluded/included namespaces in HNC are processed by the webhook server in HNC. The namespace webhook in HNC is configured to fail closed, as it must be, so when the webhook endpoint is not available, any namespace API request in the cluster will fail. 😞

I wonder if kubernetes/kubernetes#92157 (comment) can be used to improve the processing of the included/excluded namespace configuration in HNC? That would allow the API server to be aware of the namespace configuration in HNC. WDYT @adrianludwin? It will only be supported on K8s 1.21+, but should work as it does now on earlier versions...

Related issues: #68 #94
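For reference, the fail-closed behaviour comes from the webhook registration itself. A trimmed sketch of the namespace webhook entry (the configuration object name and the rule details are assumptions based on a default HNC install; the webhook name, service, and path match the error above):

  apiVersion: admissionregistration.k8s.io/v1
  kind: ValidatingWebhookConfiguration
  metadata:
    name: hnc-validating-webhook-configuration   # name assumed, not copied from a release manifest
  webhooks:
    - name: namespaces.hnc.x-k8s.io
      failurePolicy: Fail          # fail closed: namespace requests are rejected while the endpoint is down
      clientConfig:
        service:
          name: hnc-webhook-service
          namespace: hnc-system
          path: /validate-v1-namespace
      rules:
        - apiGroups: [""]
          apiVersions: ["v1"]
          operations: ["CREATE", "UPDATE", "DELETE"]
          resources: ["namespaces"]
      sideEffects: None
      admissionReviewVersions: ["v1"]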

@adrianludwin
Contributor

Sorry for the delay in responding.

This seems a bit weird to me because starting in HNC v0.9, the webhooks shouldn't touch any namespace that doesn't have the hnc.x-k8s.io/included-namespace: "true" label on it, and the HNC controller adds that label itself. So on a new cluster where HNC isn't running yet, nothing should have that label and the webhook should have no effect; by the time HNC has started its controller, it should also have started its webhook service.

With that said, if this is happening during an update, that might make a lot more sense, since the label will already be present. We're working on making the webhooks highly available, which should help here, but we're not there yet. In the meantime, perhaps you could install the HNC webhooks as the last step in the pipeline? Simply take them out of the original manifest, wait for everything else to be installed (which, incidentally, will give HNC some time to start up), and then install the webhooks last?
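A rough sketch of that ordering (the file split and the deployment name are assumptions about how the install is packaged, not part of the released manifests):

  # 1. Apply everything except the Validating/MutatingWebhookConfiguration objects.
  kubectl apply -f hnc-without-webhooks.yaml
  # 2. Let the rest of the pipeline run, then wait for the HNC manager to be ready.
  kubectl -n hnc-system rollout status deployment/hnc-controller-manager --timeout=180s
  # 3. Install the webhook configurations last.
  kubectl apply -f hnc-webhooks.yaml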

@adrianludwin
Contributor

Oh wait, my bad - only the object webhook is restricted to certain namespaces. The namespace webhook operates on all namespaces.

I agree that kubernetes/kubernetes#92157 (comment) could be used to improve this.

@adrianludwin
Contributor

Hmm, does HNC need to do anything to make this better? E.g. if you're using K8s 1.21+, you can simply modify the webhook manifests yourself to exclude the namespaces you want HNC to ignore.
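For example, on K8s 1.21+ every namespace automatically gets the kubernetes.io/metadata.name label, so a namespaceSelector on the namespace webhook can exclude namespaces before the call ever reaches HNC. A fragment of such a modification (the excluded values are illustrative):

  webhooks:
    - name: namespaces.hnc.x-k8s.io
      namespaceSelector:
        matchExpressions:
          - key: kubernetes.io/metadata.name
            operator: NotIn
            values: ["external-dns"]   # namespaces HNC should ignore (illustrative)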

@erikgb
Contributor

erikgb commented Mar 16, 2022

Hmm, does HNC need to do anything to make this better? E.g. if you're using K8s 1.21+, you can simply modify the webhook manifests yourself to exclude the namespaces you want HNC to ignore.

As a user, I would expect the excluded-namespace flag supplied to HNC to fix this, but that might be a bit complicated to implement? 🤔

@adrianludwin
Contributor

adrianludwin commented Mar 16, 2022 via email

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Jun 15, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Jul 15, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
