clusterctl doesn't wait for deployments to be available #4474

Closed
CecileRobertMichon opened this issue Apr 13, 2021 · 8 comments · Fixed by #4934
Labels

  • area/clusterctl: Issues or PRs related to clusterctl
  • kind/bug: Categorizes issue or PR as related to a bug.
  • priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

@CecileRobertMichon
Contributor

What steps did you take and what happened:
[A clear and concise description on how to REPRODUCE the bug.]

➜  ~ clusterctl init --infrastructure azure
Fetching providers
Installing cert-manager Version="v1.1.0"
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v0.3.15" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v0.3.15" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v0.3.15" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-azure" Version="v0.4.13" TargetNamespace="capz-system"

Your management cluster has been initialized successfully!

You can now create your first workload cluster by running the following:

  clusterctl config cluster [name] --kubernetes-version [version] | kubectl apply -f -

➜  ~ kubectl get pods -A
NAMESPACE                           NAME                                                             READY   STATUS              RESTARTS   AGE
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager-59fcb99846-26shx       2/2     Running             0          106s
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager-d6598bf4-wx6ss     2/2     Running             0          99s
capi-system                         capi-controller-manager-886c77fc7-hqq8l                          2/2     Running             0          110s
capi-webhook-system                 capi-controller-manager-575f7b6c8-mqrlh                          0/2     ContainerCreating   0          2m37s
capi-webhook-system                 capi-kubeadm-bootstrap-controller-manager-7fb76bc9b9-ncn7h       0/2     ContainerCreating   0          109s
capi-webhook-system                 capi-kubeadm-control-plane-controller-manager-6769d967bc-bk2hf   0/2     ContainerCreating   0          104s
capi-webhook-system                 capz-controller-manager-7fd987564d-64t75                         0/2     ContainerCreating   0          90s
capz-system                         capz-controller-manager-69bc66d65c-jl779                         0/2     ContainerCreating   0          14s
capz-system                         capz-nmi-xdhr4                                                   0/1     ContainerCreating   0          14s
cert-manager                        cert-manager-86cb5dcfdd-5hk86                                    1/1     Running             0          3m12s
cert-manager                        cert-manager-cainjector-84cf775b89-nxncz                         1/1     Running             1          3m12s
cert-manager                        cert-manager-webhook-5d5dc765f6-wwjpv                            1/1     Running             1          3m11s
kube-system                         coredns-74ff55c5b-6kjs4                                          1/1     Running             0          4m18s
kube-system                         coredns-74ff55c5b-spd6g                                          1/1     Running             0          4m18s
kube-system                         etcd-kind-control-plane                                          1/1     Running             0          4m24s
kube-system                         kindnet-pbg66                                                    1/1     Running             0          4m18s
kube-system                         kube-apiserver-kind-control-plane                                1/1     Running             0          4m24s
kube-system                         kube-controller-manager-kind-control-plane                       1/1     Running             0          4m24s
kube-system                         kube-proxy-m9n69                                                 1/1     Running             0          4m18s
kube-system                         kube-scheduler-kind-control-plane                                1/1     Running             0          4m24s
local-path-storage                  local-path-provisioner-78776bfc44-zfhv6                          1/1     Running             0          4m18s
➜  ~ clusterctl version
clusterctl version: &version.Info{Major:"0", Minor:"3", GitVersion:"v0.3.15", GitCommit:"b900c6f89f3d433c32db1aa269f77f004a83cc4f", GitTreeState:"clean", BuildDate:"2021-03-30T16:14:03Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"darwin/amd64"}
➜  ~ k get deploy -A
NAMESPACE                           NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager       1/1     1            1           2m39s
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager   1/1     1            1           2m32s
capi-system                         capi-controller-manager                         1/1     1            1           2m43s
capi-webhook-system                 capi-controller-manager                         0/1     1            0           3m30s
capi-webhook-system                 capi-kubeadm-bootstrap-controller-manager       1/1     1            1           2m42s
capi-webhook-system                 capi-kubeadm-control-plane-controller-manager   1/1     1            1           2m37s
capi-webhook-system                 capz-controller-manager                         0/1     1            0           2m23s
capz-system                         capz-controller-manager                         0/1     1            0           67s
cert-manager                        cert-manager                                    1/1     1            1           4m5s
cert-manager                        cert-manager-cainjector                         1/1     1            1           4m5s
cert-manager                        cert-manager-webhook                            1/1     1            1           4m4s
kube-system                         coredns                                         2/2     2            2           5m24s
local-path-storage                  local-path-provisioner                          1/1     1            1           5m19s

What did you expect to happen: I would expect clusterctl init not to return until all CAPI provider deployments are Available. I need to check, but I'm pretty sure this was the case with previous versions of clusterctl. Has there been a regression? This is with clusterctl v0.3.15.
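To make the expectation concrete, here is a minimal sketch of the check that the output above fails: it scans the text of `kubectl get deploy -A` and reports every deployment whose READY count is below the desired count. The function name `unavailable_deployments` and the `SAMPLE` constant are hypothetical; the sample rows are copied from the output in this issue. This is an illustration only, not how clusterctl does or should implement the wait.

```python
from typing import List


def unavailable_deployments(table: str) -> List[str]:
    """Return "namespace/name" for each row whose READY column is not fully satisfied.

    Assumes the column layout printed by `kubectl get deploy -A`
    (NAMESPACE, NAME, READY as "ready/desired", ...).
    """
    lines = [ln for ln in table.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    ns_i = header.index("NAMESPACE")
    name_i = header.index("NAME")
    ready_i = header.index("READY")

    pending = []
    for ln in lines[1:]:
        cols = ln.split()
        ready, desired = (int(x) for x in cols[ready_i].split("/"))
        if ready < desired:
            pending.append(f"{cols[ns_i]}/{cols[name_i]}")
    return pending


# Hypothetical sample: three rows taken from the output in this issue.
SAMPLE = """
NAMESPACE                           NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
capi-system                         capi-controller-manager                         1/1     1            1           2m43s
capi-webhook-system                 capi-controller-manager                         0/1     1            0           3m30s
capz-system                         capz-controller-manager                         0/1     1            0           67s
"""

if __name__ == "__main__":
    for item in unavailable_deployments(SAMPLE):
        print("not available yet:", item)
```

With the full output above, the same check would flag the two capz-controller-manager deployments and the capi-controller-manager deployment in capi-webhook-system, which is the state clusterctl init returned in.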

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Environment:

  • Cluster-api version:
  • Minikube/KIND version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

/kind bug
[One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels]

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Apr 13, 2021
@CecileRobertMichon
Contributor Author

The impact is that users get errors when trying to create a cluster until the pods are ready, e.g. (from a new user):

what I did:

export EXP_MACHINE_POOL=true
clusterctl init --infrastructure azure
clusterctl config cluster my-cluster --kubernetes-version v1.21.0 --flavor machinepool > cluster-api/vmss.yaml

kubectl apply -f cluster-api/vmss.yaml

and I got the following errors:

kubeadmcontrolplane.controlplane.cluster.x-k8s.io/my-cluster-control-plane created
azuremachinetemplate.infrastructure.cluster.x-k8s.io/my-cluster-control-plane created
kubeadmconfig.bootstrap.cluster.x-k8s.io/my-cluster-mp-0 created
Error from server (InternalError): error when creating "cluster-api/vmss.yaml": Internal error occurred: failed calling webhook "default.cluster.cluster.x-k8s.io": Post "https://capi-webhook-service.capi-webhook-system.svc:443/mutate-cluster-x-k8s-io-v1alpha3-cluster?timeout=30s": dial tcp 10.96.157.147:443: connect: connection refused
Error from server (InternalError): error when creating "cluster-api/vmss.yaml": Internal error occurred: failed calling webhook "default.azurecluster.infrastructure.cluster.x-k8s.io": Post "https://capz-webhook-service.capi-webhook-system.svc:443/mutate-infrastructure-cluster-x-k8s-io-v1alpha3-azurecluster?timeout=30s": dial tcp 10.96.201.42:443: connect: connection refused
Error from server (InternalError): error when creating "cluster-api/vmss.yaml": Internal error occurred: failed calling webhook "default.exp.machinepool.cluster.x-k8s.io": Post "https://capi-webhook-service.capi-webhook-system.svc:443/mutate-exp-cluster-x-k8s-io-v1alpha3-machinepool?timeout=30s": dial tcp 10.96.157.147:443: connect: connection refused
Error from server (InternalError): error when creating "cluster-api/vmss.yaml": Internal error occurred: failed calling webhook "azuremachinepool.kb.io": Post "https://capz-webhook-service.capi-webhook-system.svc:443/mutate-exp-infrastructure-cluster-x-k8s-io-v1alpha3-azuremachinepool?timeout=30s": dial tcp 10.96.201.42:443: connect: connection refused
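The "connection refused" errors above go away once the webhook pods are serving, so until clusterctl waits by itself, a caller has to poll before applying the template. A minimal sketch of such a polling loop follows, assuming a caller-supplied `check` function that reports readiness; the name `wait_until` and its parameters are hypothetical and not part of clusterctl or kubectl.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call check() repeatedly until it returns True or the timeout elapses.

    Returns True if check() succeeded, False if the deadline passed first.
    sleep and clock are injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Hypothetical readiness check that succeeds on the third attempt.
    attempts = {"n": 0}

    def ready() -> bool:
        attempts["n"] += 1
        return attempts["n"] >= 3

    print(wait_until(ready, timeout=10, interval=0))
```

On the command line, `kubectl wait --for=condition=Available deployment --all -n capi-webhook-system --timeout=300s` (repeated for each provider namespace) should serve the same purpose before running `kubectl apply`.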

@CecileRobertMichon
Contributor Author

/area clusterctl

@k8s-ci-robot k8s-ci-robot added the area/clusterctl Issues or PRs related to clusterctl label Apr 13, 2021
@nilo19

nilo19 commented Apr 14, 2021

/cc

@CecileRobertMichon
Contributor Author

@fabriziopandini @shysank any thoughts on this?

@fabriziopandini
Member

prior discussion/pr about this

The work was stopped due to the operator work; we should reconsider now, as soon as we have a rough idea of when the operator work will land.

@vincepri
Member

Could we pick this up for v0.4?

@vincepri
Member

vincepri commented Jul 6, 2021

/assign @ykakarap
/milestone v0.4
/priority important-soon

@k8s-ci-robot k8s-ci-robot added this to the v0.4 milestone Jul 6, 2021
@k8s-ci-robot k8s-ci-robot added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Jul 6, 2021
@sbueringer
Member

Somewhat related: kubernetes-sigs/controller-runtime#723 (after we implemented a wait for readiness)
