Handle nil pointer in instanceProvision failure to continue deletion #339

Closed
sindhusri16 opened this issue Oct 19, 2023 · 0 comments
Labels
bug Something isn't working

What happened:
We upgraded CAPOCI to v0.11.2 and created some nodepools on existing clusters. Some of the new machines appear to have hit provisioning failures, though we are not sure of the cause. We then tried to delete the whole cluster, but it got stuck in the deleting phase because of these nodepools. The instances were running in the backend when we issued the delete command: although the machines are reported here as InstanceProvisionFailed, the console showed them in the 'running' state. Some internal issue may have caused the provision failure, but while trying to delete the cluster we came across this log with a nil pointer error:
`{"stream":"stderr","message":"{"ts":1697616351953.0674,"caller":"controller/controller.go:329","msg":"Reconciler error","controller":"ocimachine","controllerGroup":"infrastructure.cluster.x-k8s.io","controllerKind":"OCIMachine","OCIMachine":{"name":"5e22de10fc6a4da6b24f5d1a5e5c11c7-hmljf","namespace":"oke"},"namespace":"oke","name":"5e22de10fc6a4da6b24f5d1a5e5c11c7-hmljf","reconcileID":"7ac776ee-e8e6-473a-b731-b6dc625d7858","err":"error deleting instance 5e22de10fc6a4da6b24f5d1a5e5c11c7-hmljf: can not marshal to path in request for field InstanceId. Due to can not marshal a nil pointer","errVerbose":"can not marshal to path in request for field InstanceId. Due to can not marshal a nil pointer\nerror deleting instance 5e22de10fc6a4da6b24f5d1a5e5c11c7-hmljf\ngithub.com/oracle/cluster-api-provider-oci/controllers.(*OCIMachineReconciler).reconcileDelete\n\t/workspace/controllers/ocimachine_controller.go:391\ngithub.com/oracle/cluster-api-provider-oci/controllers.(*OCIMachineReconciler).Reconcile\n\t/workspace/controllers/ocimachine_controller.go:152\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.5/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.5/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.5/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.5/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"}","pod":"capoci-controller-manager-9659bd598-hpcp9","container":"manager","image":"253.255.0.31:5000/pca/cluster-api-oci-controller:v0.11.2"}`
What you expected to happen:
Cluster deletion should succeed
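
For illustration, here is a minimal sketch of how a delete path could tolerate a machine whose instance ID was never recorded, which is what the "can not marshal a nil pointer" error for field InstanceId suggests happened. Everything in it (`machine`, `terminator`, `deleteInstance`, `fakeClient`) is hypothetical and is not the actual CAPOCI code or the OCI SDK API:

```go
package main

import (
	"errors"
	"fmt"
)

// machine is a hypothetical stand-in for the OCIMachine spec; only the
// optional instance ID matters for this sketch.
type machine struct {
	Name       string
	InstanceID *string // nil when provisioning never recorded an ID
}

// terminator is a hypothetical stand-in for the cloud client.
type terminator interface {
	Terminate(instanceID string) error
}

// errNotFound is a hypothetical sentinel for an already-deleted instance.
var errNotFound = errors.New("instance not found")

// deleteInstance terminates the backing instance if one is known.
// It returns (true, nil) when there is nothing left to delete, so the
// caller can go on to remove its finalizer.
func deleteInstance(c terminator, m machine) (bool, error) {
	if m.InstanceID == nil || *m.InstanceID == "" {
		// No ID recorded: skip the API call instead of passing nil along.
		return true, nil
	}
	err := c.Terminate(*m.InstanceID)
	if err != nil && !errors.Is(err, errNotFound) {
		return false, fmt.Errorf("error deleting instance %s: %w", m.Name, err)
	}
	return true, nil
}

// fakeClient records the IDs it was asked to terminate.
type fakeClient struct{ calls []string }

func (f *fakeClient) Terminate(id string) error {
	f.calls = append(f.calls, id)
	return nil
}

func main() {
	c := &fakeClient{}
	done, err := deleteInstance(c, machine{Name: "no-id"})
	fmt.Println(done, err, len(c.calls))
}
```

A guard like this only unblocks deletion. In the situation described above, where the instances were still running, skipping the call would leave them behind, so a real fix would probably need to look the instance up some other way (for example by name) before giving up. That part is an assumption and is not shown here.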

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

clusterctl describe cluster 4727b18bc0884a88bbdef686e405176d  -n oke --show-conditions Machine
NAME                                                                                 READY  SEVERITY  REASON                   SINCE  MESSAGE
!! DELETED !! Cluster/4727b18bc0884a88bbdef686e405176d                               True                                      10d
├─ClusterInfrastructure - OCICluster/4727b18bc0884a88bbdef686e405176d                True                                      10d
├─ControlPlane - KubeadmControlPlane/4727b18bc0884a88bbdef686e405176d-control-plane  True                                      10d
│ └─3 Machines...                                                                    True                                      10d    See 4727b18bc0884a88bbdef686e405176d-control-plane-7tn2f, 4727b18bc0884a88bbdef686e405176d-control-plane-mctrr, ...
└─Workers
  └─Other
    ├─!! DELETED !! Machine/5e22de10fc6a4da6b24f5d1a5e5c11c7-67787c94fx4qbkk-d9dkb   False  Error     InstanceProvisionFailed  4d15h
    │             ├─BootstrapReady                                                   True                                      4d15h
    │             ├─HealthCheckSucceeded                                             False  Warning   NodeStartupTimeout       4d15h  Node failed to report startup in 10m0s
    │             ├─InfrastructureReady                                              False  Error     InstanceProvisionFailed  4d15h
    │             ├─NodeHealthy                                                      False  Info      Deleting                 4d15h
    │             ├─OwnerRemediated                                                  False  Warning   WaitingForRemediation    4d15h
    │             └─PreTerminateDeleteHookSucceeded                                  True                                      4d15h
    ├─!! DELETED !! Machine/5e22de10fc6a4da6b24f5d1a5e5c11c7-67787c94fx4qbkk-kf9t7   False  Error     InstanceProvisionFailed  4d16h
    │             ├─BootstrapReady                                                   True                                      4d16h
    │             ├─HealthCheckSucceeded                                             False  Warning   NodeStartupTimeout       4d16h  Node failed to report startup in 10m0s
    │             ├─InfrastructureReady                                              False  Error     InstanceProvisionFailed  4d16h
    │             ├─NodeHealthy                                                      False  Info      Deleting                 4d16h
    │             ├─OwnerRemediated                                                  False  Warning   WaitingForRemediation    4d16h
    │             └─PreTerminateDeleteHookSucceeded                                  True                                      4d16h
    ├─!! DELETED !! Machine/5e22de10fc6a4da6b24f5d1a5e5c11c7-67787c94fx4qbkk-kjn2w   False  Error     InstanceProvisionFailed  4d15h
    │             ├─BootstrapReady                                                   True                                      4d15h
    │             ├─HealthCheckSucceeded                                             False  Warning   NodeStartupTimeout       4d15h  Node failed to report startup in 10m0s
    │             ├─InfrastructureReady                                              False  Error     InstanceProvisionFailed  4d15h
    │             ├─NodeHealthy                                                      False  Info      Deleting                 4d15h
    │             ├─OwnerRemediated                                                  False  Warning   WaitingForRemediation    4d15h
    │             └─PreTerminateDeleteHookSucceeded                                  True                                      4d15h
    └─!! DELETED !! Machine/5e22de10fc6a4da6b24f5d1a5e5c11c7-67787c94fx4qbkk-nfxwk   False  Error     InstanceProvisionFailed  4d15h
                  ├─BootstrapReady                                                   True                                      4d15h
                  ├─HealthCheckSucceeded                                             False  Warning   NodeStartupTimeout       4d15h  Node failed to report startup in 10m0s
                  ├─InfrastructureReady                                              False  Error     InstanceProvisionFailed  4d15h
                  ├─NodeHealthy                                                      False  Info      Deleting                 4d15h
                  ├─OwnerRemediated                                                  False  Warning   WaitingForRemediation    4d15h
                  └─PreTerminateDeleteHookSucceeded                                  True                                      4d15h

Environment:

  • CAPOCI version: v0.11.2
  • Cluster-API version (use clusterctl version):
  • Kubernetes version (use kubectl version):
  • Docker version (use docker info):
  • OS (e.g. from /etc/os-release):