Reconciliation not properly done: capioci proceeds with machine creation when loadbalancer is in Creating state #331

Closed
sindhusri16 opened this issue Sep 14, 2023 · 0 comments · Fixed by #330
Labels
bug Something isn't working

@sindhusri16
Member

What happened:
Triggered a cluster creation in our private environment, and it failed with the following errors on the custom resource: APIServerLoadBalancerReconciliationFailed and BackendAdditionFailed. On debugging, I noticed that the load balancer was still in the Creating state, so the backend addition step was returning an incorrectState error. Our logs also clearly show that the work request for the load balancer failed at a later stage. Because of this, a delete command issued on the failed cluster never succeeds.

What you expected to happen:
Clusters should be deleted successfully, without being blocked by the state of dependent resources.

  1. capioci should not create a machine without first checking that the load balancer status is Active.
  2. If the load balancer creation fails and the load balancer is deleted, capioci should check the work request (which will be in the Failed state when the load balancer itself is no longer available to report a Failed state), then proceed with the remaining deletions and delete the cluster successfully.
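
To make the two expectations above concrete, here is a minimal Go sketch of the gating decision. It is not the capioci implementation: `nextAction`, the `Action` type, and the state strings are hypothetical names chosen for illustration, and the lifecycle values are assumed to match what OCI reports for load balancers and work requests.

```go
package main

import "fmt"

// Lifecycle values are assumptions for this sketch; they are kept as
// plain strings so the example does not depend on any SDK.
const (
	lbActive = "ACTIVE"
	lbFailed = "FAILED"
	wrFailed = "FAILED"
)

// Action is a hypothetical result type describing what the reconciler
// should do next.
type Action int

const (
	Proceed Action = iota // load balancer is usable, machines may be created
	Requeue               // not ready yet, check again later
	Fail                  // creation failed, surface the error and allow deletion
)

func (a Action) String() string {
	switch a {
	case Proceed:
		return "proceed"
	case Requeue:
		return "requeue"
	default:
		return "fail"
	}
}

// nextAction decides the next step from the load balancer state and the
// state of its creation work request. An empty lbState means the load
// balancer could not be found, in which case only the work request can
// tell us that creation failed.
func nextAction(lbState, workRequestState string) Action {
	switch {
	case lbState == lbActive:
		return Proceed
	case lbState == lbFailed || workRequestState == wrFailed:
		return Fail
	default:
		return Requeue
	}
}

func main() {
	fmt.Println(nextAction("CREATING", "IN_PROGRESS"))
	fmt.Println(nextAction("ACTIVE", "SUCCEEDED"))
	fmt.Println(nextAction("", "FAILED"))
}
```

The point of the sketch is that any state other than Active blocks machine creation, and that a failed work request is treated as a terminal failure even when the load balancer itself can no longer be queried.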

How to reproduce it (as minimally and precisely as possible):
On OCI, we are not sure how to reproduce it, but I can briefly explain the steps in our environment.
Apply the cluster CRD YAML to create a cluster. Once load balancer creation starts, verify that the load balancer state is checked multiple times before machine creation is triggered. One way to do this is to reduce the polling interval in the code so that the Creating state is checked multiple times. The check could time out if the load balancer is stuck in Creating before machine reconciliation goes ahead.

Anything else we need to know?:

Environment:

  • CAPOCI version: v0.11.0
  • Cluster-API version (use clusterctl version): 1.4.0
  • Kubernetes version (use kubectl version): 1.25.7
  • Docker version (use docker info): N/A
  • OS (e.g. from /etc/os-release): Oracle Linux 8