
Implement DRClusterConfig reconciler to create required ClusterClaims #1485

Open

ShyamsundarR wants to merge 7 commits into main from claimcreation

Conversation

ShyamsundarR (Member)
@ShyamsundarR commented Jul 10, 2024

Part of #1403

This implements creation of ClusterClaims for the various classes detected on a ManagedCluster that is marked as a DR peer via a DRClusterConfig.

Future considerations:

  • Add a basic status (say, status.Message and observedGeneration) to DRClusterConfig, reflecting success or failure of reconciliation
  • Add required CEL immutability rules to DRClusterConfig, as certain fields should not be changed once created (e.g., clusterID)
  • DRClusterConfig can check S3 accessibility and report any issues, closing the gap for S3 validation on the ManagedCluster
  • Add any required clusterID validation
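
For illustration, here is a minimal sketch of how a reconciler might create such a ClusterClaim for a detected StorageClass, assuming the OCM ClusterClaim API and controller-runtime; the helper name and the claim-name scheme are not taken from this PR:

```go
package example

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clusterv1alpha1 "open-cluster-management.io/api/cluster/v1alpha1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// ensureStorageClassClaim is a hypothetical helper: it creates (or updates) a
// ClusterClaim advertising a detected StorageClass on the managed cluster.
func ensureStorageClassClaim(ctx context.Context, c client.Client, scName string) error {
	claim := &clusterv1alpha1.ClusterClaim{
		ObjectMeta: metav1.ObjectMeta{
			// Illustrative naming scheme only; the actual claim names are
			// defined by the DRClusterConfig reconciler in this PR.
			Name: "storageclass." + scName,
		},
	}

	_, err := controllerutil.CreateOrUpdate(ctx, c, claim, func() error {
		claim.Spec.Value = scName

		return nil
	})
	if err != nil {
		return fmt.Errorf("failed to create or update ClusterClaim for StorageClass %s: %w", scName, err)
	}

	return nil
}
```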

ShyamsundarR marked this pull request as draft July 10, 2024 12:00
ShyamsundarR force-pushed the claimcreation branch 3 times, most recently from cd24518 to 8b258ad July 12, 2024 14:18
ShyamsundarR marked this pull request as ready for review July 14, 2024 15:37
return err
}

if _, err := mwu.FindManifestWork(mwu.BuildManifestWorkName(util.MWTypeDRCConfig), drcluster.GetName()); err == nil {
Member

@ShyamsundarR it's the only place where we check the manifest after deletion. If there is no error on line 248, the manifest will be deleted, so why do we need an additional check?

Member Author

Good question. I want to ensure that the DRClusterConfig resource on the managed cluster is deleted before we proceed to delete the MW for the ramen components on the managed cluster. If we reach that point before the resource is actually deleted, the DRClusterConfig reconciler may no longer exist to finalize the resource locally.

Hence we delete and wait until it is actually deleted, and only then proceed with deleting the ramen operators on the managed cluster.
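
The ordering described here might look roughly like the sketch below; the ManifestWorkUtil interface, the DeleteManifestWork helper, and the MWTypeDRCConfig constant are stand-ins modeled on the diff, not the actual ramen code:

```go
package example

import (
	ocmworkv1 "open-cluster-management.io/api/work/v1"
)

// MWTypeDRCConfig is a stand-in for the util.MWTypeDRCConfig constant seen in the diff.
const MWTypeDRCConfig = "drcconfig"

// ManifestWorkUtil is a stand-in interface for the mwu helpers used in the diff.
type ManifestWorkUtil interface {
	BuildManifestWorkName(mwType string) string
	DeleteManifestWork(name, namespace string) error
	FindManifestWork(name, namespace string) (*ocmworkv1.ManifestWork, error)
}

// ensureDRClusterConfigDeleted requests deletion of the DRClusterConfig
// ManifestWork and reports whether it is actually gone. The caller should
// requeue until done is true before deleting the ramen operator ManifestWorks,
// so the managed-cluster reconciler is still present to finalize the resource.
func ensureDRClusterConfigDeleted(mwu ManifestWorkUtil, clusterName string) (done bool, err error) {
	mwName := mwu.BuildManifestWorkName(MWTypeDRCConfig)

	if err := mwu.DeleteManifestWork(mwName, clusterName); err != nil {
		return false, err
	}

	// Still present: not done yet, so requeue and retry.
	if _, err := mwu.FindManifestWork(mwName, clusterName); err == nil {
		return false, nil
	}

	return true, nil
}
```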

) error {
rawObject, err := GetRawExtension(mw.Spec.Workload.Manifests, gvk)
if err != nil {
return fmt.Errorf("failed fetching MaintenanceMode from manifest %w", err)
Member

failed fetching resource from manifest? Not MaintenanceMode

Member Author

Fixed


err = json.Unmarshal(rawObject.Raw, object)
if err != nil {
return fmt.Errorf("failed unmarshaling MaintenanceMode from manifest %w", err)
Member

Same as previous

Member Author

Fixed
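
For reference, after the wording fix the helper discussed above might read as in this sketch; the function and parameter names, and the GetRawExtension stand-in, are assumptions based on the visible fragments:

```go
package example

import (
	"encoding/json"
	"fmt"

	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
	ocmworkv1 "open-cluster-management.io/api/work/v1"
)

// GetRawExtension is a stand-in for the helper used in the diff: it returns the
// raw manifest matching the given GroupVersionKind from a ManifestWork workload.
func GetRawExtension(
	manifests []ocmworkv1.Manifest,
	gvk schema.GroupVersionKind,
) (*runtime.RawExtension, error) {
	// Placeholder: the real helper matches each manifest against the given GVK.
	if len(manifests) == 0 {
		return nil, fmt.Errorf("manifest not found for %s", gvk.String())
	}

	return &manifests[0].RawExtension, nil
}

// extractResourceFromManifestWork reflects the reviewed fix: the error messages
// refer to a generic resource rather than MaintenanceMode.
func extractResourceFromManifestWork(
	mw *ocmworkv1.ManifestWork,
	gvk schema.GroupVersionKind,
	object interface{},
) error {
	rawObject, err := GetRawExtension(mw.Spec.Workload.Manifests, gvk)
	if err != nil {
		return fmt.Errorf("failed fetching resource from manifest %w", err)
	}

	err = json.Unmarshal(rawObject.Raw, object)
	if err != nil {
		return fmt.Errorf("failed unmarshaling resource from manifest %w", err)
	}

	return nil
}
```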

if err := u.statusUpdate(); err != nil {
u.log.Info("failed to update status", "failure", err)
}

return ctrl.Result{Requeue: requeue || u.requeue}, reconcileError
return ctrl.Result{Requeue: requeue || u.requeue}, nil
Member

If there is an error during fencing, we will not return an error?

Member Author

If there is an error in fencing, we note that in the status and log it, and we set requeue to true (for future reconciles). Returning an error is not required at this point.
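
A self-contained sketch of that pattern, with type and field names made up to mirror the fragments above (statusUpdate, log, requeue); it only illustrates the control flow, not the actual DRCluster reconciler:

```go
package example

import (
	"github.com/go-logr/logr"
	ctrl "sigs.k8s.io/controller-runtime"
)

// drclusterUpdater mirrors the per-reconcile state visible in the fragments above.
type drclusterUpdater struct {
	log     logr.Logger
	requeue bool
}

func (u *drclusterUpdater) statusUpdate() error {
	// Placeholder: the real implementation persists DRCluster status (including
	// any fencing failure) back to the API server.
	return nil
}

// finishReconcile records a fencing failure in the log and via requeue instead
// of returning it as a reconcile error, matching the behaviour described above.
func (u *drclusterUpdater) finishReconcile(requeue bool, fencingErr error) (ctrl.Result, error) {
	if fencingErr != nil {
		u.log.Info("fencing failed, will retry on a future reconcile", "failure", fencingErr)
		u.requeue = true
	}

	if err := u.statusUpdate(); err != nil {
		u.log.Info("failed to update status", "failure", err)
	}

	return ctrl.Result{Requeue: requeue || u.requeue}, nil
}
```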

// another DRPolicy
added := map[string]bool{}

for idx := range drpolicies.Items {
Member

Is it possible that the policy is marked for deletion? Do we need to handle that anyway?

Member Author

Interestingly, there is a test case for this that passes, though based on the comment I think it is possible that we retain a stale schedule until a future DRCluster reconcile (for any reason) is initiated. Will fix this by skipping deleted DRPolicies in the loop.

The parent PR that this is a part of is merged, hence this will be addressed in a separate PR.
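
The fix described (to be sent as a follow-up) could look roughly like this sketch; the helper name and the use of SchedulingInterval as the collected value are assumptions, only the skip on the deletion timestamp comes from the reply:

```go
package example

import (
	ramen "github.com/ramendr/ramen/api/v1alpha1"
)

// uniqueSchedules collects scheduling intervals from DRPolicies, skipping any
// policy that is marked for deletion so a stale schedule is not retained.
func uniqueSchedules(drpolicies ramen.DRPolicyList) []string {
	added := map[string]bool{}
	schedules := []string{}

	for idx := range drpolicies.Items {
		drpolicy := &drpolicies.Items[idx]

		// Skip DRPolicies that are being deleted.
		if !drpolicy.GetDeletionTimestamp().IsZero() {
			continue
		}

		if added[drpolicy.Spec.SchedulingInterval] {
			continue
		}

		added[drpolicy.Spec.SchedulingInterval] = true
		schedules = append(schedules, drpolicy.Spec.SchedulingInterval)
	}

	return schedules
}
```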

ShyamsundarR force-pushed the claimcreation branch 2 times, most recently from e822464 to a1c09b4 July 19, 2024 15:50
@BenamarMk (Member) left a comment

LGTM

Also, clean up some scaffolding comments.

Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
- Add finalizer to resource being reconciled
- Remove on delete
- Update reconciler to rate limit max exponential backoff to
5 minutes

Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
Building the scaffold for the overall functionality.

Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
For classes listed, those that do not need a ClusterClaim
any longer are deleted.

Added a StorageClass watcher as well, to reconcile on
changes to StorageClasses.

Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
Signed-off-by: Shyamsundar Ranganathan <srangana@redhat.com>
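
Two of the commits above mention capping the reconciler's exponential backoff at 5 minutes and watching StorageClasses; a sketch of how that setup might be wired with controller-runtime follows (the mapping function is illustrative, and exact signatures vary between controller-runtime versions):

```go
package example

import (
	"context"
	"time"

	storagev1 "k8s.io/api/storage/v1"
	"k8s.io/client-go/util/workqueue"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller"
	"sigs.k8s.io/controller-runtime/pkg/handler"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"

	ramen "github.com/ramendr/ramen/api/v1alpha1"
)

// DRClusterConfigReconciler is a skeleton standing in for the reconciler added in this PR.
type DRClusterConfigReconciler struct {
	client.Client
}

func (r *DRClusterConfigReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Reconcile body elided; see the PR diff for the actual implementation.
	return ctrl.Result{}, nil
}

// SetupWithManager caps the exponential backoff at 5 minutes and re-reconciles
// the DRClusterConfig whenever a StorageClass changes.
func (r *DRClusterConfigReconciler) SetupWithManager(mgr ctrl.Manager) error {
	// The base delay here is illustrative; only the 5 minute cap is taken from
	// the commit message.
	rateLimiter := workqueue.NewItemExponentialFailureRateLimiter(1*time.Second, 5*time.Minute)

	mapToDRClusterConfig := handler.EnqueueRequestsFromMapFunc(
		func(ctx context.Context, obj client.Object) []reconcile.Request {
			// Illustrative: enqueue a fixed request; the real mapper resolves the
			// DRClusterConfig resource(s) present on the cluster.
			return []reconcile.Request{{}}
		})

	return ctrl.NewControllerManagedBy(mgr).
		For(&ramen.DRClusterConfig{}).
		Watches(&storagev1.StorageClass{}, mapToDRClusterConfig).
		WithOptions(controller.Options{RateLimiter: rateLimiter}).
		Complete(r)
}
```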