-
Notifications
You must be signed in to change notification settings - Fork 102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix race with ActivePlan between Instance and PlanExecution controllers #722
Conversation
/retest |
|
||
//todo: REMOVE this... we do not want to create a plan outside of reconcile | ||
// most likely... any use of this function is wrong... all plans should be created in reconcile | ||
func createPlanOld(mgr manager.Manager, planName string, instance *kudov1alpha1.Instance) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was real pain maintaining two, while the only difference was how they accepted the dependencies which was pretty easy to unite
if err := controllerutil.SetControllerReference(instance, &planExecution, mgr.GetScheme()); err != nil { | ||
log.Printf("InstanceController: Error setting ControllerReference") | ||
return err | ||
func addActivePlanReference(c client.Client, r record.EventRecorder, planExecution *kudov1alpha1.PlanExecution, instance *kudov1alpha1.Instance) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
main change
@@ -228,6 +227,11 @@ func (r *ReconcilePlanExecution) Reconcile(request reconcile.Request) (reconcile | |||
return reconcile.Result{}, err | |||
} | |||
|
|||
if instance.Status.ActivePlan.Name != planExecution.Name || instance.Status.ActivePlan.Namespace != planExecution.Namespace { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is needed because otherwise the plan runs until the end but then cannot update instance because that one was modified in between
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in general I feel like this safeguard is correct. If we're reconciling PE that is not current active plan, we probably should not execute anything
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't this be executed right after the PE is created, but before the instance controller calls addActivePlanReference
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm good point, I guess it could. To overcome even that case I guess adding reconcile trigger after linking activeplan could be a solution? But I still think this safeguard is correct, because we should really execute only active plan, not every plan that changed for example...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh wait, yeah, that should be fine, I think the PE controller should also watch for instance update event and trigger reconciliation as it does for deployments etc. Would you agree? Looking at the implementation right now I am not 100% that is happening and that it was intentional but if we agree on that I can push in that direction
EDIT: Oh yeah, right now it watches only for instances that were created as part of that plan not all of them... I would probably watch and reconcile for all updates. I mean controller-runtime will deal with de-duplication and the reconcile should be idempotent anyway
gvk, _ := apiutil.GVKForObject(instance, scheme) | ||
ref := corev1.ObjectReference{ | ||
Kind: gvk.Kind, | ||
Name: instance.Name, | ||
Namespace: instance.Namespace, | ||
} | ||
planExecution := kudov1alpha1.PlanExecution{ | ||
TypeMeta: metav1.TypeMeta{ | ||
Kind: "PlanExecution", | ||
APIVersion: "kudo.dev/v1alpha1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have this dotted in a lot of places, so I looked for some "standardization" point that already exists in our codebase generated by kubebuilder, maybe we start moving to something more like this? Or at least turn it into a constant we can reference elsewhere. Maybe this is the constant?
kudo/pkg/apis/kudo/v1alpha1/register.go
Line 33 in 4def9dc
SchemeGroupVersion = schema.GroupVersion{Group: "kudo.dev", Version: "v1alpha1"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's try to unify this in another PR... I'll prepare one. Btw. how do you convert the groupversion to the string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, let's unify in a separate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 we'll have to look that up. hopefully it's something we can just set as a var and not have as a function call everywhere.
@@ -228,6 +227,11 @@ func (r *ReconcilePlanExecution) Reconcile(request reconcile.Request) (reconcile | |||
return reconcile.Result{}, err | |||
} | |||
|
|||
if instance.Status.ActivePlan.Name != planExecution.Name || instance.Status.ActivePlan.Namespace != planExecution.Namespace { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
@@ -305,57 +305,42 @@ func instanceEventFilter(mgr manager.Manager) predicate.Funcs { | |||
} | |||
} | |||
|
|||
func createPlan(r *ReconcileInstance, planName string, instance *kudov1alpha1.Instance) error { | |||
func createPlan(c client.Client, r record.EventRecorder, scheme *runtime.Scheme, planName string, instance *kudov1alpha1.Instance) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method should be called createPlanAndUpdateInstance
or even createPlanAndUpdateInstanceActivePlanReerence
:P
@@ -228,6 +252,11 @@ func (r *ReconcilePlanExecution) Reconcile(request reconcile.Request) (reconcile | |||
return reconcile.Result{}, err | |||
} | |||
|
|||
if instance.Status.ActivePlan.Name != planExecution.Name || instance.Status.ActivePlan.Namespace != planExecution.Namespace { | |||
// this can happen for newly created PlanExecution where ActivePlan was not yet set to point to this instance | |||
return reconcile.Result{}, fmt.Errorf("instance %s does not have ActivePlan pointing to PlanExecution %s, %s. Instead %s, %s. Retrying", instance.Name, planExecution.Name, planExecution.Namespace, instance.Status.ActivePlan.Name, instance.Status.ActivePlan.Namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're already watching instance updates, so there is no need to re-trigger this reconciliation (it might create a loop)
err = c.Watch( | ||
&source.Kind{Type: &kudov1alpha1.Instance{}}, | ||
&handler.EnqueueRequestsFromMapFunc{ | ||
ToRequests: mapFn, | ||
ToRequests: handler.ToRequestsFunc( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's meta...
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alenkacz, gerred The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind bug
What this PR does / why we need it:
Based on the test I wrote in #718 it became apparent where lies the race that is one of the causes of multiple deploy plan executions.
This PR proposes to move adding the ActivePlan relation to InstanceController. To make that possible I needed to add some safeguards to the PlanExecutionController as well.
Which issue(s) this PR fixes:
This is related to #628
Special notes for your reviewer:
Does this PR introduce a user-facing change?: