✨ OnDelete rollout strategy #4346
Welcome @relyt0925!
Hi @relyt0925. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/milestone v0.4.0
/retest
@@ -175,7 +175,7 @@ func TestReconcileUpdateObservedGeneration(t *testing.T) {
 	errGettingObject = testEnv.Get(ctx, util.ObjectKey(kcp), kcp)
 	g.Expect(errGettingObject).NotTo(HaveOccurred())
 	return kcp.Status.ObservedGeneration
-}, 10*time.Second).Should(Equal(generation))
+}, 20*time.Second).Should(Equal(generation))
@relyt0925 let's revert this given that #4466 has merged
got it!
lgtm pending enum validation + revert of increased test timeout commit
// DisableMachineCreate is an annotation that can be used to signal a MachineSet to stop creating new machines.
// It is utilized in the OnDelete MachineDeploymentStrategy to allow the MachineDeployment controller to scale down
// older MachineSets when Machines are deleted and add the new replicas to the latest MachineSet.
DisableMachineCreate = "cluster.x-k8s.io/disable-machine-create"
We might want to open an issue regarding pause and the update of statuses like @detiber suggested
@@ -199,6 +199,10 @@ func (r *MachineDeploymentReconciler) reconcile(ctx context.Context, cluster *cl
		return ctrl.Result{}, r.rolloutRolling(ctx, d, msList)
	}

	if d.Spec.Strategy.Type == clusterv1.OnDeleteMachineDeploymentStrategyType {
I'm +1 adding the enum-based validation now as part of the v0.4 release
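As a rough illustration of the enum-based validation being discussed, the sketch below uses local stand-in types rather than the real clusterv1 package. The `validateStrategyType` helper is hypothetical, the string values of the constants are assumed, and the kubebuilder marker quoted in the comment is only an example of how such a restriction is commonly declared on an API field.

```go
package main

import "fmt"

// MachineDeploymentStrategyType is a local stand-in for the clusterv1 type
// of the same name; it is not imported from cluster-api.
type MachineDeploymentStrategyType string

const (
	// String values are assumptions made for this sketch.
	RollingUpdateMachineDeploymentStrategyType MachineDeploymentStrategyType = "RollingUpdate"
	OnDeleteMachineDeploymentStrategyType      MachineDeploymentStrategyType = "OnDelete"
)

// On a CRD field, the same restriction is usually expressed with a marker
// comment placed above the field or type, for example:
//
//	+kubebuilder:validation:Enum=RollingUpdate;OnDelete

// validateStrategyType is a hypothetical helper that rejects any value
// outside the two known strategy types.
func validateStrategyType(t MachineDeploymentStrategyType) error {
	switch t {
	case RollingUpdateMachineDeploymentStrategyType, OnDeleteMachineDeploymentStrategyType:
		return nil
	default:
		return fmt.Errorf("unsupported strategy type %q", string(t))
	}
}

func main() {
	for _, t := range []MachineDeploymentStrategyType{"RollingUpdate", "OnDelete", "Recreate"} {
		fmt.Printf("%s: %v\n", t, validateStrategyType(t))
	}
}
```

With validation at the API level, the reconciler's strategy dispatch never has to handle an unknown type at runtime.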
@@ -0,0 +1,182 @@
/*
Could we name this file machinedeployment_rollout_ondelete?
we can! and done
	return nil
}

//reconcileOldMachineSetsOnDelete handles reconciliation of Old MachineSets associated with the MachineDeployment in the OnDelete MachineDeploymentStrategyType.
-//reconcileOldMachineSetsOnDelete handles reconciliation of Old MachineSets associated with the MachineDeployment in the OnDelete MachineDeploymentStrategyType.
+// reconcileOldMachineSetsOnDelete handles reconciliation of Old MachineSets associated with the MachineDeployment in the OnDelete MachineDeploymentStrategyType.
done
	log.V(4).Error(err, "failed to convert MachineSet %q label selector to a map", oldMS.Name)
	continue
}
log.V(4).Info("fetching Machines associated with MachineSet", "MachineSet", oldMS.Name)
-log.V(4).Info("fetching Machines associated with MachineSet", "MachineSet", oldMS.Name)
+log.V(4).Info("Fetching Machines associated with MachineSet", "MachineSet", oldMS.Name)
done
err = patchHelper.Patch(ctx, oldMS)
if err != nil {
-err = patchHelper.Patch(ctx, oldMS)
-if err != nil {
+if err := patchHelper.Patch(ctx, oldMS); err != nil {
done
err = r.Client.List(ctx,
	allMachinesInOldMS,
	client.InNamespace(oldMS.Namespace),
	client.MatchingLabels(selectorMap),
)
if err != nil {
	return errors.Wrap(err, "failed to list machines")
}
-err = r.Client.List(ctx,
-	allMachinesInOldMS,
-	client.InNamespace(oldMS.Namespace),
-	client.MatchingLabels(selectorMap),
-)
-if err != nil {
-	return errors.Wrap(err, "failed to list machines")
-}
+if err := r.Client.List(ctx,
+	allMachinesInOldMS,
+	client.InNamespace(oldMS.Namespace),
+	client.MatchingLabels(selectorMap),
+); err != nil {
+	return errors.Wrap(err, "failed to list machines")
+}
done
	log.V(4).Error(errors.Errorf("unexpected negative scale down amount: %d", machineSetScaleDownAmountDueToMachineDeletion), fmt.Sprintf("Error reconciling MachineSet %s", oldMS.Name))
}
scaleDownAmount -= machineSetScaleDownAmountDueToMachineDeletion
log.V(4).Info("adjusting replica count for deleted machines", "replicaCount", oldMS.Name, "replicas", updatedReplicaCount)
-log.V(4).Info("adjusting replica count for deleted machines", "replicaCount", oldMS.Name, "replicas", updatedReplicaCount)
+log.V(4).Info("Adjusting replica count for deleted machines", "replicaCount", oldMS.Name, "replicas", updatedReplicaCount)
done
controllers/machineset_controller.go

if ms.Annotations != nil {
	if _, ok := ms.Annotations[clusterv1.DisableMachineCreate]; ok {
		log.Info("Automatic creation of new machines disabled for machine set")
-log.Info("Automatic creation of new machines disabled for machine set")
+log.V(2).Info("Automatic creation of new machines disabled for machine set")
I wonder if this should be a condition of some sorts.
initial comment done.
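For readers following this thread, here is a minimal sketch of the annotation check under review, reduced to plain Go with no controller-runtime dependencies. The constant mirrors the key quoted earlier in this PR; the `machineCreateDisabled` helper is hypothetical and is not part of the PR.

```go
package main

import "fmt"

// DisableMachineCreate mirrors the annotation key quoted in this PR.
const DisableMachineCreate = "cluster.x-k8s.io/disable-machine-create"

// machineCreateDisabled is a hypothetical helper that reports whether the
// annotation is present. Only the presence of the key matters; its value is
// ignored, matching the check quoted above. Looking up a key in a nil map is
// safe in Go, so no explicit nil check is required here.
func machineCreateDisabled(annotations map[string]string) bool {
	_, ok := annotations[DisableMachineCreate]
	return ok
}

func main() {
	annotated := map[string]string{DisableMachineCreate: "true"}
	fmt.Println(machineCreateDisabled(annotated)) // annotation present
	fmt.Println(machineCreateDisabled(nil))       // no annotations at all
}
```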
@@ -175,7 +175,7 @@ func TestReconcileUpdateObservedGeneration(t *testing.T) {
 	errGettingObject = testEnv.Get(ctx, util.ObjectKey(kcp), kcp)
 	g.Expect(errGettingObject).NotTo(HaveOccurred())
 	return kcp.Status.ObservedGeneration
-}, 10*time.Second).Should(Equal(generation))
+}, 20*time.Second).Should(Equal(generation))
Revert this? #4466 disabled this test temporarily
done!
Issue opened for potential bug with paused annotation: |
/test pull-cluster-api-test-main
@CecileRobertMichon @vincepri @detiber all comments should be addressed.
Did you rebase on top of the latest master branch to include #4466?
/test pull-cluster-api-test-main
I did!
	return errors.Wrap(err, "failed to list machines")
}
totalMachineCount := int32(len(allMachinesInOldMS.Items))
log.V(4).Info("retrieved machines", "totalMachineCount", totalMachineCount)
nit: use Uppercase Retrieved
done
dont rely on status update check for nil update update strategy double wait time for flakey kubeadm test update with comments update
/lgtm
/approve
@fabriziopandini would you mind issuing a |
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: detiber, relyt0925 The full list of commands accepted by this bot can be found here. The pull request process is described here
What this PR does / why we need it:
This PR adds a new rollout strategy, called "OnDelete", in which the user fully controls the rollout of machines to a new configuration. It behaves as follows.
Users control the upgrade to a new MachineSet configuration by deleting machines in the old MachineSet(s):
kubectl delete machine MACHINENAME
The MachineDeployment controller waits for the machine to be deleted completely and then provisions the replacement replica with the new configuration.
In this model, the user is responsible for ultimately replacing every machine that uses the old configuration in order to complete the rollout. This lets the user upgrade machines in any order they choose. They also control the velocity of the rollout through how quickly they remove machines running the old configuration. In the current implementation, a new machine is not rolled out until the old machine has been fully removed.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #4344
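The replica bookkeeping behind the OnDelete behaviour described above can be sketched roughly as follows. This is a simplified model, not the controller code from this PR: the `machineSet` struct and the `reconcileOnDelete` function are hypothetical stand-ins, and the sketch assumes each old MachineSet is shrunk to the number of its machines that still exist, with the remainder assigned to the new MachineSet.

```go
package main

import "fmt"

// machineSet is a simplified stand-in for a MachineSet: a name, the desired
// replica count, and how many of its machines still exist.
type machineSet struct {
	name         string
	specReplicas int32
	liveMachines int32
}

// reconcileOnDelete is a hypothetical helper. It shrinks each old set to the
// machines that still exist (it never scales an old set up) and returns the
// replica count the new set needs so the deployment total stays at desired.
func reconcileOnDelete(desired int32, oldSets []machineSet) int32 {
	var oldTotal int32
	for i := range oldSets {
		ms := &oldSets[i]
		if ms.liveMachines < ms.specReplicas {
			// The user deleted machines: lower the old set's replica count
			// so the old configuration is not recreated.
			ms.specReplicas = ms.liveMachines
		}
		oldTotal += ms.specReplicas
	}
	newReplicas := desired - oldTotal
	if newReplicas < 0 {
		newReplicas = 0
	}
	return newReplicas
}

func main() {
	// One old set with 3 desired replicas, one of which the user has deleted.
	old := []machineSet{{name: "ms-old", specReplicas: 3, liveMachines: 2}}
	newReplicas := reconcileOnDelete(3, old)
	fmt.Println("new set replicas:", newReplicas, "old set replicas:", old[0].specReplicas)
}
```

In this model the rollout advances only as fast as the user deletes old machines, which is the property the description emphasises.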