Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Add KCP feature to clusterctl alpha rollout #6858

Conversation

tobiasgiese
Copy link
Member

@tobiasgiese tobiasgiese commented Jul 7, 2022

Signed-off-by: Tobias Giese tobias.giese@mercedes-benz.com

What this PR does / why we need it:

This PR implements the sub-commands rollout, pause and resume for clusterctl alpha rollout

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Part of #3439 (replaces #6857)

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 7, 2022
@tobiasgiese tobiasgiese force-pushed the tobiasgiese/clusterctl-kcp-rollout branch from f77c1f3 to 6dd881d Compare July 7, 2022 14:47
@tobiasgiese tobiasgiese force-pushed the tobiasgiese/clusterctl-kcp-rollout branch 3 times, most recently from 34d4f49 to 4747856 Compare July 7, 2022 15:16
@tobiasgiese
Copy link
Member Author

tobiasgiese commented Jul 7, 2022

/retest

tests were successful on Prow

@tobiasgiese
Copy link
Member Author

/test pull-cluster-api-e2e-informing-main

flaky CC test

@tobiasgiese tobiasgiese force-pushed the tobiasgiese/clusterctl-kcp-rollout branch 2 times, most recently from 24f31ae to 95c0015 Compare July 7, 2022 18:07
@@ -53,3 +65,8 @@ func setRestartedAtAnnotation(proxy cluster.Proxy, name, namespace string) error
patch := client.RawPatch(types.MergePatchType, []byte(fmt.Sprintf("{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"cluster.x-k8s.io/restartedAt\":\"%v\"}}}}}", time.Now().Format(time.RFC3339))))
return patchMachineDeployment(proxy, name, namespace, patch)
}

func setRolloutAfter(proxy cluster.Proxy, name, namespace string) error {
patch := client.RawPatch(types.MergePatchType, []byte(fmt.Sprintf("{\"spec\":{\"rolloutAfter\":\"%v\"}}", time.Now().Format(time.RFC3339))))
Copy link
Member Author

@tobiasgiese tobiasgiese Jul 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea to handle issues regarding different timezones or async local times: we could use for rolloutAfter the creation timestamp + a few seconds of the most recent control plane machine. In this case the rolling update should start in any case.

CAPI is checking if the machines creation timestamp is before rolloutAfter and rolloutAfter is before the reconciliation time.

return machine.CreationTimestamp.Before(rolloutAfter) && rolloutAfter.Before(reconciliationTime)

Copy link
Member Author

@tobiasgiese tobiasgiese Jul 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding #6858 (comment)

AFAIK kubectl rollout restart has no rolloutAfter alike field. I think that the rollout will be performed immediately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And btw, e.g. Deployment has also a restartedAt field like our MachineDeployment.

❯ kk rollout restart deploy coredns
deployment.apps/coredns restarted
❯ kk get deploy coredns -oyaml | grep restartedAt
        kubectl.kubernetes.io/restartedAt: "2022-07-18T12:09:53+02:00"

Same for Statefulsets

❯ kk rollout restart sts csi-resizer-cinder
statefulset.apps/csi-resizer-cinder restarted
❯ kk get sts csi-resizer-cinder -oyaml | grep restartedAt
        kubectl.kubernetes.io/restartedAt: "2022-07-18T12:11:06+02:00"

Copy link
Member

@chrischdi chrischdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR :-)

All in all it looks pretty good, only some nits :-)
Still want to / have to play around with it.

cmd/clusterctl/client/alpha/rollout.go Outdated Show resolved Hide resolved
cmd/clusterctl/client/alpha/rollout_restarter_test.go Outdated Show resolved Hide resolved
cmd/clusterctl/client/alpha/kubeadmcontrolplane.go Outdated Show resolved Hide resolved
cmd/clusterctl/client/alpha/kubeadmcontrolplane.go Outdated Show resolved Hide resolved
@tobiasgiese tobiasgiese force-pushed the tobiasgiese/clusterctl-kcp-rollout branch from 95c0015 to c8c2742 Compare July 18, 2022 09:04
@tobiasgiese tobiasgiese force-pushed the tobiasgiese/clusterctl-kcp-rollout branch from c8c2742 to 3e892ae Compare July 18, 2022 09:42
@chrischdi
Copy link
Member

chrischdi commented Aug 8, 2022

One question: Does making use of RolloutAfter limit the functionality for RolloutAfter.

E.g. a user may have already set RolloutAfter to a timestamp in the future (e.g. to enforce recreation of the control plane to renew certificates).

So what would happen if RolloutAfter is already set (and because of that we should not overwrite it?)

Edit: so I currently think we should not make use of RolloutAfter for clusterctl alpha rollout.

@tobiasgiese
Copy link
Member Author

tobiasgiese commented Aug 13, 2022

One question: Does making use of RolloutAfter limit the functionality for RolloutAfter.

AFAIK the RolloutAfter value will be overriden. Maybe we have to change the update behavior of the KCP? E.g., make use of the cluster.x-k8s.io/restartedAt for the KCP as well. In this case we can leave the RolloutAfter exclusively for the actual use case. This would also fit to #7053.

@sbueringer
Copy link
Member

cc @ykakarap

Copy link
Contributor

@ykakarap ykakarap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks interesting to me. I will take a look a this.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 14, 2022
@tobiasgiese tobiasgiese force-pushed the tobiasgiese/clusterctl-kcp-rollout branch from 3e892ae to 86e02a0 Compare September 15, 2022 15:03
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 15, 2022
@tobiasgiese tobiasgiese force-pushed the tobiasgiese/clusterctl-kcp-rollout branch from 86e02a0 to 2d38aa0 Compare September 15, 2022 15:07
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 15, 2022
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 28, 2022
@sbueringer
Copy link
Member

/retest

@ykakarap
Copy link
Contributor

/lgtm

Copy link
Member

@vincepri vincepri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve
/hold
for @fabriziopandini's review

Great work thank you!

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jan 23, 2023
@tobiasgiese
Copy link
Member Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 23, 2023
@tobiasgiese
Copy link
Member Author

tobiasgiese commented Jan 23, 2023

/hold

sorry, overlooked your hold Vince

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jan 23, 2023
@tobiasgiese
Copy link
Member Author

Will solve the merge conflict after the review from @fabriziopandini

Copy link
Member

@fabriziopandini fabriziopandini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one question, otherwise lgtm

@@ -49,6 +49,8 @@ require (
github.com/blang/semver/v4 v4.0.0 // indirect
github.com/cespare/xxhash/v2 v2.1.2 // indirect
github.com/containerd/containerd v1.6.12 // indirect
github.com/coredns/caddy v1.1.0 // indirect
github.com/coredns/corefile-migration v1.0.18 // indirect
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

q for my understanding, why are there such changes in hack/tools/go.mod? this sems unrelated to the other changes in this PR

Copy link
Member Author

@tobiasgiese tobiasgiese Jan 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also not sure why these changes are in the PR. But go mod tidy and the Prow job want this change.

IIRC we found out why these changes happen @chrischdi, right? Do you still know why?

Copy link
Member

@sbueringer sbueringer Jan 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hack/tools/tilt-prepare uses "sigs.k8s.io/cluster-api/cmd/clusterct/*"

"sigs.k8s.io/cluster-api/cmd/clusterct/*" now uses "sigs.k8s.io/cluster-api/controlplane/kubeadm/api/v1beta1" (for the KubeadmControlPlane type)

"sigs.k8s.io/cluster-api/controlplane/kubeadm/api/v1beta1" uses "github.com/coredns/corefile-migration" (for the KCP webhook)

"github.com/coredns/corefile-migration" uses "github.com/coredns/caddy" (internally)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I came to the same conclusion as stefan 🏆

@sbueringer
Copy link
Member

@tobiasgiese Can you please rebase the PR? I would merge then

Signed-off-by: Tobias Giese <tobias.giese@mercedes-benz.com>
@tobiasgiese tobiasgiese force-pushed the tobiasgiese/clusterctl-kcp-rollout branch from 6eb2e31 to 4593c3c Compare January 25, 2023 12:34
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 25, 2023
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 25, 2023
@tobiasgiese
Copy link
Member Author

@tobiasgiese Can you please rebase the PR? I would merge then

Done, thank you 👍🏻

@sbueringer
Copy link
Member

Thank you very much!! (especially for the patience :))

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 25, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: fad73e5eb2d6bfa576ff62c98f846f5987bce042

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbueringer, vincepri

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [sbueringer,vincepri]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tobiasgiese
Copy link
Member Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 25, 2023
@k8s-ci-robot k8s-ci-robot merged commit 33d0e8f into kubernetes-sigs:main Jan 25, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.4 milestone Jan 25, 2023
@tobiasgiese tobiasgiese deleted the tobiasgiese/clusterctl-kcp-rollout branch January 25, 2023 13:03
@fabriziopandini
Copy link
Member

thanks for pushing this over the line @tobiasgiese!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants