Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MachineDeployment refs and ownerRefs are not upgraded after CAPI upgrade with apiVersion bump #6778

Closed
sbueringer opened this issue Jun 29, 2022 · 10 comments · Fixed by #7995
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.
Milestone

Comments

@sbueringer
Copy link
Member

sbueringer commented Jun 29, 2022

What steps did you take and what happened:
I ran the clusterctl upgrade e2e test locally (v1alpha4=>v1beta1) and the refs in MD were not upgraded from v1alpha4 to v1beta1.

Some details:

Refs are initially in-memory bumped to v1beta1 through:

if err := reconcileExternalTemplateReference(ctx, r.Client, r.APIReader, cluster, d.Spec.Template.Spec.Bootstrap.ConfigRef); err != nil {
return ctrl.Result{}, err
}

At some later time we overwrite the in-memory modified version of MD by getting the current one from the server:

err = r.updateMachineDeployment(ctx, d, func(innerDeployment *clusterv1.MachineDeployment) {
mdutil.SetDeploymentRevision(d, msCopy.Annotations[clusterv1.RevisionAnnotation])
})
return msCopy, err

This drops all the changes we made in the meantime and thus the refs stay on v1alpha4 (or on v1alpha3 depends on with which version the MD was initially created)

Example in CI: https://storage.googleapis.com/kubernetes-jenkins/logs/periodic-cluster-api-e2e-main/1542205506544734208/artifacts/clusters/clusterctl-upgrade-f4s62o/resources/clusterctl-upgrade/MachineDeployment/clusterctl-upgrade-rlxhz4-md-0.yaml

We need to fix this in a minor CAPI release before we will be able to drop v1alpha3 from our CRDs in a subsequent release!

Additionally we have to verify that after a CAPI upgrade with apiVersion bumps all ownerReferences have been upgraded.

See ownerRef to v1alpha4 Cluster here: https://storage.googleapis.com/kubernetes-jenkins/logs/periodic-cluster-api-e2e-main/1542205506544734208/artifacts/clusters/clusterctl-upgrade-f4s62o/resources/clusterctl-upgrade/MachineDeployment/clusterctl-upgrade-rlxhz4-md-0.yaml

If we don't do this and just drop the version we run into all kind of fun deadlocks (at the latest when trying to delete a Cluster/MD/...)

P.S. Found this issue when playing around with clusterctl CRD migration and trying to drop v1alpha3 from our CRDs

Environment:

  • Cluster-api version:
  • minikube/kind version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

/kind bug
[One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels]

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jun 29, 2022
@sbueringer sbueringer changed the title MachineDeployment refs are not upgraded on reconcile MachineDeployment refs and ownerRefs are not upgraded after CAPI upgrade with apiVersion bump Jun 29, 2022
@sbueringer
Copy link
Member Author

Duplicate of an issue I created last year (#5470) :(

/close

@k8s-ci-robot
Copy link
Contributor

@sbueringer: Closing this issue.

In response to this:

Duplicate of an issue I created last year (#5470) :(

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sbueringer
Copy link
Member Author

/reopen

Realized this issue is more comprehensive as it also mentions ownerRefs.
More details about the ref issue can be found in #5470

@k8s-ci-robot k8s-ci-robot reopened this Sep 5, 2022
@k8s-ci-robot
Copy link
Contributor

@sbueringer: Reopened this issue.

In response to this:

/reopen

Realized this issue is more comprehensive as it also mentions ownerRefs.
More details about the ref issue can be found in #5470

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Copy link
Contributor

@sbueringer: This issue is currently awaiting triage.

If CAPI contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Sep 5, 2022
@sbueringer
Copy link
Member Author

/assign @killianmuldoon
(as you were assigned to the other one)

@sbueringer
Copy link
Member Author

Sorry for the back and forth.

I created this issue for ownerRefs: #7224

I've done some additional research on refs and will open a separate issue to address the apiVersion-bump-in-refs topic.

/close

@k8s-ci-robot
Copy link
Contributor

@sbueringer: Closing this issue.

In response to this:

Sorry for the back and forth.

I created this issue for ownerRefs: #7224

I've done some additional research on refs and will open a separate issue to address the apiVersion-bump-in-refs topic.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sbueringer
Copy link
Member Author

sbueringer commented Jan 25, 2023

/reopen

Let's use this issue to track the fix for the ref bump in MD and also the implementation for MachinePool

/assign

/milestone v1.4

@k8s-ci-robot k8s-ci-robot added this to the v1.4 milestone Jan 25, 2023
@k8s-ci-robot k8s-ci-robot reopened this Jan 25, 2023
@k8s-ci-robot
Copy link
Contributor

@sbueringer: Reopened this issue.

In response to this:

/reopen

Let's use this issue to track the fix for MD and also the implementation for MachinePool

/assign

/milestone v1.4

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.
Projects
None yet
3 participants