
ServerSideApply fails with "conversion failed" #11136

Closed · Dbzman opened this issue Nov 1, 2022 · 27 comments
Labels: bug

Dbzman commented Nov 1, 2022

Checklist:

  • I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
  • I've included steps to reproduce the bug.
  • I've pasted the output of argocd version.

Describe the bug
Using ServerSideApply, configured in an Application via Sync Options, fails with

error calculating structured merge diff: error calculating diff: error while running updater.Apply: converting (v1.CronJob) to (v1beta1.CronJob): unknown conversion

Using it only via the "Sync" button, without it being configured for the app, works, though.

To Reproduce

  • Have a CronJob with apiVersion batch/v1 or an HPA with apiVersion autoscaling/v2beta2 synced without SSA (a minimal example manifest follows below)
  • Activate ServerSideApply in the app details
  • => most likely it fails instantly
  • If not, try to sync manually with the "Server-Side Apply" option
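For illustration, a minimal batch/v1 CronJob of the kind described (all names and values here are hypothetical):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-cronjob # hypothetical name
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: job
              image: busybox
              command: ["date"]

The manifest itself is unremarkable; as the discussion below suggests, the trigger is the live object's history on the cluster, not the manifest content.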

Expected behavior
ServerSideApply should work in both cases (app config + manual sync)

Screenshots
Application configuration which breaks:
Bildschirmfoto 2022-11-01 um 13 49 03

Using it only with the Sync button works:
Bildschirmfoto 2022-11-01 um 13 50 44

Version

argocd: v2.5.0+b895da4
  BuildDate: 2022-10-25T14:40:01Z
  GitCommit: b895da457791d56f01522796a8c3cd0f583d5d91
  GitTreeState: clean
  GoVersion: go1.18.7
  Compiler: gc
  Platform: linux/amd64
argocd-server: v2.5.0+b895da4
  BuildDate: 2022-10-25T14:40:01Z
  GitCommit: b895da457791d56f01522796a8c3cd0f583d5d91
  GitTreeState: clean
  GoVersion: go1.18.7
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v4.5.7 2022-08-02T16:35:54Z
  Helm Version: v3.10.1+g9f88ccb
  Kubectl Version: v0.24.2
  Jsonnet Version: v0.18.0
Dbzman added the bug label Nov 1, 2022
blakepettersson (Member) commented:

Which version of k8s are you using?

Dbzman commented Nov 1, 2022

We use 1.22.14.

blakepettersson (Member) commented:

This could be something that needs to be handled when doing the diff in gitops-engine, but I'm not familiar enough with SSA to say for sure. @leoluz?

(potentially related to #11139?)

leoluz commented Nov 2, 2022

@Dbzman Please inspect your Argo CD controller logs and see if you find an entry with this message:

error creating gvk parser: ...

If so, can you provide the full message in the log?
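For reference, on a default installation the controller logs can be checked with something like this (the argocd namespace and workload name assume the standard install):

kubectl logs -n argocd statefulset/argocd-application-controller | grep "error creating gvk parser"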

Dbzman commented Nov 2, 2022

@leoluz We didn't see any of those errors. We have the log level set to info; not sure if the error is supposed to show there.
What we further observed is that this doesn't happen consistently across all apps, even though they all use the same API version (batch/v1).

Dbzman commented Nov 2, 2022

We noticed a very strange behavior here. We saved the affected CronJob manifest locally, deleted it in Kubernetes and re-created it (so it's the exact same manifest, just re-created). After that, Argo was able to sync the application.
Those CronJobs were created with an older API version in the past, but we upgraded them to batch/v1 long ago, and Kubernetes also shows them as batch/v1. We don't know why re-creation helps in that case.
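As a sketch, the re-creation amounted to something like this (the resource name is a placeholder):

kubectl get cronjob example-cronjob -o yaml > cronjob.yaml
kubectl delete cronjob example-cronjob
kubectl apply -f cronjob.yaml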

leoluz commented Nov 2, 2022

> We noticed a very strange behavior here. We saved the affected CronJob manifest locally, deleted it in Kubernetes and re-created it (so it's the exact same manifest, just re-created). After that, Argo was able to sync the application.

Thanks for the additional info. That actually makes sense. What is strange to me is that, from your error message, it seems Argo CD is trying to convert from v1.CronJob to v1beta1.CronJob. I'm not sure why it tries to go to an older version; that would only make sense if you were applying a CronJob as v1beta1.

I'll try to reproduce this error locally anyway.

Dbzman commented Nov 4, 2022

Thanks for checking. Indeed, it's really weird that it tries to convert to an older version.

We had this issue on 60 of our 400 apps. Yesterday we fixed them all with the workaround mentioned above. Today all 60 of those apps show the error again, so it seems to have nothing to do with old manifests that were upgraded.

leoluz commented Nov 4, 2022

@Dbzman Just confirming: are the steps to reproduce still valid given your latest findings?

Dbzman commented Nov 4, 2022

@leoluz I would say yes.

mile-misan commented:

Using version 2.5.1 and having similar issues:
error calculating structured merge diff: error calculating diff: error while running updater.Apply: converting (v1beta1.PodDisruptionBudget) to (v1.PodDisruptionBudget): unknown conversion
and
error calculating structured merge diff: error calculating diff: error while running updater.Apply: converting (v2beta2.HorizontalPodAutoscaler) to (v1.HorizontalPodAutoscaler): unknown conversion

llavaud commented Nov 15, 2022

Same here with 2.5.2:
error calculating structured merge diff: error calculating diff: error while running updater.Apply: converting (v2beta1.HorizontalPodAutoscaler) to (v1.HorizontalPodAutoscaler): unknown conversion

agaudreault (Member) commented:

Same behavior with 2.5.2:
ComparisonError: error calculating structured merge diff: error calculating diff: error while running updater.Apply: converting (v1.Ingress) to (v1beta1.Ingress): unknown conversion

Adding Ingress here in case someone hits the issue with that resource.

leoluz commented Nov 21, 2022

Just to provide some direction for users who might run into this error: the current workaround is to disable SSA on the failing resources by adding the annotation argocd.argoproj.io/sync-options: ServerSideApply=false.
For example, if the error is related to an Ingress conversion, add the annotation to your Ingress resource.
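As a sketch (names are placeholders):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress # placeholder name
  annotations:
    argocd.argoproj.io/sync-options: ServerSideApply=false
spec:
  defaultBackend:
    service:
      name: example-svc # placeholder name
      port:
        number: 80

With this annotation, Argo CD falls back to client-side apply for that one resource while the rest of the app can still use SSA.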

chrisduong commented:

> Just to provide some direction for users who might run into this error: the current workaround is to disable SSA on the failing resources by adding the annotation argocd.argoproj.io/sync-options: ServerSideApply=false.
> For example, if the error is related to an Ingress conversion, add the annotation to your Ingress resource.

Hi @leoluz, I added the annotation but it didn't work; still the same problem (HorizontalPodAutoscaler case).

pseymournutanix commented:

FWIW, the same occurs with CronJob resources on 2.5.5 as well.

msw-kialo commented:

We run into similar issues when enabling SSA for our apps. However, the issue isn't consistent across clusters/apps (the same app/resource might work on one but not on another).

> What is strange to me is that, from your error message, it seems Argo CD is trying to convert from v1.CronJob to v1beta1.CronJob. I'm not sure why it tries to go to an older version; that would only make sense if you were applying a CronJob as v1beta1.

@leoluz I believe managedFields are to blame. They include an apiVersion field that might reference an older (beta) version.

Managed fields of an affected `Ingress` resource:
metadata:
  managedFields:
    - apiVersion: networking.k8s.io/v1beta1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:alb.ingress.kubernetes.io/actions.ssl-redirect: {}
            f:alb.ingress.kubernetes.io/certificate-arn: {}
            f:alb.ingress.kubernetes.io/listen-ports: {}
            f:alb.ingress.kubernetes.io/scheme: {}
            f:alb.ingress.kubernetes.io/ssl-policy: {}
            f:alb.ingress.kubernetes.io/target-type: {}
          f:labels:
            .: {}
            f:app.kubernetes.io/instance: {}
      manager: kubectl
      operation: Update
      time: "2021-05-28T16:20:40Z"
    - apiVersion: networking.k8s.io/v1beta1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:finalizers: {}
      manager: controller
      operation: Update
      time: "2021-08-02T09:10:54Z"
    - apiVersion: networking.k8s.io/v1beta1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          f:ingressClassName: {}
      manager: argocd-application-controller
      operation: Update
      time: "2021-08-02T09:18:03Z"
    - apiVersion: networking.k8s.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:finalizers:
            v:"group.ingress.k8s.aws/argo-ingresses": {}
        f:status:
          f:loadBalancer:
            f:ingress: {}
      manager: controller
      operation: Update
      time: "2022-03-21T15:25:24Z"
    - apiVersion: networking.k8s.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            f:alb.ingress.kubernetes.io/group.name: {}
            f:alb.ingress.kubernetes.io/load-balancer-attributes: {}
            f:kubectl.kubernetes.io/last-applied-configuration: {}
        f:spec:
          f:rules: {}
      manager: argocd-application-controller
      operation: Update
      time: "2022-08-15T11:22:05Z"
    name: argocd
    namespace: argocd
    resourceVersion: "206036857"
    uid: 3df56465-962b-42bb-9075-e61740b636cc
Managed fields of the corresponding resource (same name/namespace) on a different cluster (only the cluster/app age differs):
metadata:
  managedFields:
    - apiVersion: networking.k8s.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:alb.ingress.kubernetes.io/actions.ssl-redirect: {}
            f:alb.ingress.kubernetes.io/certificate-arn: {}
            f:alb.ingress.kubernetes.io/group.name: {}
            f:alb.ingress.kubernetes.io/listen-ports: {}
            f:alb.ingress.kubernetes.io/scheme: {}
            f:alb.ingress.kubernetes.io/ssl-policy: {}
            f:alb.ingress.kubernetes.io/target-type: {}
          f:labels:
            .: {}
            f:app.kubernetes.io/instance: {}
        f:spec:
          f:ingressClassName: {}
          f:rules: {}
      manager: kubectl-client-side-apply
      operation: Update
      time: "2022-05-05T15:11:18Z"
    - apiVersion: networking.k8s.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:finalizers:
            .: {}
            v:"group.ingress.k8s.aws/argo-ingresses": {}
        f:status:
          f:loadBalancer:
            f:ingress: {}
      manager: controller
      operation: Update
      time: "2022-05-05T15:11:20Z"
    - apiVersion: networking.k8s.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            f:alb.ingress.kubernetes.io/load-balancer-attributes: {}
            f:kubectl.kubernetes.io/last-applied-configuration: {}
      manager: argocd-application-controller
      operation: Update
      time: "2022-08-15T11:21:51Z"

It also explains why recreating works: it clears the managedFields.

Sadly, it does not yet help me resolve this issue without recreating the resources (I haven't found a way to clear/edit the managedFields).
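For anyone inspecting their own resources: newer kubectl versions hide managedFields from get -o yaml output by default, but they can be shown with the --show-managed-fields flag (resource name and namespace are placeholders):

kubectl get ingress example-ingress -n example-ns -o yaml --show-managed-fields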

jetersen commented May 1, 2023

> I believe managedFields are to blame. They include an apiVersion field that might reference an older (beta) version.

This is not a "might"; this is definitively the issue. 😓

chrisduong commented:

Do we know why Argo CD does not respect ".Capabilities.APIVersions" but instead uses the managedFields (assuming that is the reason; I don't know which component does this internally) to decide which API group to use?

zswanson commented Aug 30, 2023

We are seeing this in 2.8 with HPAs, ClusterRoles, ClusterRoleBindings and Roles, on clusters that have all been properly upgraded and whose resource manifests were updated, but which were created back when these beta API versions still existed in k8s (they have since been removed).

Amr-Aly commented Sep 4, 2023

We're seeing the same issue with ClusterRole, ClusterRoleBinding.

zswanson commented Sep 6, 2023

The K8s docs note that you can clear managed fields with a JSON patch. We've been employing that to get past this issue, but it's really tiresome. Not sure if Argo can somehow handle this, which would be great. The errors in the Argo CD sync panel aren't helpful enough, because they don't say which resource had the conversion error.

kubectl patch KIND NAME --type json -p '[{"op":"replace","path":"/metadata/managedFields","value":[{}]}]'
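For example, to clear the managed fields on every HPA in a namespace (the namespace is a placeholder):

for r in $(kubectl get hpa -o name -n example-ns); do
  kubectl patch "$r" -n example-ns --type json -p '[{"op":"replace","path":"/metadata/managedFields","value":[{}]}]'
done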

@msw-kialo fyi ^

leoluz commented Jan 25, 2024

The Server-Side Diff feature is merged and available in Argo CD 2.10-RC1. If enabled, it should address this and other diff problems when Server-Side Apply is used.

I am closing this for now; feel free to reopen if the issue persists.
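For reference, per the docs Server-Side Diff can be enabled per Application with an annotation (a sketch; double-check the 2.10 diffing docs for the exact knobs):

metadata:
  annotations:
    argocd.argoproj.io/compare-options: ServerSideDiff=true

or globally, via the argocd-cmd-params-cm ConfigMap:

data:
  controller.diff.server.side: "true"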

leoluz closed this as completed Jan 25, 2024
pgier commented May 23, 2024

Ran into a similar issue, failing to calculate the diff for a ClusterRole:

"converting (v1.ClusterRole) to (v1beta1.ClusterRole):"

Enabling Server-Side Diff on the application resolved the issue for me.

sacherus commented Aug 12, 2024

We are using 2.10.5 and have this problem when we try to enable Server-Side Apply. I deleted all the HPAs, but it didn't help.

ComparisonError: Failed to compare desired state to live state: failed to calculate diff: error calculating structured merge diff: error calculating diff: error while running updater.Apply: converting (v2.HorizontalPodAutoscaler) to (v1.HorizontalPodAutoscaler): unknown conversion

mdtoro-wyn commented:

Deleting the old HPAs and an old Secret solved the issue in my case.
