Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strategic merge patches should retain existing null values in targets #4628

Closed
cep21 opened this issue May 6, 2022 · 13 comments · Fixed by #4890
Closed

Strategic merge patches should retain existing null values in targets #4628

cep21 opened this issue May 6, 2022 · 13 comments · Fixed by #4890
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@cep21
Copy link

cep21 commented May 6, 2022

Reproduction steps are below. In the case no-bug, you can see replicaCount: null applied in the output. In the case has-bug, you can see replicaCount: null is missing. The only difference is the patch value.

< kustomize version
{Version:kustomize/v4.5.2 GitCommit:9091919699baf1c5a5bf71b32ca73a993e98088b BuildDate:2022-02-09T23:19:28Z GoOs:linux GoArch:amd64}

Reproduction steps

Directory structure.

< ls -la
total 4.0K
drwxrwxr-x  5 ec2-user ec2-user   51 May  6 19:37 .
drwxrwxrwt 13 root     root     4.0K May  6 19:38 ..
drwxrwxr-x  2 ec2-user ec2-user   50 May  6 19:37 has-bug
drwxrwxr-x  2 ec2-user ec2-user   32 May  6 19:37 no-bug
drwxrwxr-x  2 ec2-user ec2-user   49 May  6 19:37 original
< cat original/test.yaml 
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: chart
spec:
  releaseName: helm-chart
  timeout: 15m
  values:
    chart:
      replicaCount: null
      autoscaling: true
< cat original/kustomization.yaml 
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- test.yaml
< cat no-bug/kustomization.yaml 
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../original
< cat has-bug/kustomization.yaml 
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../original
patches:
  - path: patch.yaml
    target:
      kind: HelmRelease
< cat has-bug/patch.yaml 
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: chart
spec:
  releaseName: helm-chart
  timeout: 15m
  values:
    deepgram-api:
      some: value

Kustomize output

original:

< kustomize build .
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: chart
spec:
  releaseName: helm-chart
  timeout: 15m
  values:
    chart:
      autoscaling: true
      replicaCount: null

has-bug:

< kustomize build .
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: chart
spec:
  releaseName: helm-chart
  timeout: 15m
  values:
    chart:
      autoscaling: true
    deepgram-api:
      some: value

no-bug:

< kustomize build .
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: chart
spec:
  releaseName: helm-chart
  timeout: 15m
  values:
    chart:
      autoscaling: true
      replicaCount: null

Expected output of has-bug above:

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: chart
spec:
  releaseName: helm-chart
  timeout: 15m
  values:
    chart:
      autoscaling: true
      replicaCount: null
    deepgram-api:
      some: value
@cep21 cep21 added the kind/bug Categorizes issue or PR as related to a bug. label May 6, 2022
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label May 6, 2022
@KnVerey
Copy link
Contributor

KnVerey commented May 11, 2022

If I understand the example correctly, the key will the null value is dropped when the strategic merge patch is applied, but it is retained if the resource is output as-is. I think this isn't a bug--if you ask Kustomize to apply the strategic merge patch, it will attempt to merge in the key with the null value as part of that process, i.e. delete it.

@cep21
Copy link
Author

cep21 commented May 11, 2022

An explicit null is very different than not being present. For example, with helm charts you can explicitly set values to null that are part of the default values.yaml. It's unclear why it's not a bug to remove a value that was explicitly set in the parent.

@cep21
Copy link
Author

cep21 commented May 11, 2022

Here is the expected output:

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: chart
spec:
  releaseName: helm-chart
  timeout: 15m
  values:
    chart:
      autoscaling: true
      replicaCount: null
    deepgram-api:
      some: value

@KnVerey
Copy link
Contributor

KnVerey commented May 12, 2022

When you apply a merge patch to an object, the null value has a special meaning, specifically, delete the key/map in question. It's equivalent to the $delete directive. Our merge2 doc also specifically mentions null triggering removal. That said, neither one specifically address the case where the null is coming from the target rather than the patch. It is possible that situation is not addressed because it is unexpected / would not occur with a patch operation against the API server. However, since this is a merge operation, it strikes me as plausible that the result should be the same. If you find merge documentation to the contrary, we can certainly reconsider.

What is the type of values.chart.replicaCount in your CR? I'm surprised that an apply would not also just remove that key, even if you get Kustomize to retain it. If that happens because it is a string field, you could use the same workaround on the Kustomize side: replicaCount: "null" has no special meaning for SMP and will be retained. Similarly, you could use a JSON patch to make the changes or restore the null, since nulls have no special meaning there.

@cep21
Copy link
Author

cep21 commented May 16, 2022

What is the type of values.chart.replicaCount in your CR?

It's a somewhat common pattern in helm charts. There's a replicaCount that has some default value, like 1. And if you want to use a HPA for your deployment, you need to explicitly unset the replicaCount default value (by setting to null).

@cep21
Copy link
Author

cep21 commented May 16, 2022

I found this in the docs?

if the field is present in both the src and dest, and the src value is null, the field is removed from the dest

Which mentions only the src part (Is this what. you were referring too?)

Does the element of least surprise have any effect here? It's more intuitive that everything under spec.values.chart is untouched since we are only merging in spec.values.deepgram-api

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 14, 2022
@KnVerey
Copy link
Contributor

KnVerey commented Aug 15, 2022

I spoke to @apelisse from SIG API Machinery about this, and we discussed three other sources of precedent for what to do here:

  1. JSON Merge Patch. While the SMP specification and merge2 docs do not address the case where the null is in the target rather than the patch, SMP is strongly inspired from JSON Merge Patch. Despite saying that "merge patch documents are suitable for describing modifications to JSON documents that primarily use objects for their structure and do not make use of explicit null values" (emphasis mine), it does address this case in an example, which shows the null in the target being retained.
  2. Client-side kubectl apply of a Custom Resource with a nullable: true field. This shows the equivalent behaviour of the kubectl implementation of SMP. The answer is that it leaves the null alone. To reproduce, create a CRD with a nullable: true field, then create a CR with that field initially set to null (make sure that field isn't in the last-applied annotation). Then, apply the CR with that field completely absent. (Note that there are significant gotchas here. For example, it seems to be impossible to update the nullable field to a null value using CSA; it is treated as a deletion.)
  3. Server-side kubectl apply of a Custom Resource with a nullable: true field. This does not actually use the SMP implementation anymore, but it is the future of apply and is still a relevant precedent from a user experience standpoint. It also leaves the null alone (and behaves more intuitively in general).

I consider these precedents sufficient evidence to warrant changing Kustomize's behaviour.

/retitle Strategic merge patches should retain existing null values in targets
/triage accepted

@k8s-ci-robot k8s-ci-robot changed the title The null value is inconsistently removed Strategic merge patches should retain existing null values in targets Aug 15, 2022
@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 15, 2022
@KnVerey KnVerey removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 15, 2022
@seh
Copy link
Contributor

seh commented Aug 15, 2022

@KnVerey, with the proposed change in behavior, will a null value for a field in a patch still delete the corresponding field that's present in the target?

@KnVerey
Copy link
Contributor

KnVerey commented Aug 15, 2022

with the proposed change in behavior, will a null value for a field in a patch still delete the corresponding field that's present in the target?

Yes, it is unambiguous that that behaviour is correct.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 13, 2022
@apelisse
Copy link

Should we keep this open?

@KnVerey
Copy link
Contributor

KnVerey commented Nov 16, 2022

Yes, we ended up agreeing that it should be changed, and it hasn't been worked on yet.

/lifecycle frozen

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants