
rke2 ingress upgrade fails 1.31.4 -> 1.31.5 #7652

Open
xhejtman opened this issue Jan 30, 2025 · 6 comments

@xhejtman

Environmental Info:
RKE2 Version: v1.31.5+rke2r1

Node(s) CPU architecture, OS, and Version: Linux kub-h1.priv.cerit-sc.cz 6.8.0-50-generic #51-Ubuntu SMP PREEMPT_DYNAMIC Sat Nov 9 17:58:29 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration: 6 servers

Describe the bug:
The helm pod performing the ingress upgrade fails with:

+ helm upgrade --set-string 'global.clusterCIDR=10.42.0.0/16\,xxx:8:2::/96' --set-string global.clusterCIDRv4=10.42.0.0/16 --set-string global.clusterCIDRv6=xxx:8:2::/96 --set-string 'global.clusterDNS=10.43.0.10\,xxx:8:3:0:a' --set-string global.clusterDomain=cluster.local --set-string global.rke2DataDir=/var/lib/rancher/rke2 --set-string 'global.serviceCIDR=10.43.0.0/16\,xxx:8:3::/112' --set-string global.systemDefaultIngressClass=ingress-nginx rke2-ingress-nginx /tmp/rke2-ingress-nginx.tgz --values /config/values-10_HelmChartConfig.yaml
Error: UPGRADE FAILED: template: rke2-ingress-nginx/templates/_helpers.tpl:266:14: executing "system_default_registry" at <.Values.global.systemDefaultRegistry>: nil pointer evaluating interface {}.global

if defaultBackend is enabled via values:

defaultBackend:
  enabled: true

Steps To Reproduce:

Download the chart rke2-ingress-nginx-4.12.003.tgz, untar it, set defaultBackend.enabled to true in the bundled values, and run helm template test . -f values.yaml.

Expected behavior:

The ingress controller upgrades successfully.

Actual behavior:

The upgrade fails with the template error above.

@brandond
Member

brandond commented Jan 30, 2025

Please show the value of your rke2-ingress-nginx HelmChartConfig valuesContent. It looks like something in there is clobbering the top-level global key from the chart default values. Here's what it should look like:
https://github.com/rancher/rke2-charts/blob/main/charts/rke2-ingress-nginx/rke2-ingress-nginx/4.12.003/values.yaml#L5-L10

global:
  image:
    # -- Registry host to pull images from.
    registry: registry.k8s.io
  systemDefaultRegistry: ""
  systemDefaultIngressClass: ""

It must either not exist in your values, or also be a map.

@xhejtman
Copy link
Author

It seems that .Values is nil, not .Values.global.

However, it also happens if I use only the bundled values and enable defaultBackend:

wget https://rke2-charts.rancher.io/assets/rke2-ingress-nginx/rke2-ingress-nginx-4.12.003.tgz
tar xf rke2-ingress-nginx-4.12.003.tgz
cd rke2-ingress-nginx
helm template test . -f values.yaml --set defaultBackend.enabled=true
Error: template: rke2-ingress-nginx/templates/_helpers.tpl:266:14: executing "system_default_registry" at <.Values.global.systemDefaultRegistry>: nil pointer evaluating interface {}.global

Use --debug flag to render out invalid YAML

These are my values (public address and DNS name hidden):

---
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-ingress-nginx
  namespace: kube-system
spec:
  valuesContent: |-
    controller:
      kind: Deployment
      extraArgs:
        enable-annotation-validation: true
      hostNetwork: false
      hostPort:
        enabled: false
      autoscaling:
        enabled: true
        minReplicas: 1
        maxReplicas: 2
      resources:
        requests:
          cpu: 2
          memory: 2200Mi
      service:
        enabled: true
        type: LoadBalancer
        loadBalancerIP: xxx
        allocateLoadBalancerNodePorts: false
        externalTrafficPolicy: Local
        annotations:
          loadbalancer.openstack.org/timeout-client-data: '1200001'
          loadbalancer.openstack.org/timeout-member-data: '1200001'
      metrics:
        port: 10254
        enabled: true
        service:
          annotations:
            prometheus.io/scrape: "true"
            prometheus.io/port: "10254"
      watchIngressWithoutClass: true
      ingressClassResource:
          default: true
      config:
          proxy-read-timeout: "1200"
          upstream-keepalive-timeout: "1200"
          custom-http-errors: "500,503"
          limit-req-status-code: "429"
          limit-conn-status-code: "429"
          worker-shutdown-timeout: "7200s"
      extraArgs:
        publish-status-address: xxx
    defaultBackend:
      enabled: true
      autoscaling:
        enabled: false
      image:
        repository: cerit.io/cerit/default-backend
        tag: "v1.1"
        readOnlyRootFilesystem: false

@brandond
Member

brandond commented Jan 31, 2025

Ah. I suspect that the template scope for the defaultBackend image helper was broken by the most recent chart update - cc @dereknola.

We should probably have a test that covers the defaultBackend; I don't think it sees much use, since generally all it did for most folks was send a static 404 page.

In the meantime you might try this:

    defaultBackend:
      enabled: true
      autoscaling:
        enabled: false
      image:
        repository: cerit.io/cerit/default-backend
        tag: "v1.1"
        readOnlyRootFilesystem: false
      global:
        systemDefaultRegistry: ""

@xhejtman
Author

We should probably have a test that covers the defaultBackend; I don't think it sees much use, since generally all it did for most folks was send a static 404 page.

Our use case is that it serves a better page than a plain 404, and it also handles 500 and 503 errors with pages explaining what happened and how it can be fixed.

In the meantime you might try this:

defaultBackend:
  enabled: true
  autoscaling:
    enabled: false
  image:
    repository: cerit.io/cerit/default-backend
    tag: "v1.1"
    readOnlyRootFilesystem: false
  global:
    systemDefaultRegistry: ""

That did not help. It seems that, for some reason, .Values is nil, so defining global does not actually help.

Even a fix like this in _helpers.tpl does not help: {{- if and .Values.global .Values.global.systemDefaultRegistry -}}

@dereknola dereknola self-assigned this Jan 31, 2025
@brandond
Member

Yeah, the root scope (.) is replaced by a merge of .Values.defaultBackend.image and .Values.global.image:

          {{- with (merge .Values.defaultBackend.image .Values.global.image) }}
          image: "{{ template "system_default_registry" . }}{{ template "repository_or_registry_and_image" .Values.defaultBackend.image }}"
          {{- end }}

This should do it until we can fix the chart in the February release:

    defaultBackend:
      enabled: true
      autoscaling:
        enabled: false
      image:
        repository: cerit.io/cerit/default-backend
        tag: "v1.1"
        readOnlyRootFilesystem: false
        Values:
          global:
            systemDefaultRegistry: ""
          defaultBackend:
            image:
              repository: cerit.io/cerit/default-backend
              tag: "v1.1"

@xhejtman
Author

xhejtman commented Feb 2, 2025

Yeah, the root scope (.) is replaced by a merge of .Values.defaultBackend.image and .Values.global.image:

          {{- with (merge .Values.defaultBackend.image .Values.global.image) }}
          image: "{{ template "system_default_registry" . }}{{ template "repository_or_registry_and_image" .Values.defaultBackend.image }}"
          {{- end }}

This should do it until we can fix the chart in the February release:

defaultBackend:
  enabled: true
  autoscaling:
    enabled: false
  image:
    repository: cerit.io/cerit/default-backend
    tag: "v1.1"
    readOnlyRootFilesystem: false
    Values:
      global:
        systemDefaultRegistry: ""
      defaultBackend:
        image:
          repository: cerit.io/cerit/default-backend
          tag: "v1.1"

Yes, this workaround does work.
