Chart breaks when upgrading from k8s 1.24 to 1.25 #794

Closed
MaxRink opened this issue Jan 11, 2023 · 10 comments

Comments

@MaxRink

MaxRink commented Jan 11, 2023

Describe the bug
Currently the chart doesn't survive an upgrade of k8s from 1.24.x to 1.25 because of PodSecurityPolicies:

Helm upgrade failed: unable to build kubernetes objects from current release manifest: resource mapping not found for name: "tridentoperatorpods" namespace: "" from "": no matches for kind "PodSecurityPolicy" in version "policy/v1beta1"
ensure CRDs are installed first

Environment
Provide accurate information about the environment to help us reproduce the issue.

  • Trident version: 22.10
  • Trident installation flags used: [e.g. -d -n trident --use-custom-yaml]
  • Container runtime: [e.g. Docker 19.03.1-CE]
  • Kubernetes version: [e.g. 1.15.1]
  • Kubernetes orchestrator: [e.g. OpenShift v3.11, Rancher v2.3.3]
  • Kubernetes enabled feature gates: [e.g. CSINodeInfo]
  • OS: [e.g. RHEL 7.6, Ubuntu 16.04]
  • NetApp backend types: [e.g. CVS for AWS, ONTAP AFF 9.5, HCI 1.7]
  • Other:

To Reproduce
Have the chart installed in a 1.24.x cluster.
Upgrade K8s to 1.25.
Try to upgrade or change the chart.

Expected behavior
The chart doesn't break.
Additional context
The root cause is

{{- if semverCompare "<1.25-0" .Capabilities.KubeVersion.GitVersion }}

Basically, Helm keeps track of the PSP and wants to remove it once this condition evaluates to false. But K8s no longer knows anything about that resource, so Helm fails.
The only ways to prevent this are to manually prevent PSPs from being created while still on 1.24 (which is bad, and most people will forget), or to automatically drop PSPs on 1.24 unless manually enabled, making sure the resource is deleted while the API still knows it.
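
To illustrate the failure mode: Helm rebuilds every object recorded in the current release manifest, and each one has to map to an API the cluster still serves. The following is a minimal Python sketch of that lookup, not Helm's actual code. The function name `unmappable_resources`, the `served` set, and the abridged manifest (including the Deployment entry) are assumptions made for illustration, and the parsing is deliberately naive.

```python
def unmappable_resources(manifest: str, served: set) -> list:
    """Return (apiVersion, kind) pairs in `manifest` that are not in `served`.

    Hypothetical helper approximating the lookup that fails with
    'no matches for kind ... in version ...'. Naive line-based parsing:
    it only looks at unindented `apiVersion:` and `kind:` keys.
    """
    missing = []
    for doc in manifest.split("\n---"):
        fields = {}
        for line in doc.splitlines():
            for key in ("apiVersion", "kind"):
                if line.startswith(key + ":"):
                    fields[key] = line.split(":", 1)[1].strip()
        if len(fields) == 2:
            pair = (fields["apiVersion"], fields["kind"])
            if pair not in served:
                missing.append(pair)
    return missing


# Abridged, illustrative manifest as Helm would have stored it on 1.24.
stored_manifest = """\
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: tridentoperatorpods
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: trident-operator
"""

# On 1.25 the policy/v1beta1 PodSecurityPolicy API is no longer served.
served_on_125 = {("apps/v1", "Deployment")}
print(unmappable_resources(stored_manifest, served_on_125))
```

The point of the sketch is that the check runs against the stored manifest from the previous release, so changing the chart values for the new release does not help by itself.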

@MaxRink MaxRink added the bug label Jan 11, 2023
@ffilippopoulos

Yes, we've run into the same problem. PSPs have been deprecated for a long time now; Trident should remove them completely from the upstream manifests.

ffilippopoulos added a commit to utilitywarehouse/system-manifests that referenced this issue Jan 13, 2023
Kube 1.25 doesn't accept PSP manifests any more. Removing manually until
trident repo catches up:
NetApp/trident#794
@gnarl gnarl added the tracked label Jan 20, 2023
@anchense

Something else to note: Kubernetes 1.25 needs the latest Trident version, 23.01.
https://github.com/NetApp/trident/releases/download/v23.01.0/trident-installer-23.01.0.tar.gz

@pvdputte

According to "Helm upgrade error after Kubernetes Upgrade to 1.25 with Trident installed", the upgrade to Trident version 23.01 removes the PodSecurityPolicies:

Solution
* The upgrade to Trident 23.01 will fix the issue, i.e. remove Trident's PodSecurityPolicies.
* Another way to fix the issue, staying with Trident 22.10.0, is to uninstall Trident with Helm and reinstall it with or without operator using tridentctl.

A few months ago I ran k8s 1.23 with trident 22.7.0
I first upgraded trident from 22.7.0 to 23.01.0
Next k8s from 1.23 => 1.24 => 1.25 => 1.26 without a hitch.

Now I'm trying to deploy trident 23.04.0 prior to k8s 1.27 but I'm still getting the same error as you 🤔

$ helm list -n trident
NAME                    NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                           APP VERSION
trident-operator        trident         2               2023-03-06 16:16:29.842913588 +0100 CET deployed        trident-operator-23.01.0        23.01.0

$ helm upgrade -n trident trident-operator netapp-trident/trident-operator --version 23.04.0
Error: UPGRADE FAILED: unable to build kubernetes objects from current release manifest: resource mapping not found for name: "tridentoperatorpods" namespace: "" from "": no matches for kind "PodSecurityPolicy" in version "policy/v1beta1"
ensure CRDs are installed first

@NA-Scott

On the helm upgrade line for Trident, add the parameter: --set excludePodSecurityPolicy=true

@pvdputte

I'm afraid I already tried that, after checking the release notes:
When upgrading a Kubernetes cluster from 1.24 to 1.25 or later that has Astra Trident installed, you must update values.yaml to set excludePodSecurityPolicy to true or add --set excludePodSecurityPolicy=true to the helm upgrade command before you can upgrade the cluster.

$ helm upgrade --set excludePodSecurityPolicy=true -n trident trident-operator netapp-trident/trident-operator --version 23.04.0 
Error: UPGRADE FAILED: unable to build kubernetes objects from current release manifest: resource mapping not found for name: "tridentoperatorpods" namespace: "" from "": no matches for kind "PodSecurityPolicy" in version "policy/v1beta1"
ensure CRDs are installed first

Could it be because I installed Trident while on 1.23, so the PSPs were there at the time, and Helm is now still looking for them because it wants to compare against the resources from the last time it upgraded Trident?
I installed trident 23.01 on k8s 1.23 prior to upgrading to 1.24, and never thought about excludePodSecurityPolicy later, so it was not set when I did 1.24 => 1.25.

@pvdputte

I was able to fix it by editing the sh.helm.release.v1.trident-operator.v# secret's data.release contents.

More specifically, I think I removed these parts from it:

{"name":"templates/podsecuritypolicy.yaml","data":"e3st...IH19Cg=="},

as well as

# Source: trident-operator/templates/podsecuritypolicy.yaml\napiVersion: policy/v1beta1\nkind: PodSecurityPolicy\nmetadata:\n  name: tridentoperatorpods\n  labels:\n    app: operator.trident.netapp.io\nspec:\n  privileged: false\n  seLinux:\n    rule: RunAsAny\n  supplementalGroups:\n    rule: RunAsAny\n  runAsUser:\n    rule: RunAsAny\n  fsGroup:\n    rule: RunAsAny\n  volumes:\n    - projected\n---\n

Perhaps the latter one would have been enough.

After replacing data.release in the secret with the updated version (i.e. after applying gzip -c | base64 | base64 -w0 to the data again) I could upgrade successfully.

$ helm upgrade --set kubeletDir=/var/lib/k0s/kubelet --set excludePodSecurityPolicy=true -n trident trident-operator netapp-trident/trident-operator --version 23.04.0
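
For anyone who wants to script the secret edit described above, here is a rough Python sketch under some assumptions: a Helm v3 release secret whose `data.release` value is JSON that has been gzipped, base64-encoded by Helm, and base64-encoded again by Kubernetes. The function names and the sample release (including the `templates/deployment.yaml` entry and the `<elided>` placeholders) are made up for illustration. Reading and writing the secret itself is left out, and backing up the secret before changing it would be wise.

```python
import base64
import gzip
import json


def decode_release(secret_value: str) -> dict:
    """Decode data.release: base64 (Kubernetes), base64 (Helm), gzip, JSON."""
    helm_b64 = base64.b64decode(secret_value)
    return json.loads(gzip.decompress(base64.b64decode(helm_b64)))


def encode_release(release: dict) -> str:
    """Inverse of decode_release: JSON, gzip, base64, base64."""
    gz = gzip.compress(json.dumps(release).encode("utf-8"))
    return base64.b64encode(base64.b64encode(gz)).decode("ascii")


def strip_psp(release: dict) -> dict:
    """Drop the PodSecurityPolicy from the stored manifest and chart templates."""
    docs = release["manifest"].split("\n---\n")
    kept = [
        d for d in docs
        if not any(line.rstrip() == "kind: PodSecurityPolicy" for line in d.splitlines())
    ]
    release["manifest"] = "\n---\n".join(kept)
    templates = release.get("chart", {}).get("templates")
    if templates is not None:
        release["chart"]["templates"] = [
            t for t in templates
            if t.get("name") != "templates/podsecuritypolicy.yaml"
        ]
    return release


# Tiny stand-in for a real release payload (illustrative values only).
sample = {
    "name": "trident-operator",
    "chart": {"templates": [
        {"name": "templates/podsecuritypolicy.yaml", "data": "<elided>"},
        {"name": "templates/deployment.yaml", "data": "<elided>"},
    ]},
    "manifest": (
        "---\n# Source: trident-operator/templates/podsecuritypolicy.yaml\n"
        "apiVersion: policy/v1beta1\nkind: PodSecurityPolicy\n"
        "metadata:\n  name: tridentoperatorpods\n"
        "---\n# Source: trident-operator/templates/deployment.yaml\n"
        "apiVersion: apps/v1\nkind: Deployment\n"
    ),
}

fixed = decode_release(encode_release(strip_psp(sample)))
print([t["name"] for t in fixed["chart"]["templates"]])
print("PodSecurityPolicy" in fixed["manifest"])
```

The sketch removes both the template entry and the manifest document, mirroring the two parts listed above, even though removing only the manifest document might have been enough.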

@JaslynnWangTR

Do we have any update on this? We are seeing a similar issue. I believe there is already a bug ticket for it: #819

@gr33npr

gr33npr commented Aug 1, 2023

It would be nice to have an update or a bugfix release. It's nice that the operator will delete the PSPs during an update, but unfortunately K8s distributions like OpenShift and RKE will not let you start an update to a version with K8s > 1.25 until there are no existing PSPs in the cluster.

@temirg

temirg commented Aug 8, 2023

Hello all,
I would recommend the Rancher article:
https://ranchermanager.docs.rancher.com/how-to-guides/new-user-guides/authentication-permissions-and-global-configuration/pod-security-standards#cleaning-up-releases-after-a-kubernetes-v125-upgrade

Already tested:
export KUBECONFIG=...
helm -n trident list
NAME NAMESPACE REVISION CHART APP VERSION
trident-operator-22-1680184337 trident 4 trident-operator-23.04.0 23.04.0

helm mapkubeapis --dry-run -n trident trident-operator-22-1680184337
and without "--dry-run":
helm mapkubeapis -n trident trident-operator-22-1680184337

Then upgrade to the same version but with "--set excludePodSecurityPolicy=true", or
get the Helm chart and images for trident-operator-23.07.0 and upgrade to 23.07.0.

Regards, temirg.

@torirevilla
Contributor

Trident is updating the documentation to use the "--set excludePodSecurityPolicy=true" flag in Helm when upgrading.
Kubernetes 1.25 will be the minimum supported version for the 24.10 release of Trident.
Closing the issue.
