
Immediate release uninstall - Helm Operator #6244

Closed
dannyt-cx opened this issue Jan 10, 2023 · 8 comments

@dannyt-cx

Type of question

Open question

Question

What did you do?

I have a Helm-based operator (base 1.25.2) and an EKS 1.24 cluster.
I have two scenarios:

The Jenkins Scenario:

I install my Helm-based operator just fine, and it holds 3 CRs in total (I'm aware this is against the documented practice).
I then run `helm upgrade --install my-release --set rabbitmq.enabled=false -f custom-values.yaml`.

My CR is called Platform. It is basically just a few known charts that I use to create global Secrets and ConfigMaps for other services to use (i.e. the MinIO address, the RabbitMQ connection string, etc.).
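For context, here is a rough sketch of that CR. The apiVersion, kind, name, and namespace match the operator logs below; the spec fields are illustrative, not my exact values:

```yaml
# Rough sketch of the Platform CR (spec fields are illustrative).
apiVersion: ndp.com/v1
kind: Platform
metadata:
  name: ndp-platform
  namespace: ndp
spec:
  # With a Helm operator, the contents of spec are passed to the chart as values.
  rabbitmq:
    enabled: false
  minio:
    enabled: true
```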

What happens next puzzles me greatly. I see in the operator logs that it begins to watch: `"logger":"helm.controller","msg":"Watching dependent resource"`.
Those are the Helm objects it detects and watches; it also prints that they are owned by the CR I created.

I get this message as well:
{"level":"info","ts":1673373443.4492545,"logger":"helm.controller","msg":"Upgraded release","namespace":"ndp","name":"ndp-platform","apiVersion":"ndp.com/v1","kind":"Platform","release":"ndp-platform","force":false}

Immediately after that I get these messages, and no pods are created.
{"level":"error","ts":1673373443.870208,"msg":"Reconciler error","controller":"platform-controller","object":{"name":"ndp-platform","namespace":"ndp"},"namespace":"ndp","name":"ndp-platform","reconcileID":"8f2cafe7-33a6-446f-af61-e051d139df3f","error":"Operation cannot be fulfilled on platforms.ndp.com \"ndp-platform\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.2/pkg/internal/controller/controller.go:234"}

{"level":"info","ts":1673373444.9783742,"logger":"helm.controller","msg":"Uninstalled release","namespace":"ndp","name":"ndp-platform","apiVersion":"ndp.com/v1","kind":"Platform","release":"ndp-platform"} {"level":"info","ts":1673373445.014033,"logger":"helm.controller","msg":"Removing finalizer","namespace":"ndp","name":"ndp-platform","apiVersion":"ndp.com/v1","kind":"Platform","release":"ndp-platform"}

Now this scenario happens ONLY when running an install from Jenkins.

The Manual Scenario

I took the values I generate through Jenkins and applied them manually from my CLI on my laptop.
This time I see all the pods I need, plus all the ConfigMaps and other objects.
Here's where it gets funny: if I run `helm uninstall my-release`, which should remove all those pods and ConfigMaps, they are deleted, but then I see in the operator logs that it installs everything back again!

Without removing the operator pod, there's no way to remove the objects.

What did you expect to see?

  1. Generated resources persist and are not removed.
  2. The chart that utilizes the CR is successfully removed.

What did you see instead? Under which circumstances?

  1. No resources generated.
  2. Unable to remove resources.

Environment

Operator type:

/language helm

Kubernetes cluster type:
EKS 1.24

$ operator-sdk version

$ go version (if language is Go)

$ kubectl version

Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.4", GitCommit:"872a965c6c6526caa949f0c6ac028ef7aff3fb78", GitTreeState:"clean", BuildDate:"2022-11-09T13:28:30Z", GoVersion:"go1.19.3", Compiler:"gc", Platform:"darwin/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.8-eks-ffeb93d", GitCommit:"abb98ec0631dfe573ec5eae40dc48fd8f2017424", GitTreeState:"clean", BuildDate:"2022-11-29T18:45:03Z", GoVersion:"go1.18.8", Compiler:"gc", Platform:"linux/amd64"}

Additional context

Thank you kindly.

@openshift-ci openshift-ci bot added the language/helm Issue is related to a Helm operator project label Jan 10, 2023
@jberkhahn jberkhahn self-assigned this Jan 23, 2023
@jberkhahn jberkhahn added this to the Backlog milestone Jan 23, 2023
@jberkhahn (Contributor)

So, it seems like you're trying to run Helm commands manually against releases that have been created by your Helm operator. That's never going to work: without any update to the spec of the CR, the Helm operator controller will see that as divergent state and reconcile it back to the way it was.
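For instance, instead of running `helm upgrade` against the release, the supported flow is to change the CR spec and let the controller perform the upgrade. A minimal sketch, reusing the names from your logs (the spec field is illustrative):

```yaml
# Applying a spec change to the CR is what drives the operator's
# "Upgraded release" path; the field below is illustrative.
apiVersion: ndp.com/v1
kind: Platform
metadata:
  name: ndp-platform
  namespace: ndp
spec:
  rabbitmq:
    enabled: false  # flip a value here and `kubectl apply` instead of `helm upgrade --set ...`
```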

Why are you trying to interact with the releases directly rather than by modifying the CR? What's your specific use case?
Also, could you post your controller pod's logs from when you attempt these commands?

@jberkhahn jberkhahn added the triage/support Indicates an issue that is a support question. label Jan 23, 2023
@dannyt-cx (Author)

@jberkhahn Thank you for the reply!

  1. The thing is, I'm not running any command manually. My operator has 3 CRs: Platform, MicroService, and MicroFrontend. The Platform CR installs some third-party charts (DBs, Redis, and others) and also contains 2 CRs of the MicroService type, which basically gives me the entire set of backend services needed for any application-related MicroService CRs to run. This does involve some resource creation, like Jobs and Service modifications, as part of those third-party charts.

Maybe that's why it appears to be "manual". However, this happens solely when using a CD platform like Jenkins.

  2. In the second scenario, where I install the operator and apply my CR with its values, this uninstall doesn't happen.
  3. My actual use case is that I install the Platform CR and a subsequent Helm chart which contains my MicroService CRs (basically a chart of CRs; see the sketch below). I have many such charts due to the nature of the application.
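To illustrate what I mean by a "chart of CRs": the wrapping chart's templates render CR manifests, roughly like this (file name, fields, and values are hypothetical):

```yaml
# templates/microservice.yaml in the wrapping chart (hypothetical sketch):
# every rendered document is a MicroService CR for the operator to reconcile.
apiVersion: ndp.com/v1
kind: MicroService
metadata:
  name: {{ .Values.name }}
  namespace: {{ .Release.Namespace }}
spec:
  image: {{ .Values.image }}
```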

The logs I put in my original message are direct output from the controller pod; that's all it prints.

Thanks again!

@jberkhahn (Contributor)

> The thing is, I'm not running any command manually.

I'm not sure I follow, then. Where is that helm upgrade command coming from? Is Jenkins running it? If not, what is Jenkins doing? Creating instances of your Platform CR?

> In the second scenario, where I install the operator and apply my CR with its values, this uninstall doesn't happen.

So are you creating or deleting a CR? You say you create a CR, but you're expecting resources to be deleted?

@dannyt-cx (Author)

dannyt-cx commented Jan 28, 2023

In the Jenkins scenario, Jenkins is running the `helm upgrade --install` command. The chart that it installs contains my Platform CR.

I am creating a CR, and when I remove the chart that contains it with `helm uninstall`, the CRs are not removed at all. Instead, I see in the operator logs that it says "reconciled release/upgraded release".
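For reference, my expectation is based on the "Removing finalizer" log line above: the operator keeps an uninstall finalizer on the CR, so deleting the CR (which `helm uninstall` of the wrapping chart should do) ought to uninstall the release first. A sketch of the CR as stored in the cluster (the finalizer name is illustrative and may differ by operator-sdk version):

```yaml
# The CR as the operator maintains it; deleting it should trigger the
# release uninstall via this finalizer before the object disappears.
apiVersion: ndp.com/v1
kind: Platform
metadata:
  name: ndp-platform
  namespace: ndp
  finalizers:
    - helm.sdk.operatorframework.io/uninstall-release  # illustrative name
```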

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 29, 2023
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 29, 2023
@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci

openshift-ci bot commented Jun 29, 2023

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot closed this as completed Jun 29, 2023