Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release 4.8.4 just disappeared? #10784

Closed
kbelosevic opened this issue Dec 20, 2023 · 23 comments
Closed

Release 4.8.4 just disappeared? #10784

kbelosevic opened this issue Dec 20, 2023 · 23 comments
Assignees
Labels
needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-priority triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@kbelosevic
Copy link

I had release 4.8.4 of helm chart deployed everywhere and in my local cache and now it is gone from repo, from github releases?

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-priority labels Dec 20, 2023
@Blueblazer172
Copy link

@kbelosevic i can confirm the same problem with pulumi.

error: kubernetes:helm.sh/v3:Release resource 'ingress-nginx': property chart value {ingress-nginx} has a problem: chart "ingress-nginx" version "4.8.4" not found in https://kubernetes.github.io/ingress-nginx repository; check the chart name and repository configuration.

@longwuyuan
Copy link
Contributor

longwuyuan commented Dec 20, 2023

cc @Gacko @strongjz

@longwuyuan
Copy link
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 20, 2023
@longwuyuan
Copy link
Contributor

#10782

@schowave
Copy link

What is a /triage and why was the release deleted? I cannot find any informations about that...

@longwuyuan
Copy link
Contributor

@schowave the story is long and complicated. Part of the reason is that that release caused problems. Some threads in #ingress-nginx-dev channel of slack have som info. The PR that made that change is linked above.

Kindly move to another release if a re-install is needed.

@strongjz
Copy link
Member

4.8.4 was released by mistake; it had changes that were incompatible with the controller and was not a patch release. We apologize for the confusion.

@strongjz
Copy link
Member

The way the helm release works, anyone can bump the chart number, and when it's merged, it runs the helm release CI. We need to figure out an improved way to do a chart release. The controller is easier since we have to do the image promotion, which requires 3 PRs and then a GitHub release to get released officially.

@strongjz strongjz self-assigned this Dec 20, 2023
@strongjz
Copy link
Member

I'll keep this issue open for a little while longer so others can comment and ask questions,

https://twitter.com/IngressNGINX/status/1737511143860511040?t=7mARrT7Zr7iL_bJoJI0NmA&s=19

@Gacko
Copy link
Member

Gacko commented Dec 20, 2023

/assign

@mkjpryor
Copy link

mkjpryor commented Dec 20, 2023

@longwuyuan @strongjz

Next time this happens, could we consider reverting the changes in a new bug fix release?

Just deleting releases really messes up people who update dependencies automatically when they become available using chores, whereas a new, fixed release would be picked up automatically.

@strongjz
Copy link
Member

it also causes other issues. we need a playbook for a roll forward and then remove older versions. Something to discuss in the new year in our community meetings.

@cayla
Copy link

cayla commented Dec 20, 2023

Heavy agreement with @mkjpryor. I use a promotion workflow where staging is always latest and gates the prod deploy until the staging deploy has been running successfully for a few days. This strikes my balance of keeping up but with sanity. I only even happened to notice this issue because I saw an errant flux warning about a release not being available in prod. It was very surprising to have something pulled after it was available for some time.

I appreciate this project and everyone’s hard work but imho releases should only move forward even if it’s a revert. Pulling releases introduces too many unknowns.

@sereinity
Copy link

Is it possible, to solve some issues for many users, to release 4.8.3 as 4.8.5?

@tamcore
Copy link
Contributor

tamcore commented Dec 21, 2023

Removing something "mistakenly released" after it has been out already for 19 days really is not the best idea.

@strongjz
Copy link
Member

We discussed this on the community call, whether to do a removal or roll forward; we made the wrong choice here, it happens as humans, and we know now that we should roll forward when/if this happens again. Thank you all for the feedback to help make us better.

As for the 19 days, one of the maintainers was sick, and the other two had work issues. We have added @Gacko as approver on the chart and @cpanato and @puerco as approvers for k8s.io and ingress. We are working on adding more approvers and maintainers of the controller and chart.

@Gacko
Copy link
Member

Gacko commented Dec 21, 2023

Maybe giving some background on what happened exactly also helps:

4.8.3 was out and stable for a while. It was forged on our release-1.9 branch and only selected & cherry-picked changes got into it. So each patch release of a minor version normally is not just a snapshot of the current state of our main branch, but - at best - a well prepared decision.

4.8.4 in comparison was mistakenly & partially released as part of a PR merged to main. What does this mean?

We have some automation around releasing controller & chart. Part of that is, as @strongjz already described, that once the version inside the Chart.yaml gets bumped and merged, a new Helm chart release is getting forged and published from the commit it got bumped in.

In the particular case of version 4.8.4 a contributor's PR bumped this version and got merged accidentally because we as reviewers & maintainers missed the according change in the Chart.yaml.

Since this PR got merged to main and not to the according release branch, the commit the release was forged from contained way more changes than you normally would put into a patch release. Apart from that, this release also did not come with several other resources, as for example the changelogs in the charts/ingress-nginx/changelog/ directory or the changes annotation in the Chart.yaml.

So to sum it up: 4.8.4 brought all the changes we had on main for a long time already, some of them really game changing, hopefully none of them breaking, but actually worth more than just a small patch release. Additionally it was missing proper changelogs.

We realized this really quick and started discussing about how to solve this issue. Rolling forward was also a possibility, but since we were already in the process of forging controller version 1.9.5 back then and due to how our automation currently works, we could not simply release a proper chart release only - neither a 4.8.5 nor a 4.9.0.

There have been several issues on our way to 1.9.5 we had to solve in our free time. This is probably not the best explanation for why it took 19 days to do something, but that's the current situation of this project and none of us is happy about it - trust me.

Please also keep in mind: Upgrading your deployment of the Ingress NGINX chart from 4.8.3 to 4.8.4 brought several improvements on many different parts of the chart. Some of them might be hard to revert once installed in your cluster since you might already have adopted to the new benefits.

In the end we realize simply removing 4.8.4 might not have been the best idea and we maybe should have left it at least until we release a 4.9.0 which more properly reflects the changes it brings than a simple patch release.

We are now looking forward to release controller 1.9.5 and chart 4.9.0 and will bring some effort to improving our release process and automation afterwards.

We are really sorry for the inconvenience this whole hick-hack brings to you and hope we can improve the whole situation in the near future!

@mstavrev
Copy link

mstavrev commented Dec 21, 2023

Thanks for the hard work and detailed explanation. It all makes sense now.

I have one question in relation to releases. I am myself maintaining a fork of the project that only patches the nginx server itself. However I still need to create the custom controller docker image with my custom nginx build.

Anyway, what I wanted to ask is why the TAG file content still v1.9.4 on main, the controller-1.9.5 and helm 4.9.0 branches? Isn't it supposed to be bumped to match what one would get upon building that branch?

@strongjz
Copy link
Member

Release branches are kicked off via tag to start the cloud build, and it had to be done on the release branch, so theres no issues since main release is different. So before we did release branches, it would always be done on main. The release branch generates all the docs, and we pull that into the main. So that tag was missed, and I'll fix that now.

@Gacko
Copy link
Member

Gacko commented Dec 22, 2023

I created a post mortem for this. We can work on and discuss improvements there. Please feel free to make concrete action items so we can maybe create sub-issues.

#10801

@Gacko
Copy link
Member

Gacko commented Dec 30, 2023

Closing this for now. Please see the above mentioned follow-up for further progress.

@Gacko
Copy link
Member

Gacko commented Dec 30, 2023

/close

@k8s-ci-robot
Copy link
Contributor

@Gacko: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-priority triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Archived in project
Development

No branches or pull requests