
Github bot created bad release 1.7.3 #10749

Closed
ader1990 opened this issue Jun 11, 2024 · 19 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-priority Indicates an issue lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@ader1990
Contributor

ader1990 commented Jun 11, 2024

What steps did you take and what happened?

https://github.com/kubernetes-sigs/cluster-api/releases/tag/v1.7.3 does not contain any binary / yamls.

To give more context, this situation breaks the `clusterctl init` command: by default, the command looks for the latest release, currently finds the 1.7.3 release on GitHub, and then tries to download files that are not among the release download links (as they currently don't exist).

What did you expect to happen?

https://github.com/kubernetes-sigs/cluster-api/releases/tag/v1.7.3 to contain all binary / yamls like https://github.com/kubernetes-sigs/cluster-api/releases/tag/v1.7.2 does.

Cluster API version

None.

Kubernetes version

No response

Anything else you would like to add?

No response

Label(s) to be applied

/kind bug
One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. needs-priority Indicates an issue lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 11, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@chrischdi
Member

Could you please share which clusterctl version you are using?

This seems to be this bug:

Which should have been solved in:

which should be part of clusterctl versions >= v1.7.0 :-)

A similar fix was done for the e2e test module which is part of v1.7.3.

@vishalanarase
Member

Just followed the release process document. I will discuss this issue in the release call.

@chrischdi
Member

chrischdi commented Jun 11, 2024

Note: v1.7.3 is not yet released, which is why the binaries/yamls are not visible yet. The GitHub action did actually work, but the release is not published yet.

@sbueringer
Member

Yeah, everything is fine with the release. It's just in progress. The only open question here is whether clusterctl works correctly.

@ader1990
Contributor Author

Just followed the release process document. I will discuss this issue in the release call.

Thank you. I saw that previous releases were done from a PR and the binaries/yamls appeared on the release page after 15-20 minutes (almost 5 hours have passed now). I can close this issue if the release simply hasn't finished yet.

The term "release" can be confusing, as the merged PR actually created a commit to be tagged, not a release. I also saw that no actions were in progress, which is why I created the issue.

@ader1990
Contributor Author

Yeah, everything is fine with the release. It's just in progress. The point here is just about if clusterctl works correctly or not.

I can close the issue if the "release" was properly created by the GitHub bot process and the long delay is expected rather than a pending problem.

@sbueringer
Member

The reason is that cutting a release is much more than what the GitHub bot is doing. Please see: https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/main/docs/release/release-tasks.md

What Christian is talking about is that clusterctl >= v1.7.0 should be able to handle this. It should just use v1.7.2.

Are you using clusterctl >= v1.7.0?

@sbueringer
Member

So to clarify:

  • it is expected that the release is not entirely done yet
  • it is not expected that clusterctl init fails (if you use clusterctl >= v1.7.0)

@sbueringer
Member

This is what clusterctl v1.7.0 is doing on my env

/clusterctl init --infrastructure docker
Fetching providers
Installing cert-manager Version="v1.14.4"
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v1.7.2" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v1.7.2" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v1.7.2" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-docker" Version="v1.7.2" TargetNamespace="capd-system"

Your management cluster has been initialized successfully!

You can now create your first workload cluster by running the following:

  clusterctl generate cluster [name] --kubernetes-version [version] | kubectl apply -f -


New clusterctl version available: v1.7.0 -> v1.7.3
sigs.k8s.io/cluster-api

@ader1990
Contributor Author

ader1990 commented Jun 11, 2024

So to clarify:

  • it is expected that the release is not entirely done yet
  • it is not expected that clusterctl init fails (if you use clusterctl >= v1.7.0)

Thank you for the explanation @sbueringer. It took me a while to test things out, since I had to wait around 4 minutes for clusterctl to run (it had to time out or retry the HTTP requests):

  • it is expected that the release is not entirely done yet -> I understand now. A suggestion, maybe: have a release-tracking issue, as other projects do, and close it only once the FULL release is produced (all steps followed)? I would not have opened this issue if there hadn't been a PR whose name contains the term release. I had first looked at the previous releases and saw that the yamls/binaries were usually added to the release page after around 15 minutes.
  • it is not expected that clusterctl init fails (if you use clusterctl >= v1.7.0) -> I was using clusterctl version 1.3 and have now updated to 1.7.2. After that, I ran with the new clusterctl version:
time clusterctl init --infrastructure tinkerbell -v 5
Using configuration File="~/.cluster-api/clusterctl.yaml"
Fetching providers
error using httpGet to get file from GitHub releases, falling back to github client: error getting file, status code: 404 owner="kubernetes-sigs" repository="cluster-api" version="v1.7.3" path="metadata.yaml"
Potential override file SearchFile="~/.config/cluster-api/overrides/cluster-api/v1.7.2/core-components.yaml" Provider="cluster-api" Version="v1.7.2"
Fetching File="core-components.yaml" Provider="cluster-api" Type="CoreProvider" Version="v1.7.2"
error using httpGet to get file from GitHub releases, falling back to github client: error getting file, status code: 404 owner="kubernetes-sigs" repository="cluster-api" version="v1.7.3" path="metadata.yaml"
Potential override file SearchFile="~/.config/cluster-api/overrides/bootstrap-kubeadm/v1.7.2/bootstrap-components.yaml" Provider="bootstrap-kubeadm" Version="v1.7.2"
Fetching File="bootstrap-components.yaml" Provider="kubeadm" Type="BootstrapProvider" Version="v1.7.2"
error using httpGet to get file from GitHub releases, falling back to github client: error getting file, status code: 404 owner="kubernetes-sigs" repository="cluster-api" version="v1.7.3" path="metadata.yaml"
Potential override file SearchFile="~/.config/cluster-api/overrides/control-plane-kubeadm/v1.7.2/control-plane-components.yaml" Provider="control-plane-kubeadm" Version="v1.7.2"
Fetching File="control-plane-components.yaml" Provider="kubeadm" Type="ControlPlaneProvider" Version="v1.7.2"
Potential override file SearchFile="~/.config/cluster-api/overrides/infrastructure-tinkerbell/v0.4.0/infrastructure-components.yaml" Provider="infrastructure-tinkerbell" Version="v0.4.0"
Fetching File="infrastructure-components.yaml" Provider="tinkerbell" Type="InfrastructureProvider" Version="v0.4.0"

...
real    3m5.752s
user    0m2.069s
sys     0m0.342s

clusterctl version
clusterctl version: &version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.2", GitCommit:"a5898a2f63b8d3c556f192b76794b5b67900be5d", GitTreeState:"clean", BuildDate:"2024-05-14T16:09:34Z", GoVersion:"go1.21.9", Compiler:"gc", Platform:"linux/arm64"}

New clusterctl version available: v1.7.2 -> v1.7.3
sigs.k8s.io/cluster-api

@ader1990
Contributor Author

As I was using an old version of clusterctl, my issue is invalid. Still, it would be great to have an issue or discussion/project tracking the release that gets closed only when everything has been completed.

@sbueringer
Member

sbueringer commented Jun 11, 2024

Thx for the feedback!

I'll re-open. I think we should look into the 4 min delay. I saw something similar on my machine. Just wasn't sure if it's specific to my env

/reopen

(I'll leave the point about having an issue etc. up to the release team)

@k8s-ci-robot
Contributor

@sbueringer: Reopened this issue.

In response to this:

Thx for the feedback!

I'll re-open. I think we should look into the 4 min delay. I saw something similar on my machine. Just wasn't sure if it's specific to my env

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot reopened this Jun 11, 2024
@sbueringer
Member

(cc/fyi @chrischdi regarding the delay)

@chrischdi
Member

Just checked again: The following PR:

has an additional fix which improves on the timeout.

This was backported to v1.7 and gets released in clusterctl v1.7.3.

@sbueringer
Member

Tested this with the partially-released CAPV v1.10.1 and with clusterctl v1.7.3 it was falling back in a reasonable timeframe to v1.10.0.

I'll close this issue then
/close

My take on release process changes. We already have a lot of work, especially for the release team (see: https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/release/release-tasks.md). I think it's not worth the effort to add additional work by having to open an issue for every release that we do.

I'll prefer having to answer/close a few issues if folks are using old clusterctl versions & if the release process takes a bit longer. That's a lot less work in the long run.

@k8s-ci-robot
Contributor

@sbueringer: Closing this issue.

In response to this:

Tested this with the partially-released CAPV v1.10.1 and with clusterctl v1.7.3 it was falling back in a reasonable timeframe to v1.10.0.

I'll close this issue then
/close

My take on release process changes. We already have a lot of work, especially for the release team (see: https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/release/release-tasks.md). I think it's not worth the effort to add additional work by having to open an issue for every release that we do.

I'll prefer having to answer/close a few issues if folks are using old clusterctl versions & if the release process takes a bit longer. That's a lot less work in the long run.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ader1990
Contributor Author

Tested this with the partially-released CAPV v1.10.1 and with clusterctl v1.7.3 it was falling back in a reasonable timeframe to v1.10.0.

I'll close this issue then /close

My take on release process changes. We already have a lot of work, especially for the release team (see: https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/release/release-tasks.md). I think it's not worth the effort to add additional work by having to open an issue for every release that we do.

I'll prefer having to answer/close a few issues if folks are using old clusterctl versions & if the release process takes a bit longer. That's a lot less work in the long run.

Thank you for the info and help, much appreciated!
