Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: prow.sh: update to kind 0.9, support Kubernetes 1.19 #493

Closed
wants to merge 2 commits into from

Conversation

pohly
Copy link
Contributor

@pohly pohly commented Oct 6, 2020

That release adds support for recent Kubernetes releases like 1.19 and
simplifies configuration of feature gates.

With Kubernetes 1.19, new feature gates were introduced which might
become relevant for the alpha Prow jobs.

/kind cleanup

What this PR does / why we need it:

Testing of the prow.sh change - do not merge.

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 6, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pohly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 6, 2020
override_k8s_version "1.16.15"
override_k8s_version "1.17.11"
override_k8s_version "1.18.8"
override_k8s_version "1.19.1"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overriding the K8S version was added in #313 because KinD only has images for certain versions.

But the recommendations in the KinD release notes go even further than that: it can happen that kindest/node:v1.19.1 becomes incompatible with kind v0.9 if the image needs to be rebuilt for, e.g. kind 1.0.

@msau42: should we switch to using images with hash, i.e. kindest/node:v1.19.1@sha256:98cf5288864662e37115e362b23e4369c8c4a408f99cbc06e58ac30ddc721600?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure let's use images with hash.

But I also want to move away from the model of specifying versions in prow.sh, so that we don't need to update prow.sh every time we need to add a new K8s version (although maybe unavoidable until kind apis stabilze more). I prefer to put the burden on the caller to make sure the kind version + k8s version specified are compatible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed it so that the script picks the right kind image based on the major/minor version specified by the job.

We will now get some minor version skew between E2E test (patch version picked by job) and Kubernetes cluster (patch version picked by kind). IMHO that is okay because the two should be compatible.

#
# If the version is prefixed with "release-", then nothing
# is overridden.
override_k8s_version "1.15.3"
override_k8s_version "1.13.12"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't support versions < 1.17 anymore

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is not too much work, then I prefer to keep the script a bit more flexible. Even Kubernetes-CSI no longer supports 1.16, some downstream user of a component might still do and then has to test themselves.

I'm not even sure whether Kubernetes-CSI never needs to test against 1.16. For example, external-provisioner 1.6.x is still supported, right? That was supported on Kubernetes 1.16 when it came out, so while it's perhaps a bit fuzzy, it might be expected that an updated 1.16.x still gets tested with it.

Backporting multiarch support to 1.6.x or bumping to a still supported Go version would be done by updating csi-release-tools in the release-1.6 branch to the latest csi-release-tools, because we avoid forking it. Then master of csi-release-tools still needs support for older Kubernetes.

# Kubernetes, as documented in the release notes of each
# kind release.
#
# We need to override CSI_PROW_KUBERNETES_VERSION
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already changed our CI jobs to explicitly specify the K8s version so we should update them there too: https://github.com/kubernetes/test-infra/blob/master/config/jobs/kubernetes-csi/gen-jobs.sh#L25

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the kind of mutual dependency between jobs and repos that I'd prefer to avoid.

If the jobs specify the exact Kubernetes version to be used by kind, then we may not be able to update kind in the repo because the newer kind doesn't have images for those Kubernetes versions.

If we then update the jobs first in preparation of a kind update, the jobs will fail until the repos also get updated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to work towards a goal where adding support for a new K8s version doesn't require updating the release framework to understand the new version. I'm looking at k/k and kubetest as an example. But maybe that is not achievable at the moment because of the kind image dependencies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, kind is the problem here. There's just no clean "API" for installing kind and configuring it for different Kubernetes versions.

We either have to have all of that in the component or in the job definition. In the latter case, the prow.sh would have to be given some helper script that complete replaces the current start_kubernetes. The downside then will be that it becomes much harder to try out changes to prow.sh locally because one has to provide that helper script.

# TODO: add new CSI_PROW_ALPHA_GATES_xxx entry for future Kubernetes releases and
# add new gates to CSI_PROW_E2E_ALPHA_GATES_LATEST.
configvar CSI_PROW_E2E_ALPHA_GATES_LATEST '' "alpha feature gates for latest Kubernetes"
configvar CSI_PROW_E2E_ALPHA_GATES_1_17 '' "alpha feature gates for Kubernetes 1.17"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't run alpha tests on versions < latest

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same argument about older Kubernetes applies here: some downstream user may want to do this.

Keeping the historic ALPHA_GATES definitions isn't such a big burden.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree. This repo is not intended to be used for anything other than CI for these repos. We don't test or support these older configurations and it's a maintenance burden for us if there are problems.

Downstream repos that consume this do so at their own risk and need to do their own maintenance if they fall out of the Kubernetes support window.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kind 0.9 adds support for recent Kubernetes releases like 1.19 and
simplifies configuration of feature gates and runtime config.

With Kubernetes 1.19, new feature gates were introduced which might
become relevant for the alpha Prow jobs.

The updated kind release comes with images for different Kubernetes
releases than the one before. To avoid breaking existing jobs,
the script now picks kind images automatically. As an additional
bonus, it then uses images with hash, i.e. it's less likely to break
if those image tags change because of a future kind release.
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 16, 2020
@pohly
Copy link
Contributor Author

pohly commented Oct 16, 2020

@msau42 I've pushed the updated commit, please take another look.

@pohly
Copy link
Contributor Author

pohly commented Oct 16, 2020

/test pull-kubernetes-csi-external-provisioner-alpha-1-19-on-kubernetes-master

Bazel makes sense in the Prow jobs because those often get invoked
with a pre-populated Bazel cache. But local invocation don't need it
and now can turn it off with CSI_PROW_USE_BAZEL=false.
@k8s-ci-robot
Copy link
Contributor

@pohly: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
pull-kubernetes-csi-external-provisioner-alpha-1-19-on-kubernetes-master d6402b1 link /test pull-kubernetes-csi-external-provisioner-alpha-1-19-on-kubernetes-master
pull-kubernetes-csi-external-provisioner-unit 3eb7fd2 link /test pull-kubernetes-csi-external-provisioner-unit

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@pohly
Copy link
Contributor Author

pohly commented Oct 26, 2020

/close

Merged via kubernetes-csi/csi-release-tools#107

@k8s-ci-robot
Copy link
Contributor

@pohly: Closed this PR.

In response to this:

/close

Merged via kubernetes-csi/csi-release-tools#107

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

pohly added a commit to pohly/external-health-monitor that referenced this pull request Oct 27, 2020
Commit 7bcee13d79012 added alpha feature gates for Kubernetes 1.19 in
the CSI_PROW_E2E_ALPHA_GATES_LATEST variable based on the comment in
kubernetes-csi/external-provisioner#493 (comment)
that alpha testing only runs for the latest Kubernetes.

But some components (like external-health-monitor) are configured with
a single Prow job which runs the default set of tests on a stable
Kubernetes release (currently 1.17). Those tests used to include alpha
testing which then broke during Kind cluster startup because the
Kubernetes 1.19 feature gates weren't recognized by 1.17.

The solution is to disable alpha testing for Kubernetes != latest in
the default set of tests.
pohly added a commit to pohly/external-provisioner that referenced this pull request Oct 27, 2020
Commit 7bcee13d79012 added alpha feature gates for Kubernetes 1.19 in
the CSI_PROW_E2E_ALPHA_GATES_LATEST variable based on the comment in
kubernetes-csi#493 (comment)
that alpha testing only runs for the latest Kubernetes.

But some components (like external-health-monitor) are configured with
a single Prow job which runs the default set of tests on a stable
Kubernetes release (currently 1.17). Those tests used to include alpha
testing which then broke during Kind cluster startup because the
Kubernetes 1.19 feature gates weren't recognized by 1.17.

The solution is to disable alpha testing for Kubernetes != latest in
the default set of tests.
pohly added a commit to pohly/csi-release-tools that referenced this pull request Oct 27, 2020
Commit 7bcee13 added alpha feature gates for Kubernetes 1.19 in
the CSI_PROW_E2E_ALPHA_GATES_LATEST variable based on the comment in
kubernetes-csi/external-provisioner#493 (comment)
that alpha testing only runs for the latest Kubernetes.

But some components (like external-health-monitor) are configured with
a single Prow job which runs the default set of tests on a stable
Kubernetes release (currently 1.17). Those tests used to include alpha
testing which then broke during Kind cluster startup because the
Kubernetes 1.19 feature gates weren't recognized by 1.17.

The solution is to disable alpha testing for Kubernetes != latest in
the default set of tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants