Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for pulling bazel-published artifacts. #49884

Closed
wants to merge 1 commit into from

Conversation

pipejakob
Copy link
Contributor

What this PR does / why we need it:

This adds support to get-kube.sh to be able to fetch Kubernetes releases that were staged by bazel build jobs, which use a different prefix and versioning scheme than our other CI jobs. The new functionality is meant to be used by the kubeadm presubmit job in order to fetch e2e tests built by the bazel presubmit job (since the kubeadm presubmit only runs after the bazel presubmit succeeds).

In order to differentiate bazel builds from our normal CI builds, I added an artificial bazel/ prefix to the version specified, so that you would call this like:

$ KUBERNETES_RELEASE=bazel/v1.8.0-alpha.2.899+2c624e590f5670 get-kube.sh

Our bazel jobs upload artifacts in two path schemes. Postsubmit builds are stored as:

gs://kubernetes-release-dev/bazel/<version>

e.g.:

gs://kubernetes-release-dev/bazel/v1.8.0-alpha.2.899+2c624e590f5670

Presubmit builds are stored differently, in order to have a consistent location if the same merge happens at different times (producing a different commit SHAs, so we can't depend on the SemVer string):

gs://kubernetes-release-dev/bazel/<PR>/<baseBranch:baseCommit>,<PR>:<commit1>,<pull number>:<commit2>,...

e.g.:

gs://kubernetes-release-dev/bazel/49747/master:b341939d6d3666b119028400a4311cc66da9a542,49747:c4656c3d029e47d03b3d7d9915d79cab72a80852

The bazel build jobs have already been publishing artifacts using these schemes for a long time. This PR just adds support to be able to fetch them via get-kube.sh.

@ixdy

In order to differentiate them from our normal CI builds, I added an
artificial "bazel/" prefix to the version, so that you would call this
like:

  $ KUBERNETES_RELEASE=bazel/v1.8.0-alpha.2.899+2c624e590f5670 get-kube.sh

Our bazel jobs upload artifacts in two path schemes. Postsubmit builds
are stored as:

  gs://kubernetes-release-dev/bazel/<version>

e.g.:

  gs://kubernetes-release-dev/bazel/v1.8.0-alpha.2.899+2c624e590f5670

Presubmit builds are stored differently, in order to have a consistent
location if the same merge happens at different times (producing a
different commit SHAs, so we can't depend on the SemVer string):

  gs://kubernetes-release-dev/bazel/<PR>/<baseBranch:baseCommit>,<PR>:<commit1>,<pull number>:<commit2>,...

e.g.:

  gs://kubernetes-release-dev/bazel/49747/master:b341939d6d3666b119028400a4311cc66da9a542,49747:c4656c3d029e47d03b3d7d9915d79cab72a80852
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 31, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pipejakob
We suggest the following additional approver: eparis

Assign the PR to them by writing /assign @eparis in a comment when ready.

No associated issue. Update pull-request body to add a reference to an issue, or get approval with /approve no-issue

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. release-note-label-needed labels Jul 31, 2017
@ixdy
Copy link
Member

ixdy commented Jul 31, 2017

/assign

pipejakob added a commit to pipejakob/test-infra that referenced this pull request Jul 31, 2017
This will allow "kubetest --extract bazel/..." to operate as a
pass-through to get-kube.sh with the same version.

x-ref: kubernetes/kubernetes#49884
pipejakob added a commit to pipejakob/test-infra that referenced this pull request Jul 31, 2017
This will allow "kubetest --extract bazel/..." to operate as a
pass-through to get-kube.sh with the same version.

x-ref: kubernetes/kubernetes#49884
pipejakob added a commit to pipejakob/test-infra that referenced this pull request Jul 31, 2017
This will allow "kubetest --extract bazel/..." to operate as a
pass-through to get-kube.sh with the same version.

x-ref: kubernetes/kubernetes#49884
pipejakob added a commit to pipejakob/test-infra that referenced this pull request Jul 31, 2017
This will allow "kubetest --extract bazel/..." to operate as a
pass-through to get-kube.sh with the same version.

x-ref: kubernetes/kubernetes#49884
pipejakob added a commit to pipejakob/test-infra that referenced this pull request Jul 31, 2017
This will allow "kubetest --extract bazel/..." to operate as a
pass-through to get-kube.sh with the same version.

x-ref: kubernetes/kubernetes#49884
@pipejakob
Copy link
Contributor Author

/retest

@fejta
Copy link
Contributor

fejta commented Jul 31, 2017

Why do we need to distinguish bazel from non-bazel staged builds? That seems extremely odd

@@ -61,16 +61,24 @@ set -o errexit
set -o nounset
set -o pipefail

# TODO(pipejakob): Use dl.k8s.io/bazel once the redirection uses the correct bucket.
KUBERNETES_BAZEL_RELEASE_URL="${KUBERNETES_BAZEL_RELEASE_URL:-${KUBERNETES_RELEASE_URL:-https://storage.googleapis.com/kubernetes-release-dev/bazel}}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem like this var is necessary given that we already have KUBERNTES_RELEASE_URL and kubetest can use this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's definitely not necessary, but I was following the pattern that we have for the CI release url defaulting on the next line:

KUBERNETES_CI_RELEASE_URL="${KUBERNETES_CI_RELEASE_URL:-${KUBERNETES_RELEASE_URL:-https://dl.k8s.io/ci}}"

I was trying to keep the bazel-fetching implementation as close to possible as the existing CI fetching so there weren't any surprises for people who try to switch over, but I can also rip this out if you think it's not useful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this isn't https://dl.k8s.io/bazel ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there's a short explanation in the TODO comment, but to elaborate: right now, since /bazel is brand new, there is no explicit support for it in our dl.k8s.io handler, so visiting https://dl.k8s.io/bazel/blah incorrectly redirects to our normal release bucket (kubernetes-release) instead of the CI bucket (kubernetes-release-dev). There's an explicit pattern match so that https://dl.k8s.io/ci/blah redirects to our CI bucket, and I would like to add another one so that /bazel/* points to this as well. I wanted this PR to happen first to vet the idea of fetching these bazel versions, but if this gets merged, then I'll update our dl.k8s.io handler, and then update this code again to use the vanity URL after it gets deployed.

KUBE_RELEASE_VERSION_REGEX="^v(0|[1-9][0-9]*)\\.(0|[1-9][0-9]*)\\.(0|[1-9][0-9]*)(-([a-zA-Z0-9]+)\\.(0|[1-9][0-9]*))?$"
KUBE_CI_VERSION_REGEX="^v(0|[1-9][0-9]*)\\.(0|[1-9][0-9]*)\\.(0|[1-9][0-9]*)-([a-zA-Z0-9]+)\\.(0|[1-9][0-9]*)(\\.(0|[1-9][0-9]*)\\+[-0-9a-z]*)?$"
KUBE_BAZEL_VERSION_REGEX="^bazel/((v(0|[1-9][0-9]*)\\.(0|[1-9][0-9]*)\\.(0|[1-9][0-9]*)-([a-zA-Z0-9]+)\\.(0|[1-9][0-9]*)(\\.(0|[1-9][0-9]*)\\+[-0-9a-z]*)?)|(${KUBE_PULL_REQUEST_REGEX}/${KUBE_BRANCH_REGEX}:${KUBE_COMMIT_REGEX}(,${KUBE_PULL_REQUEST_REGEX}:${KUBE_COMMIT_REGEX})+))$"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it has more to do with whether or not we use branch:commit,pull:commit or v1.2.3+commit, not whether or not it is bazel

@@ -183,6 +193,9 @@ if [[ ${KUBE_VERSION} =~ ${KUBE_CI_VERSION_REGEX} ]]; then
# Override KUBERNETES_RELEASE_URL to point to the CI bucket;
# this will be used by get-kube-binaries.sh.
KUBERNETES_RELEASE_URL="${KUBERNETES_CI_RELEASE_URL}"
elif [[ ${KUBE_VERSION} =~ ${KUBE_BAZEL_VERSION_REGEX} ]]; then
KUBE_VERSION=${KUBE_VERSION#bazel/}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just add an export I_KNOW_WHAT_I_AM_DOING=true type option and expect kubetest to set KUBE_VERSION and KUBERNETES_RELEASE_URL correctly?

This seems more straight forward than all this convoluted parsing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds fine to me. I only jumped through all of these hoops because the existing pattern was very tight regex validation of the release string. If we add an option to disable the regex matching altogether, then that should be the only change that I need.

What do you think, @ixdy? You have a lot of commits to this script specifically to keep these regexes up to date and accurate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd tend to agree with respect to all the convoluted parsing, but someone(s) definitely went to a lot of work for all the regex validation. I like the idea of just having an escape hatch to use a specific URL without all of it though.

@ixdy
Copy link
Member

ixdy commented Aug 2, 2017

I've always disliked that we use a different directory structure for uploading the artifacts of dockerized builds and bazel builds.

I started working on unifying these in kubernetes/test-infra#2732, but that PR got stuck.

I can see the advantage of using the PULL_REFS value for PR builds, though I kinda wish prow had some way of passing information between various build stages, so we could use a slightly shorter identifier there.

@ixdy
Copy link
Member

ixdy commented Aug 3, 2017

cc @BenTheElder

@BenTheElder
Copy link
Member

BenTheElder commented Aug 3, 2017

The general idea I have right now is that prow run_after_success jobs should be able to know where the previous jobs output is in GCS and that jobs should be able to put out a metadata file to be consumed be dependent jobs, or something along those lines.
I haven't gotten a chance to start figuring out how feasible that is yet but we'd like to get the e2e jobs using something like this so that we can cut down on the number of builds required.

@pipejakob
Copy link
Contributor Author

Why do we need to distinguish bazel from non-bazel staged builds? That seems extremely odd

@fejta The technical reason right now is that they are stored in different paths, and get-kube.sh just knowing the version you want to fetch isn't enough to know which path to use to find it. Why they're stored in different paths, I assume, is because of our slow transition from docker builds to bazel builds and not wanting to clobber artifacts while we have CI jobs running for both. Two different jobs running against the same commit or pull request could generate the same SemVer string, so we need to use different paths to keep their artifacts separated. Also, my understanding is that our bazel builds still don't have full feature parity with our normal docker builds, so their artifacts might not be suitable for use for all of our different usecases, and job owners will probably want to opt into using those artifacts voluntarily rather than use them by default. This might change if/when bazel support is completely done and we cut over to depending on bazel builds only.

The general idea I have right now is that prow run_after_success jobs should be able to know where the previous jobs output is in GCS and that jobs should be able to put out a metadata file to be consumed be dependent jobs, or something along those lines.

I am 100% in favor of this direction (and have been asking @spxtr about it for a while :-P), but my immediate goal is to unblock enabling our kubeadm presubmit tests ASAP. Do you expect to have this kind of information piping available in prow soon?

Without the ability for the bazel job to inform the next step where to find important artifacts, is this at least a decent stopgap in the interim? We've already been living in the world where these two chained jobs need to stay in sync with a well-known location to store shared binaries. This PR (along with kubernetes/test-infra#3815) just makes it a little easier to do the fetching/extraction.

@BenTheElder
Copy link
Member

BenTheElder commented Aug 4, 2017 via email

@pipejakob
Copy link
Contributor Author

I think that's a different Jeff :-). I assume you meant @ixdy?

@BenTheElder
Copy link
Member

yes! haha, fixed.

@BenTheElder
Copy link
Member

/retest

@BenTheElder
Copy link
Member

BenTheElder commented Aug 7, 2017 via email

@pipejakob
Copy link
Contributor Author

Given the discussion here, I've created #50391 as an alternative to add a new option to disable this regex validation altogether.

hh pushed a commit to ii/kubernetes that referenced this pull request Aug 10, 2017
Automatic merge from submit-queue (batch tested with PRs 49725, 50367, 50391, 48857, 50181)

New get-kube.sh option: KUBERNETES_SKIP_RELEASE_VALIDATION

**What this PR does / why we need it**:
This is an alternative solution to kubernetes#49884. The goal is to be able to pull releases that were built by bazel jobs (both presubmit and postsubmit builds), which currently fail our regex validation against the version string.

This implementation is a simple "I know what I'm doing" breakglass option to turn regex validation off, whereas kubernetes#49884 was to extend our validation to support the new formats of bazel build jobs. I'm testing the waters to see if this is a more palatable solution.

**Release note**:

```release-note
New get-kube.sh option: KUBERNETES_SKIP_RELEASE_VALIDATION
```

CC @BenTheElder @fejta @ixdy
@pipejakob
Copy link
Contributor Author

Closing in favor of #50391.

@pipejakob pipejakob closed this Aug 10, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants