Add arm64/v8 support for krte image #24783

Closed
wants to merge 1 commit into from

ruquanzhao
Member

How about we add multi-arch support of krte?

Signed-off-by: Ruquan Zhao <ruquan.zhao@arm.com>

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 5, 2022
@k8s-ci-robot
Contributor

Hi @ruquanzhao. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 5, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ruquanzhao
To complete the pull request process, please assign amwat after the PR has been reviewed.
You can assign the PR to them by writing /assign @amwat in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from amwat and spiffxp January 5, 2022 09:30
@k8s-ci-robot k8s-ci-robot added area/images sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Jan 5, 2022
@ruquanzhao
Member Author

cc @chendave

@chendave
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 13, 2022
Member

@chendave left a comment

Have you verified with gcb?

@@ -1 +0,0 @@
../kubekins-e2e/variants.yaml
Member

Can you please explain why you removed this link and created a new one later?

Member Author

I noticed the bootstrap image is DEPRECATED, and kubekins-e2e is based on bootstrap.
So I thought it would be better to maintain two variants.yaml files.

Member

this is reverting a previous intentional change (which admittedly I didn't like), but is completely orthogonal to this PR. this change, if merged, should be a different PR.

I don't think arm64 should be a variant anyhow. we should just make the tags multi-arch if we're going to do this.

Member

using bootstrap directly is deprecated, but kubekins-e2e is the majority of kubernetes CI.

Member

@BenTheElder thanks for the comments! let's see if we can make this change more clear to follow your suggestion!

@@ -1,23 +1,29 @@
timeout: 1800s
steps:
- name: gcr.io/cloud-builders/docker
- name: gcr.io/k8s-testimages/gcb-docker-gcloud
Member

Is this the new way to build the multi-arch image?

Member Author

@chendave
Member

There is a series of patches to make test-infra/images support multi-arch; see the changes related to issue #16588.

Per our evaluation, the krte image is one of the pieces that cannot be skipped.

Could you take a look at this and let us know your thoughts?

@aojea @BenTheElder @spiffxp

@ruquanzhao
Member Author

ruquanzhao commented Jan 14, 2022

Have you verified with gcb?

Yes. And here is the log of arm64.

Step #0: #9 DONE 226.3s
Finished Step #0
Starting Step #1
Step #1: Already have image (with digest): gcr.io/k8s-testimages/gcb-docker-gcloud
Step #1: Created [asia-southeast1-docker.pkg.dev/arm-nwcs-ci-dev/arm64-k8s-ci/krte:latest-arm64].
Finished Step #1
PUSH
DONE

Full build log in https://pastebin.ubuntu.com/p/4WSskkNXg3/

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 20, 2022
- --build-arg=UPGRADE_DOCKER_ARG=$_UPGRADE_DOCKER
- --build-arg=IMAGE_ARG=gcr.io/$PROJECT_ID/krte:$_GIT_TAG-$_CONFIG
- .
- build
Member

Is the indent here and below necessary?

Member Author

Thanks for your review!
It's not necessary; I have removed it.

Signed-off-by: Ruquan Zhao <ruquan.zhao@arm.com>
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 20, 2022
@@ -30,6 +36,3 @@ substitutions:
_CFSSL_VERSION: R1.2
options:
substitution_option: ALLOW_LOOSE
images:
- 'gcr.io/$PROJECT_ID/krte:$_GIT_TAG-$_CONFIG'
- 'gcr.io/$PROJECT_ID/krte:latest-$_CONFIG'
Member

Why were those lines removed?

Member Author

refer to #22444

@ruquanzhao
Member Author

The dependency on variants.yaml has been removed.
Would you mind taking a look?
/cc @BenTheElder

@BenTheElder
Member

BenTheElder commented Jan 22, 2022

How about we add multi-arch support of krte?

I'm still missing the why, I mean we could build for arbitrarily many architectures but building them is not free, and this image is used to test kind on Kubernetes's CI, and isn't supported for other users. What's the use case?

I don't want to add build time and complexity and pay to store more image contents without a concrete use case, most of these CI images are amd64 because SIG k8s infra provides amd64 clusters to host CI and these images are just used as the project's CI environment.

See also the image's readme: https://github.com/kubernetes/test-infra/blob/master/images/krte/README.md#warning

There is a series of patches to make test-infra/images support multi-arch; see the changes related to issue #16588.

OK, I see that these PRs exist, but in #16588 the conversation settled that while redhat needed the core prow components (the "pod utilities") to be multi-arch, there was no decision regarding the rest.

My experience so far has been that making docker images multi-arch "just because" causes CI flakes and much slower builds due to the penalty of building under qemu, and costs far more energy than the initial effort put in to port it with essentially no reward for the upstream project.

In the main kubernetes repo we are not accepting new architectures without a KEP.

Is there a plan to run multi-arch CI within the Kubernetes project? Or some other reason SIG testing should continue to port everything?

@@ -72,7 +73,7 @@ RUN echo "Installing Packages ..." \
unzip \
&& rm -rf /var/lib/apt/lists/* \
&& echo "Installing Go ..." \
&& export GO_TARBALL="go${GO_VERSION}.linux-amd64.tar.gz"\
&& export GO_TARBALL="go${GO_VERSION}.linux-${TARGETARCH:-amd64}.tar.gz"\
Member

nit: can we just default this once with ARG TARGETARCH=amd64 and simplify here and below?
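
The suggestion above could look roughly like the following. This is a minimal sketch, not the krte Dockerfile: the base image and the `GO_VERSION` value are placeholders, and it assumes that BuildKit/buildx supplies `TARGETARCH` for the requested platform, with the declared default only taking effect when the builder does not set it.

```dockerfile
# Sketch only: base image and Go version are placeholders, not taken from the PR.
FROM debian:bullseye

# Declare TARGETARCH once with amd64 as the fallback.
# Assumption: BuildKit/buildx overrides this default with the
# architecture of the platform being built.
ARG TARGETARCH=amd64
ARG GO_VERSION=1.17.6

# Later steps can use ${TARGETARCH} directly, without repeating ":-amd64".
RUN export GO_TARBALL="go${GO_VERSION}.linux-${TARGETARCH}.tar.gz" \
    && echo "would download ${GO_TARBALL}"
```

With a single defaulted `ARG`, every later reference in the file stays short and the fallback lives in one place.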

args:
- build
- --tag=gcr.io/$PROJECT_ID/krte:$_GIT_TAG-$_CONFIG
- --platform=linux/amd64,linux/arm64/v8
Member

possibly something we should parameterize
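
One way to parameterize the platform list is a Cloud Build user-defined substitution. The fragment below is a sketch under assumptions: `_PLATFORMS` is a hypothetical name that does not appear in the PR, and the other steps, arguments, and substitutions of the real cloudbuild.yaml are omitted.

```yaml
# Fragment only; remaining steps and substitutions are left out.
steps:
- name: gcr.io/k8s-testimages/gcb-docker-gcloud
  args:
  - build
  - --tag=gcr.io/$PROJECT_ID/krte:$_GIT_TAG-$_CONFIG
  # _PLATFORMS is a hypothetical substitution introduced for illustration.
  - --platform=$_PLATFORMS
substitutions:
  # Default value; a build trigger could override it, e.g. with linux/amd64 only.
  _PLATFORMS: linux/amd64,linux/arm64/v8
```

Keeping the default in `substitutions` would let an amd64-only build skip the emulated arm64 build without editing the step itself.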

@chendave
Member

Since @ruquanzhao is on vacation, please let me try to reply. This is based on what I heard from his early investigation, so it might not be fully correct.

I'm still missing the why, I mean we could build for arbitrarily many architectures but building them is not free, and this image is used to test kind on Kubernetes's CI, and isn't supported for other users. What's the use case?

Based on what @ruquanzhao told me, we need this to provide a containerized environment for test execution. So, why do we need this change?

  1. ARM servers and instances are no longer rare on the market. Upstream projects such as kubevirt and kubeedge have ARM CI in place alongside x86_64, and projects like kubevirt also use the Prow framework for testing. Currently, kubevirt's ARM CI runs periodically on bare metal because Prow and utilities like krte lack ARM support.

Whether we need ARM CI coverage for k8s is another topic, and it's entirely the community's call, but we can make this utility work for multi-arch so that such projects don't have to struggle with a containerized environment.

  2. We maintain an internal CI infrastructure to track whether there are any issues upstream, so that we can fix them promptly. We hope these utilities are also supported on arm64 upstream, so that we stay more aligned with upstream.

In the main kubernetes repo we are not accepting new architectures without a KEP.

Sure, we can provide a KEP if one is needed; we might first need consensus on how far we should go.

Is there a plan to run multi-arch CI within the Kubernetes project?

Of course, we'd love to.
We might start with minimal coverage (conformance?) and treat it as an experimental reference. This would benefit cloud providers like AWS and other providers that plan to offer ARM instances.
And if it is treated as an experiment, we can provide bare-metal machines.

Again, it's the community's call to decide where we should go.

@ruquanzhao @BenTheElder correct me?

@BenTheElder
Member

We are plagued by timeouts in the image build today from a very lengthy build in these images as-is supporting all the necessary Kubernetes environments in CI, such as this currently most recent build.

ARM servers and instances are no longer rare on the market. Upstream projects such as kubevirt and kubeedge have ARM CI in place alongside x86_64, and projects like kubevirt also use the Prow framework for testing. Currently, kubevirt's ARM CI runs periodically on bare metal because Prow and utilities like krte lack ARM support.

I don't see how these are related, the KRTE image is the environment used for prow.k8s.io to test KIND (again, please read the README for KRTE).

Kubernetes can be tested on any architecture without this image. And there is no ARM CI in the Kubernetes project, certainly not in the KIND project, for which this image is supported only.

Whether we need ARM CI coverage for k8s is another topic, and it's entirely the community's call, but we can make this utility work for multi-arch so that such projects don't have to struggle with a containerized environment.

Again, this image is only supported for the purposes of testing KIND in the project's CI, we do not have the maintainer bandwidth to support other things with this, the KIND project is already overloaded as-is. So it is not another topic. This image is an implementation detail of the KIND project's CI and is expressly not supported for other purposes.

krte image

krte - KIND RunTime Environment

This image contains things we need to run kind in Kubernetes CI, and
is maintained for the sole purpose of testing Kubernetes with KIND.

WARNING

This image is not supported for other use cases. Use at your own risk.

https://github.com/kubernetes/test-infra/blob/ae487b7658048076ea6cedee6408421d6b642f30/images/krte/README.md

@chendave
Member

@BenTheElder thanks for the detailed explanation!

I was thinking that if we need to test Kubernetes with KIND on ARM, we would need this; maybe I am wrong.

@ruquanzhao could you please take a look at @BenTheElder's comments when you are back? Thanks!

@aojea
Member

aojea commented Jan 26, 2022

I was thinking that if we need to test Kubernetes with KIND on ARM, we would need this; maybe I am wrong.

The problem is maintaining the image after this merges. Our last experience was very disappointing: people added changes to support ARM, after some months nobody kept working on it, CI kept failing forever, and we ended up with more code to maintain.

kubernetes-sigs/kind#188

I don't want to add build time and complexity and pay to store more image contents without a concrete use case, most of these CI images are amd64 because SIG k8s infra provides amd64 clusters to host CI and these images are just used as the project's CI environment.

I'm with Ben on this. If people want to add a CI for kind on ARM, I'm happy to support them, answer questions, and so on. The image can be copied, and new jobs can be added in parallel, but people have to maintain their own CI, not at the cost of putting more load on us. This can also be reviewed in the future, and if there is more alignment, it is just a matter of merging things ...

@ruquanzhao
Member Author

Just got back from a long vacation.

Thank you all for the comments and reviews! @BenTheElder @aojea @chendave

Again, this image is only supported for the purposes of testing KIND in the project's CI, we do not have the maintainer bandwidth to support other things with this, the KIND project is already overloaded as-is. So it is not another topic. This image is an implementation detail of the KIND project's CI and is expressly not supported for other purposes.

We (@chendave and I) are deploying k8s CI with Prow on arm locally and trying to align with upstream, and we are preparing to make it public in the future if possible. We found that prow.k8s.io uses kubekins-e2e and krte for most k8s jobs, so I am wondering if we could add arm64 support to kubekins-e2e and krte.

Kubernetes can be tested on any architecture without this image.

Yes, indeed. Kubernetes can be tested on any architecture without krte. We need this because we want to test k8s with KIND, or else we must do it on bare-metal. Since we are using Prow, working with containers is a more elegant and maintainable way.

I'm with Ben on this. If people want to add a CI for kind on ARM, I'm happy to support them, answer questions, and so on. The image can be copied, and new jobs can be added in parallel, but people have to maintain their own CI, not at the cost of putting more load on us. This can also be reviewed in the future, and if there is more alignment, it is just a matter of merging things ...

We have dedicated resources to work on and maintain such things if ARM CI is acceptable for KIND or K8s, so this should no longer be a problem. Thanks again for your comments! @aojea

Hope to hear your opinion! @BenTheElder

@BenTheElder
Member

Please reach out to SIG k8s infra about a plan for community arm testing resources and staffing maintaining them if that approach is to be taken.

I don't think the community will need or want an additional prow deployment to accomplish testing, so effort to "align" is relatively wasted with respect to being part of community CI in the future. Prow can readily schedule to any kubernetes cluster from the same services cluster.

At this point in time the Kubernetes project, and by extension the kind project, do not have any non-amd64 CI, nor any plans to provide any, given the lack of funding.

This image is maintained for our own upstream testing purposes as documented and is not intended for reuse elsewhere. I do not think the community should pay the cost to maintain and host this for external CI; Antonio and I are stretched extremely thin as-is. It is not a blocker to run downstream kind tests using some other image, and we make no stability guarantees about its reuse.

This image is an internal detail of the kind project, which is fully on amd64 provided by sig k8s infra.

@k8s-ci-robot
Contributor

@ruquanzhao: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-test-infra-prow-image-build-test | ee1e861 | link | true | /test pull-test-infra-prow-image-build-test |
| pull-test-infra-unit-test | ee1e861 | link | true | /test pull-test-infra-unit-test |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@dims
Member

dims commented Apr 14, 2022

/close

@k8s-ci-robot
Contributor

@dims: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
area/images cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
6 participants