
Update kubectl version used in kubectl launcher #1899

Closed
robert-ulbrich-mercedes-benz opened this issue Sep 4, 2023 · 2 comments · Fixed by #1909

Comments

@robert-ulbrich-mercedes-benz
Contributor

When an MPI job starts, a launcher pod and multiple worker pods are created. The launcher then connects to the workers using kubectl exec. The kubectl version used is quite old: it looks like the kubectl CLI at version 1.15 is being used.

If sidecar containers are used, there is a high chance that the launcher connects to the sidecar container instead of the actual worker container. This can be mitigated by setting the annotation kubectl.kubernetes.io/default-container on the worker pods, but that annotation is only interpreted correctly by kubectl version 1.21 or newer.
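For illustration, the annotation on a worker pod would look roughly like this (the pod, container, and image names here are hypothetical examples, not taken from the mpi-operator):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mpi-worker-0        # hypothetical worker pod name
  annotations:
    # Tells kubectl which container `kubectl exec` should target by default.
    # Only honored by kubectl >= 1.21; older clients ignore it and may pick
    # the sidecar instead.
    kubectl.kubernetes.io/default-container: mpi-worker
spec:
  containers:
  - name: mpi-worker        # the actual MPI worker container
    image: example/mpi-worker:latest      # hypothetical image
  - name: sidecar           # e.g. a service-mesh proxy
    image: example/sidecar:latest         # hypothetical image
```

With the annotation in place, `kubectl exec mpi-worker-0 -- <cmd>` targets the `mpi-worker` container without needing an explicit `-c` flag.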

So what is needed is an updated kubectl in the MPI launcher pod. kubectl appears to be installed by an init container named "kubectl-delivery", which uses the image mpioperator/kubectl-delivery:latest. That image should be rebuilt with a newer version of kubectl, so that MPI worker pods with multiple containers are handled correctly.
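A rebuild of the delivery image could look roughly like the sketch below. This is an assumption-laden illustration, not the actual Dockerfile from the mpi-operator repo: the base image, the pinned version, and the install path are all placeholders, though dl.k8s.io is the official distribution point for kubectl release binaries.

```dockerfile
# Hypothetical sketch of an updated kubectl-delivery image; the real
# Dockerfile lives in the mpi-operator repo and may differ.
FROM alpine:3.18
ARG KUBECTL_VERSION=v1.28.0
# Download the official kubectl release binary and make it executable.
RUN wget -q "https://dl.k8s.io/release/${KUBECTL_VERSION}/bin/linux/amd64/kubectl" \
      -O /bin/kubectl \
 && chmod +x /bin/kubectl
```

Pinning the version via a build argument would make it straightforward to bump kubectl in the build pipeline without editing the Dockerfile itself.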

@tenzen-y
Member

tenzen-y commented Sep 4, 2023

Yes, we need to copy the Dockerfile from the mpi-operator repo into this repo and then construct the build pipeline.

ref: #1857

Feel free to open a PR, thanks.

/help
/good-first-issue

@google-oss-prow

@tenzen-y:
This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-good-first-issue command.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
