Updated examples/pytorch to disable istio sidecar injection #2004

Merged: 2 commits, Feb 23, 2024
11 changes: 10 additions & 1 deletion examples/pytorch/README.md
@@ -1,5 +1,5 @@
## Installation & deployment tips
-1. You need to configure your node to utilize GPU. In order this can be done the following way:
+1. You need to configure your node to utilize GPU. This can be done the following way:
* Install [nvidia-docker2](https://github.com/NVIDIA/nvidia-docker)
* Connect to your master node and set nvidia as the default runtime in `/etc/docker/daemon.json`:
```
@@ -28,3 +28,12 @@
3. Building the image. Each example has prebuilt images stored on Google Container Registry (GCR). If you want to create your own image, we recommend using Docker Hub. Each example has its own Dockerfile, which we strongly advise using. To build your custom image, follow the instructions on [TechRepublic](https://www.techrepublic.com/article/how-to-create-a-docker-image-and-push-it-to-docker-hub/).

4. To deploy your job, we recommend following the official [Kubeflow documentation](https://www.kubeflow.org/docs/guides/components/pytorch/). Each example includes example YAML files for two API versions. Feel free to modify them, e.g. the image or the number of GPUs.

**Note**: By default, a PyTorch job doesn't work in a user namespace because of Istio [automatic sidecar injection](https://istio.io/v1.3/docs/setup/additional-setup/sidecar-injection/#automatic-sidecar-injection). To get it running, disable injection with the annotation `sidecar.istio.io/inject: "false"`, either for the PyTorch pods or for the namespace. For example:

```yaml
template:
  metadata:
    annotations:
      sidecar.istio.io/inject: "false"
```
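
Adding the annotation by hand to every replica template is easy to miss, so one option is to patch the manifest programmatically before submitting it. The following is a minimal sketch, assuming the job manifest has already been loaded into a Python dictionary and that its replica templates live under `spec.pytorchReplicaSpecs`. The helper name `disable_istio_injection` and the job name are hypothetical and not part of the examples.

```python
import copy

ISTIO_INJECT_ANNOTATION = "sidecar.istio.io/inject"


def disable_istio_injection(job: dict) -> dict:
    """Return a copy of a PyTorchJob manifest with Istio sidecar
    injection disabled on the pod template of every replica type."""
    patched = copy.deepcopy(job)  # leave the caller's manifest untouched
    # Assumption: replica templates sit under spec.pytorchReplicaSpecs.
    replica_specs = patched.get("spec", {}).get("pytorchReplicaSpecs", {})
    for replica in replica_specs.values():
        template = replica.setdefault("template", {})
        metadata = template.setdefault("metadata", {})
        annotations = metadata.setdefault("annotations", {})
        # The value is the string "false", matching the YAML note above.
        annotations[ISTIO_INJECT_ANNOTATION] = "false"
    return patched


# Hypothetical, trimmed-down manifest used only for illustration.
job = {
    "apiVersion": "kubeflow.org/v1",
    "kind": "PyTorchJob",
    "metadata": {"name": "example-job"},
    "spec": {
        "pytorchReplicaSpecs": {
            "Master": {
                "replicas": 1,
                "restartPolicy": "OnFailure",
                "template": {"spec": {"containers": [{"name": "pytorch"}]}},
            },
            "Worker": {
                "replicas": 1,
                "restartPolicy": "OnFailure",
                "template": {"spec": {"containers": [{"name": "pytorch"}]}},
            },
        }
    },
}

patched = disable_istio_injection(job)
```

The helper copies the manifest first, so the original dictionary can still be submitted unchanged to a namespace where injection is not enabled.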
2 changes: 0 additions & 2 deletions examples/pytorch/cpu-demo/demo.yaml
@@ -13,7 +13,6 @@ spec:
 containers:
   - name: pytorch
     image: pytorch-cpu:py3.8
-    imagePullPolicy: Always
     command:
       - "torchrun"
       - "demo.py"
@@ -25,7 +24,6 @@ spec:
 containers:
   - name: pytorch
     image: pytorch-cpu:py3.8
-    imagePullPolicy: Always
     command:
       - "torchrun"
       - "demo.py"
6 changes: 0 additions & 6 deletions examples/pytorch/mnist/v1/pytorch_job_mnist_gloo.yaml
@@ -8,9 +8,6 @@ spec:
 replicas: 1
 restartPolicy: OnFailure
 template:
-  metadata:
-    annotations:
-      sidecar.istio.io/inject: "false"
   spec:
     containers:
       - name: pytorch
@@ -24,9 +21,6 @@ spec:
 replicas: 1
 restartPolicy: OnFailure
 template:
-  metadata:
-    annotations:
-      sidecar.istio.io/inject: "false"
   spec:
     containers:
       - name: pytorch
6 changes: 0 additions & 6 deletions examples/pytorch/mnist/v1/pytorch_job_mnist_mpi.yaml
@@ -8,9 +8,6 @@ spec:
 replicas: 1
 restartPolicy: OnFailure
 template:
-  metadata:
-    annotations:
-      sidecar.istio.io/inject: "false"
   spec:
     containers:
       - name: pytorch
@@ -24,9 +21,6 @@ spec:
 replicas: 1
 restartPolicy: OnFailure
 template:
-  metadata:
-    annotations:
-      sidecar.istio.io/inject: "false"
   spec:
     containers:
       - name: pytorch
6 changes: 0 additions & 6 deletions examples/pytorch/mnist/v1/pytorch_job_mnist_nccl.yaml
@@ -8,9 +8,6 @@ spec:
 replicas: 1
 restartPolicy: OnFailure
 template:
-  metadata:
-    annotations:
-      sidecar.istio.io/inject: "false"
   spec:
     containers:
       - name: pytorch
@@ -23,9 +20,6 @@ spec:
 replicas: 1
 restartPolicy: OnFailure
 template:
-  metadata:
-    annotations:
-      sidecar.istio.io/inject: "false"
   spec:
     containers:
       - name: pytorch