Added k8s mnist example using minikube #2323

agunapal · 2023-05-04T20:39:09Z

Description

Added an example of TorchServe inference on Kubernetes using minikube.

Fixes #(issue)

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
New feature (non-breaking change which adds functionality)
This change requires a documentation update

Feature/Issue validation/testing

minikube.txt

Logs included in the README

(torchserve) ubuntu@ip-172-31-60-100:~/serve$ torch-model-archiver --model-name mnist --version 1.0 --model-file examples/image_classifier/mnist/mnist.py --serialized-file examples/image_classifier/mnist/mnist_cnn.pt --handler  examples/image_classifier/mnist/mnist_handler.py
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ mv mnist.mar model_store/

(torchserve) ubuntu@ip-172-31-60-100:~/serve$ minikube start --mount-string="$HOME/serve:/host" --mount
😄  minikube v1.30.1 on Ubuntu 18.04
✨  Using the docker driver based on existing profile
👍  Starting control plane node minikube in cluster minikube
🚜  Pulling base image ...
🎉  minikube 1.31.1 is available! Download it: https://github.com/kubernetes/minikube/releases/tag/v1.31.1
💡  To disable this notice, run: 'minikube config set WantUpdateNotification false'

🤷  docker "minikube" container is missing, will recreate.



🔥  Creating docker container (CPUs=2, Memory=15900MB) ...

🧯  Docker is nearly out of disk space, which may cause deployments to fail! (94% of capacity). You can pass '--force' to skip this check.
💡  Suggestion: 

    Try one or more of the following to free up space on the device:
    
    1. Run "docker system prune" to remove unused Docker data (optionally with "-a")
    2. Increase the storage allocated to Docker for Desktop by clicking on:
    Docker icon > Preferences > Resources > Disk Image Size
    3. Run "minikube ssh -- docker system prune" if using the Docker container runtime
🍿  Related issue: https://github.com/kubernetes/minikube/issues/9024



🐳  Preparing Kubernetes v1.26.3 on Docker 23.0.2 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring bridge CNI (Container Networking Interface) ...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🔎  Verifying Kubernetes components...
🌟  Enabled addons: storage-provisioner, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ 


(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl apply -f kubernetes/examples/mnist/deployment.yaml
deployment.apps/ts-def created
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl get pods
No resources found in default namespace.
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl get pods
NAME                      READY   STATUS              RESTARTS   AGE
ts-def-5c95fdfd57-rc25d   0/1     ContainerCreating   0          8s
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl get pods
NAME                      READY   STATUS              RESTARTS   AGE
ts-def-5c95fdfd57-rc25d   0/1     ContainerCreating   0          29s
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
ts-def-5c95fdfd57-rc25d   1/1     Running   0          69s
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl apply -f kubernetes/examples/mnist/service.yaml
service/ts-def created
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl get svc
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                         AGE
kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP                         101s
ts-def       NodePort    10.98.186.21   <none>        8080:30279/TCP,8081:31843/TCP   7s
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl port-forward svc/ts-def 8080:8080 8081:8081 &
[1] 41401
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Forwarding from 127.0.0.1:8081 -> 8081
Forwarding from [::1]:8081 -> 8081

(torchserve) ubuntu@ip-172-31-60-100:~/serve$ 
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ curl -X POST "localhost:8081/models?model_name=mnist&url=mnist.mar&initial_workers=4"
Handling connection for 8081
{
  "status": "Model \"mnist\" Version: 1.0 registered with 4 initial workers"
}
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ curl http://127.0.0.1:8080/predictions/mnist -T examples/image_classifier/mnist/test_data/0.png
Handling connection for 8080
0
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ minikube stop
✋  Stopping node "minikube"  ...
🛑  Powering off "minikube" via SSH ...
error: lost connection to pod

🛑  1 node stopped.
[1]+  Exit 1                  kubectl port-forward svc/ts-def 8080:8080 8081:8081
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ minikube delete
🔥  Deleting "minikube" in docker ...
🔥  Deleting container "minikube" ...
🔥  Removing /home/ubuntu/.minikube/machines/minikube ...
💀  Removed all traces of the "minikube" cluster.
(torchserve) ubuntu@ip-172-31-60-100:~/serve$
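The log above polls `kubectl get pods` by hand until the pod reports `1/1 Running`, then registers the model through the management API on port 8081. A minimal Python sketch of those two checks follows. The helper names `pods_ready` and `register_url` are hypothetical (they are not part of TorchServe or this PR), and the parsing assumes the default `kubectl get pods` table layout shown in the log.

```python
import urllib.parse


def pods_ready(kubectl_output: str) -> bool:
    """Return True when every pod row is Running with all containers ready.

    Assumes the default `kubectl get pods` columns: NAME READY STATUS ...
    """
    lines = [ln for ln in kubectl_output.strip().splitlines() if ln.strip()]
    # Without a header row, kubectl printed "No resources found ..." or nothing.
    if not lines or lines[0].split()[0] != "NAME":
        return False
    rows = [ln.split() for ln in lines[1:]]
    if not rows:
        return False
    for _name, ready, status, *_rest in rows:
        have, want = ready.split("/")
        if status != "Running" or have != want:
            return False
    return True


def register_url(management: str, model_name: str, mar_file: str, workers: int) -> str:
    """Build the registration URL used in the curl call from the log."""
    query = urllib.parse.urlencode(
        {"model_name": model_name, "url": mar_file, "initial_workers": workers}
    )
    return f"http://{management}/models?{query}"
```

One possible use is to call `pods_ready(subprocess.run(["kubectl", "get", "pods"], capture_output=True, text=True).stdout)` in a loop with a short sleep, and only then send a POST request to `register_url("localhost:8081", "mnist", "mnist.mar", 4)`.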

Checklist:

Did you have fun?
Have you added tests that prove your fix is effective or that this feature works?
Has code been commented, particularly in hard-to-understand areas?
Have you made corresponding changes to the documentation?

codecov · 2023-05-04T21:03:17Z

Codecov Report

Merging #2323 (261d99b) into master (61f1c41) will not change coverage.
The diff coverage is n/a.

❗ Current head 261d99b differs from pull request most recent head 234a651. Consider uploading reports for the commit 234a651 to get more accurate results

@@           Coverage Diff           @@
##           master    #2323   +/-   ##
=======================================
  Coverage   72.66%   72.66%           
=======================================
  Files          78       78           
  Lines        3669     3669           
  Branches       58       58           
=======================================
  Hits         2666     2666           
  Misses        999      999           
  Partials        4        4


msaroufim · 2023-05-05T16:21:43Z

kubernetes/examples/mnist/deployment.yaml

+            path: /host/model_store
+      containers:
+        - name: torchserve
+          image: pytorch/torchserve:latest-cpu


We also ship a KServe image; would that be more appropriate to use here? See the Dockerfile at https://github.com/pytorch/serve/tree/master/kubernetes

chauhang · 2023-07-19T05:20:40Z

@agunapal Please resolve the merge conflicts so the PR can be merged.

agunapal · 2023-07-21T18:33:10Z

@chauhang Done.

msaroufim · 2023-07-21T21:43:29Z

LGTM, can we get another stamp?

agunapal · 2023-07-24T21:38:56Z

@chauhang Please find the logs included in the PR

chauhang · 2023-07-29T02:16:11Z

kubernetes/examples/mnist/deployment.yaml

+            path: /host/model_store
+      containers:
+        - name: torchserve
+          image: pytorch/torchserve:latest-cpu


It would be good to also include a GPU version. Can we add that as a follow-on PR?

chauhang

@agunapal Thanks for this example. It would be good to also do a follow-up PR for GPU configuration, which requires additional settings for the CUDA setup.
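As a rough illustration of what such a follow-up might change, here is a Python sketch that derives a GPU variant of the container entry from `deployment.yaml`. It rests on assumptions this PR does not confirm: that the `pytorch/torchserve:latest-gpu` image tag is the one to use, and that the cluster exposes GPUs through the NVIDIA device plugin's `nvidia.com/gpu` resource. The function name `gpu_variant` is hypothetical, and the additional CUDA settings mentioned above are not covered.

```python
import copy


def gpu_variant(container: dict, gpu_count: int = 1) -> dict:
    """Return a copy of a container spec that requests GPUs.

    Assumptions (not taken from this PR): the `latest-gpu` image tag and the
    `nvidia.com/gpu` resource name exposed by the NVIDIA device plugin.
    """
    gpu = copy.deepcopy(container)  # leave the CPU spec untouched
    gpu["image"] = "pytorch/torchserve:latest-gpu"
    limits = gpu.setdefault("resources", {}).setdefault("limits", {})
    limits["nvidia.com/gpu"] = gpu_count
    return gpu


# Container fields as they appear in the reviewed diff.
cpu_container = {"name": "torchserve", "image": "pytorch/torchserve:latest-cpu"}
gpu_container = gpu_variant(cpu_container)
```

The copy keeps the CPU example intact, so both variants could live side by side in the examples directory.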

Added k8s mnist example using minikube

22576e6

agunapal requested review from jagadeeshi2i, msaroufim and chauhang May 4, 2023 20:39

agunapal mentioned this pull request May 4, 2023

Torchserve gives errors while running docker image on k8s but not when running image locally #2300

Open

spellcheck addition

6325abf

msaroufim reviewed May 5, 2023

msaroufim approved these changes May 5, 2023

Merge branch 'master' into feature/k8s_example

0642535

msaroufim requested review from lxning, namannandan and HamidShojanazeri July 21, 2023 21:43

agunapal and others added 6 commits July 21, 2023 15:24

Merge branch 'master' into feature/k8s_example

89deae6

Merge branch 'master' into feature/k8s_example

682f70f

remove bash

43190fa

remove bash

a2bf251

formatting

f62679f

Merge branch 'master' into feature/k8s_example

eb3884a

agunapal added 2 commits July 24, 2023 14:46

Merge branch 'master' into feature/k8s_example

1688510

Merge branch 'master' into feature/k8s_example

234a651

chauhang reviewed Jul 29, 2023

chauhang approved these changes Jul 29, 2023

msaroufim merged commit e2cd91b into master Jul 29, 2023
