Added k8s mnist example using minikube #2323

Merged: msaroufim merged 11 commits into master on Jul 29, 2023

agunapal
Collaborator

@agunapal agunapal commented May 4, 2023

Description

Adds an example of TorchServe inference on Kubernetes using minikube.

Fixes #(issue)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Feature/Issue validation/testing

minikube.txt

Logs included in the README

(torchserve) ubuntu@ip-172-31-60-100:~/serve$ torch-model-archiver --model-name mnist --version 1.0 --model-file examples/image_classifier/mnist/mnist.py --serialized-file examples/image_classifier/mnist/mnist_cnn.pt --handler  examples/image_classifier/mnist/mnist_handler.py
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ mv mnist.mar model_store/

(torchserve) ubuntu@ip-172-31-60-100:~/serve$ minikube start --mount-string="$HOME/serve:/host" --mount
😄  minikube v1.30.1 on Ubuntu 18.04
✨  Using the docker driver based on existing profile
👍  Starting control plane node minikube in cluster minikube
🚜  Pulling base image ...
🎉  minikube 1.31.1 is available! Download it: https://github.com/kubernetes/minikube/releases/tag/v1.31.1
💡  To disable this notice, run: 'minikube config set WantUpdateNotification false'

🤷  docker "minikube" container is missing, will recreate.



🔥  Creating docker container (CPUs=2, Memory=15900MB) ...

🧯  Docker is nearly out of disk space, which may cause deployments to fail! (94% of capacity). You can pass '--force' to skip this check.
💡  Suggestion: 

    Try one or more of the following to free up space on the device:
    
    1. Run "docker system prune" to remove unused Docker data (optionally with "-a")
    2. Increase the storage allocated to Docker for Desktop by clicking on:
    Docker icon > Preferences > Resources > Disk Image Size
    3. Run "minikube ssh -- docker system prune" if using the Docker container runtime
🍿  Related issue: https://github.com/kubernetes/minikube/issues/9024


🧯  Docker is nearly out of disk space, which may cause deployments to fail! (94% of capacity). You can pass '--force' to skip this check.
💡  Suggestion: 

    Try one or more of the following to free up space on the device:
    
    1. Run "docker system prune" to remove unused Docker data (optionally with "-a")
    2. Increase the storage allocated to Docker for Desktop by clicking on:
    Docker icon > Preferences > Resources > Disk Image Size
    3. Run "minikube ssh -- docker system prune" if using the Docker container runtime
🍿  Related issue: https://github.com/kubernetes/minikube/issues/9024

🐳  Preparing Kubernetes v1.26.3 on Docker 23.0.2 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring bridge CNI (Container Networking Interface) ...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🔎  Verifying Kubernetes components...
🌟  Enabled addons: storage-provisioner, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ 


(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl apply -f kubernetes/examples/mnist/deployment.yaml
deployment.apps/ts-def created
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl get pods
No resources found in default namespace.
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl get pods
NAME                      READY   STATUS              RESTARTS   AGE
ts-def-5c95fdfd57-rc25d   0/1     ContainerCreating   0          8s
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl get pods
NAME                      READY   STATUS              RESTARTS   AGE
ts-def-5c95fdfd57-rc25d   0/1     ContainerCreating   0          29s
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
ts-def-5c95fdfd57-rc25d   1/1     Running   0          69s
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl apply -f kubernetes/examples/mnist/service.yaml
service/ts-def created
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl get svc
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                         AGE
kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP                         101s
ts-def       NodePort    10.98.186.21   <none>        8080:30279/TCP,8081:31843/TCP   7s
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ kubectl port-forward svc/ts-def 8080:8080 8081:8081 &
[1] 41401
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Forwarding from 127.0.0.1:8081 -> 8081
Forwarding from [::1]:8081 -> 8081

(torchserve) ubuntu@ip-172-31-60-100:~/serve$ 
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ curl -X POST "localhost:8081/models?model_name=mnist&url=mnist.mar&initial_workers=4"
Handling connection for 8081
{
  "status": "Model \"mnist\" Version: 1.0 registered with 4 initial workers"
}
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ curl http://127.0.0.1:8080/predictions/mnist -T examples/image_classifier/mnist/test_data/0.png
Handling connection for 8080
0
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ 
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ minikube stop
✋  Stopping node "minikube"  ...
🛑  Powering off "minikube" via SSH ...
error: lost connection to pod

🛑  1 node stopped.
[1]+  Exit 1                  kubectl port-forward svc/ts-def 8080:8080 8081:8081
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ minikube delete
🔥  Deleting "minikube" in docker ...
🔥  Deleting container "minikube" ...
🔥  Removing /home/ubuntu/.minikube/machines/minikube ...
💀  Removed all traces of the "minikube" cluster.
(torchserve) ubuntu@ip-172-31-60-100:~/serve$ 
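
For anyone who wants to script the session above rather than type it by hand, here is a minimal sketch, not part of this PR. It wraps the same `kubectl` and `curl` calls from the log in small functions. The function names and the `MGMT_URL`, `INFER_URL`, and `PF_PID` variables are hypothetical names chosen for illustration; the deployment/service name `ts-def` and ports 8080/8081 are taken from the log. It swaps the repeated `kubectl get pods` polling for `kubectl rollout status`, and it stops the port-forward explicitly so that `minikube stop` does not kill it with "lost connection to pod".

```shell
# Hypothetical helper functions; only the kubectl/curl invocations come from the log.
MGMT_URL="${MGMT_URL:-http://localhost:8081}"    # TorchServe management API (port-forwarded)
INFER_URL="${INFER_URL:-http://localhost:8080}"  # TorchServe inference API (port-forwarded)

# Block until the Deployment has rolled out (replaces manual `kubectl get pods` polling).
# $1 = deployment name, $2 = optional timeout (default 120s, an arbitrary choice).
wait_for_deployment() {
  kubectl rollout status "deployment/$1" --timeout="${2:-120s}"
}

# Forward the inference and management ports in the background and remember the PID.
start_port_forward() {
  kubectl port-forward "svc/$1" 8080:8080 8081:8081 >/dev/null &
  PF_PID=$!
}

# Stop the background port-forward, if one is running.
stop_port_forward() {
  if [ -n "${PF_PID:-}" ]; then
    kill "$PF_PID" 2>/dev/null || true
    wait "$PF_PID" 2>/dev/null || true
    PF_PID=""
  fi
}

# Build the model registration URL: $1 = model name, $2 = .mar file, $3 = initial workers.
register_url() {
  printf '%s/models?model_name=%s&url=%s&initial_workers=%s' \
    "$MGMT_URL" "$1" "$2" "${3:-1}"
}

# Register a model through the management API.
register_model() {
  curl -sS -X POST "$(register_url "$@")"
}

# Send a file to the inference API: $1 = model name, $2 = input file.
predict() {
  curl -sS "$INFER_URL/predictions/$1" -T "$2"
}

# Example usage, mirroring the session above:
#   kubectl apply -f kubernetes/examples/mnist/deployment.yaml
#   wait_for_deployment ts-def
#   kubectl apply -f kubernetes/examples/mnist/service.yaml
#   start_port_forward ts-def
#   register_model mnist mnist.mar 4
#   predict mnist examples/image_classifier/mnist/test_data/0.png
#   stop_port_forward
#   minikube stop
```

The port-forward may need a moment before it accepts connections, so a short pause or retry before `register_model` may be necessary.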

Checklist:

  • Did you have fun?
  • Have you added tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

@codecov

codecov bot commented May 4, 2023

Codecov Report

Merging #2323 (261d99b) into master (61f1c41) will not change coverage.
The diff coverage is n/a.

❗ Current head 261d99b differs from pull request most recent head 234a651. Consider uploading reports for the commit 234a651 to get more accurate results

@@           Coverage Diff           @@
##           master    #2323   +/-   ##
=======================================
  Coverage   72.66%   72.66%           
=======================================
  Files          78       78           
  Lines        3669     3669           
  Branches       58       58           
=======================================
  Hits         2666     2666           
  Misses        999      999           
  Partials        4        4           

path: /host/model_store
containers:
- name: torchserve
  image: pytorch/torchserve:latest-cpu
Member

We also ship a KServe image; would that be more appropriate to use here? See the Dockerfile at https://github.com/pytorch/serve/tree/master/kubernetes

@chauhang
Contributor

@agunapal Please resolve the conflicts for merging the PR

@agunapal
Collaborator Author

@chauhang Done.

@msaroufim
Member

LGTM, can we get another stamp?

@agunapal
Collaborator Author

@chauhang Please find the logs included in the PR

path: /host/model_store
containers:
- name: torchserve
  image: pytorch/torchserve:latest-cpu
Contributor

It would be good to also include a GPU version. Can we add it in a follow-on PR?

Contributor

@chauhang chauhang left a comment

@agunapal Thanks for this example. It would be good to also do a follow-up PR for the GPU configuration, which requires additional settings for the CUDA setup.

@msaroufim msaroufim merged commit e2cd91b into master Jul 29, 2023