Tutorial on using TorchServe on Vertex AI
This new tutorial is based on the Vertex AI documentation.
The PR provides a similar doc as a tutorial hosted locally.

Fixes #2346

Signed-off-by: Sahdev Zala <spzala@us.ibm.com>
spzala committed Nov 14, 2023
1 parent 16e4f2a commit 0ee1636
Showing 2 changed files with 149 additions and 0 deletions.
8 changes: 8 additions & 0 deletions index.rst
@@ -637,6 +637,13 @@ What's new in PyTorch tutorials?
:link: beginner/knowledge_distillation_tutorial.html
:tags: Model-Optimization,Image/Video

.. customcarditem::
:header: Deploying a PyTorch Stable Diffusion model as a Vertex AI Endpoint
:card_description: Learn how to deploy a PyTorch model on Vertex AI with TorchServe
:image: _static/img/thumbnails/cropped/generic-pytorch-logo.png
:link: intermediate/torchserve_vertexai_tutorial.html
:tags: Model-Optimization,Production

.. Parallel-and-Distributed-Training
@@ -1042,6 +1049,7 @@ Additional Resources
intermediate/inductor_debug_cpu
intermediate/scaled_dot_product_attention_tutorial
beginner/knowledge_distillation_tutorial
intermediate/torchserve_vertexai_tutorial

.. toctree::
:maxdepth: 2
141 changes: 141 additions & 0 deletions intermediate_source/torchserve_vertexai_tutorial.rst
@@ -0,0 +1,141 @@
Deploying a PyTorch Stable Diffusion model as a Vertex AI Endpoint
==================================================================

Deploying large models like Stable Diffusion can be challenging and time-consuming.
In this tutorial, we show how you can streamline the deployment of a PyTorch Stable Diffusion
model by leveraging Vertex AI. PyTorch is the framework used by Stability AI for Stable
Diffusion v1.5. Vertex AI is a fully managed machine learning platform with tools and
infrastructure designed to help ML practitioners accelerate and scale ML in production,
with the benefit of open-source frameworks like PyTorch.

Deploying your Stable Diffusion model on a Vertex AI Endpoint can be done in four steps:

* Create a custom TorchServe handler.

* Upload model artifacts to Google Cloud Storage (GCS).

* Create a Vertex AI model with the model artifacts and a prebuilt PyTorch container image.

* Deploy the Vertex AI model onto an endpoint.

Let’s have a look at each step in more detail. You can follow and implement the steps using the
`Notebook example <https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/vertex_endpoints/torchserve/dreambooth_stablediffusion.ipynb>`__.

NOTE: Please keep in mind that this tutorial requires a billable Google Cloud project with Vertex AI enabled, as explained in more detail in the notebook example.
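If you follow the steps outside of the notebook environment, you will need the Vertex AI Python SDK and the TorchServe model archiver installed locally. A minimal setup, assuming a Python environment with ``pip`` available:

```shell
# Install the Vertex AI Python SDK and the TorchServe model archiver
pip install google-cloud-aiplatform torch-model-archiver
```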

Create a custom TorchServe handler
----------------------------------

TorchServe is an easy and flexible tool for serving PyTorch models. The model deployed to Vertex AI
uses TorchServe to handle requests and return responses from the model. You must create a custom
TorchServe handler to include in the model artifacts uploaded to Vertex AI. Include the handler file in the
directory with the other model artifacts, like this: `model_artifacts/handler.py`.
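As an illustration of the handler's shape, here is a minimal sketch of what ``model_artifacts/handler.py`` might contain. The class name and bodies are hypothetical, not the exact handler from the notebook example; a real TorchServe handler usually subclasses ``ts.torch_handler.base_handler.BaseHandler`` and runs the Stable Diffusion pipeline in ``inference``, which this sketch stubs out to stay self-contained:

```python
import base64


class StableDiffusionHandler:
    """Hypothetical handler skeleton following the TorchServe
    custom-handler contract: initialize / preprocess / inference /
    postprocess."""

    def initialize(self, context):
        # Load the model once per worker, e.g. a diffusers pipeline
        # located via context.system_properties in a real handler.
        self.initialized = True

    def preprocess(self, requests):
        # Extract the text prompt from each request body.
        return [req.get("body", req.get("data", {})).get("prompt", "")
                for req in requests]

    def inference(self, prompts):
        # Placeholder: a real handler would run the diffusion
        # pipeline here and return generated image bytes.
        return [b"<image-bytes>" for _ in prompts]

    def postprocess(self, images):
        # Vertex AI prediction responses are JSON, so image bytes are
        # returned as base64-encoded strings.
        return [base64.b64encode(img).decode("utf-8") for img in images]
```

The archiver command in the next step packages this file, together with the other model artifacts, into the ``model.mar`` archive that TorchServe serves.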

After creating the handler file, you must package the handler as a model archiver (MAR) file.
The output file must be named `model.mar`.


.. code:: shell

    !torch-model-archiver \
        -f \
        --model-name <your_model_name> \
        --version 1.0 \
        --handler model_artifacts/handler.py \
        --export-path model_artifacts

Upload model artifacts to Google Cloud Storage (GCS)
----------------------------------------------------

In this step, we upload the
`model artifacts <https://github.com/pytorch/serve/tree/master/model-archiver#artifact-details>`__,
such as the model file and the handler, to GCS. The advantage of storing your artifacts in GCS is that you can
track them in a central bucket.


.. code:: python

    BUCKET_NAME = "your-bucket-name-unique"  # @param {type:"string"}
    BUCKET_URI = f"gs://{BUCKET_NAME}/"

    # Copy the artifacts into the bucket; the ! prefix runs a shell
    # command from the notebook environment.
    !gsutil cp -r model_artifacts $BUCKET_URI

Create a Vertex AI model with the model artifacts and a prebuilt PyTorch container image
----------------------------------------------------------------------------------------

Once you've uploaded the model artifacts into a GCS bucket, you can upload your PyTorch model to
`Vertex AI Model Registry <https://cloud.google.com/vertex-ai/docs/model-registry/introduction>`__.
From the Vertex AI Model Registry, you have an overview of your models
so you can better organize, track, and train new versions. For this you can use the
`Vertex AI SDK <https://cloud.google.com/vertex-ai/docs/python-sdk/use-vertex-ai-python-sdk>`__
and this
`pre-built PyTorch container <https://cloud.google.com/blog/products/ai-machine-learning/prebuilt-containers-with-pytorch-and-vertex-ai>`__.


.. code:: python

    from google.cloud import aiplatform as vertexai

    PYTORCH_PREDICTION_IMAGE_URI = (
        "us-docker.pkg.dev/vertex-ai/prediction/pytorch-gpu.1-12:latest"
    )
    MODEL_DISPLAY_NAME = "stable_diffusion_1_5-unique"
    MODEL_DESCRIPTION = "stable_diffusion_1_5 container"

    vertexai.init(project='your_project', location='us-central1', staging_bucket=BUCKET_NAME)

    model = vertexai.Model.upload(
        display_name=MODEL_DISPLAY_NAME,
        description=MODEL_DESCRIPTION,
        serving_container_image_uri=PYTORCH_PREDICTION_IMAGE_URI,
        artifact_uri=BUCKET_URI,
    )

Deploy the Vertex AI model onto an endpoint
-------------------------------------------

Once the model has been uploaded to the Vertex AI Model Registry, you can deploy
it to a Vertex AI Endpoint. For this you can use the Console or the Vertex AI SDK. In this
example, you will deploy the model on an NVIDIA Tesla P100 GPU and an n1-standard-8 machine. You can
also specify a different machine type.


.. code:: python

    # ENDPOINT_DISPLAY_NAME is defined earlier in the notebook example.
    endpoint = vertexai.Endpoint.create(display_name=ENDPOINT_DISPLAY_NAME)

    model.deploy(
        endpoint=endpoint,
        deployed_model_display_name=MODEL_DISPLAY_NAME,
        machine_type="n1-standard-8",
        accelerator_type="NVIDIA_TESLA_P100",
        accelerator_count=1,
        traffic_percentage=100,
        deploy_request_timeout=1200,
        sync=True,
    )

If you follow the
`notebook <https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/vertex_endpoints/torchserve/dreambooth_stablediffusion.ipynb>`__
you can also get online predictions using the Vertex AI SDK as shown in the following snippet.


.. code:: python

    import base64
    from IPython import display

    instances = [{"prompt": "An examplePup dog with a baseball jersey."}]
    response = endpoint.predict(instances=instances)

    with open("img.jpg", "wb") as g:
        g.write(base64.b64decode(response.predictions[0]))
    display.Image("img.jpg")

More resources
--------------

This tutorial was created using the vendor documentation. To refer to the original documentation on the vendor site, please see the
`TorchServe example <https://cloud.google.com/blog/products/ai-machine-learning/get-your-genai-model-going-in-four-easy-steps>`__.
