diff --git a/index.rst b/index.rst
index 21c1302268..9100612941 100644
--- a/index.rst
+++ b/index.rst
@@ -286,7 +286,7 @@ What's new in PyTorch tutorials?
    :header: Introduction to ONNX Registry
    :card_description: Demonstrate end-to-end how to address unsupported operators by using ONNX Registry.
    :image: _static/img/thumbnails/cropped/Exporting-PyTorch-Models-to-ONNX-Graphs.png
-   :link: advanced/onnx_registry_tutorial.html
+   :link: advanced/onnx_registry_tutorial.html
    :tags: Production,ONNX,Backends
 
 .. Reinforcement Learning
@@ -1043,6 +1043,7 @@ Additional Resources
    intermediate/scaled_dot_product_attention_tutorial
    beginner/knowledge_distillation_tutorial
 
+
 .. toctree::
    :maxdepth: 2
    :includehidden:
diff --git a/recipes_source/recipes_index.rst b/recipes_source/recipes_index.rst
index 10a6ca3fe3..5dc5587445 100644
--- a/recipes_source/recipes_index.rst
+++ b/recipes_source/recipes_index.rst
@@ -324,6 +324,15 @@ Recipes are bite-sized, actionable examples of how to use specific PyTorch featu
    :link: ../recipes/DCP_tutorial.html
    :tags: Distributed-Training
 
+.. TorchServe
+
+.. customcarditem::
+   :header: Deploying a PyTorch Stable Diffusion model as a Vertex AI Endpoint
+   :card_description: Learn how to deploy a PyTorch Stable Diffusion model in Vertex AI with TorchServe.
+   :image: ../_static/img/thumbnails/cropped/generic-pytorch-logo.png
+   :link: ../recipes/torchserve_vertexai_tutorial.html
+   :tags: Production
+
 .. End of tutorial card section
 
 .. raw:: html
diff --git a/recipes_source/torchserve_vertexai_tutorial.rst b/recipes_source/torchserve_vertexai_tutorial.rst
new file mode 100644
index 0000000000..9c748e7b8c
--- /dev/null
+++ b/recipes_source/torchserve_vertexai_tutorial.rst
@@ -0,0 +1,144 @@
+Deploying a PyTorch Stable Diffusion model as a Vertex AI Endpoint
+==================================================================
+
+Deploying large models, like Stable Diffusion, can be challenging and time-consuming.
+
+In this recipe, we will show how you can streamline the deployment of a PyTorch Stable Diffusion
+model by leveraging Vertex AI.
+
+PyTorch is the framework used by Stability AI for Stable
+Diffusion v1.5. Vertex AI is a fully-managed machine learning platform with tools and
+infrastructure designed to help ML practitioners accelerate and scale ML in production with
+the benefit of open-source frameworks like PyTorch.
+
+Deploying your PyTorch Stable Diffusion model (v1.5) on a Vertex AI Endpoint can be done in four steps:
+
+* Create a custom TorchServe handler.
+
+* Upload model artifacts to Google Cloud Storage (GCS).
+
+* Create a Vertex AI model with the model artifacts and a prebuilt PyTorch container image.
+
+* Deploy the Vertex AI model onto an endpoint.
+
+Let's have a look at each step in more detail. You can follow and implement the steps using the
+`Notebook example `__.
+
+NOTE: Keep in mind that this recipe requires billable Vertex AI resources, as explained in more detail in the notebook example.
+
+Create a custom TorchServe handler
+----------------------------------
+
+TorchServe is an easy and flexible tool for serving PyTorch models. The model deployed to Vertex AI
+uses TorchServe to handle requests and return responses from the model.
+You must create a custom TorchServe handler to include in the model artifacts uploaded to Vertex AI.
+Include the handler file in the directory with the other model artifacts, like this: `model_artifacts/handler.py`.
+
+After creating the handler file, you must package the handler as a model archiver (MAR) file.
+The output file must be named `model.mar`, so pass `model` as the model name.
+
+
+.. code:: shell
+
+    !torch-model-archiver \
+    -f \
+    --model-name model \
+    --version 1.0 \
+    --handler model_artifacts/handler.py \
+    --export-path model_artifacts
+
+Upload model artifacts to Google Cloud Storage (GCS)
+----------------------------------------------------
+
+In this step, we upload the
+`model artifacts `__
+to GCS, such as the model file and the handler. The advantage of storing your artifacts on GCS is that you can
+track the artifacts in a central bucket.
+
+
+.. code:: shell
+
+    BUCKET_NAME = "your-bucket-name-unique"  # @param {type:"string"}
+    BUCKET_URI = f"gs://{BUCKET_NAME}/"
+
+    # Copy the model artifacts into the bucket
+    !gsutil cp -r model_artifacts $BUCKET_URI
+
+Create a Vertex AI model with the model artifacts and a prebuilt PyTorch container image
+----------------------------------------------------------------------------------------
+
+Once you've uploaded the model artifacts into a GCS bucket, you can upload your PyTorch model to the
+`Vertex AI Model Registry `__.
+From the Vertex AI Model Registry, you have an overview of your models,
+so you can better organize, track, and train new versions. For this you can use the
+`Vertex AI SDK `__
+and this
+`pre-built PyTorch container `__.
+
+
+.. code:: python
+
+    from google.cloud import aiplatform
+
+    PYTORCH_PREDICTION_IMAGE_URI = (
+        "us-docker.pkg.dev/vertex-ai/prediction/pytorch-gpu.1-12:latest"
+    )
+    MODEL_DISPLAY_NAME = "stable_diffusion_1_5-unique"
+    MODEL_DESCRIPTION = "stable_diffusion_1_5 container"
+
+    aiplatform.init(project="your_project", location="us-central1", staging_bucket=BUCKET_NAME)
+
+    model = aiplatform.Model.upload(
+        display_name=MODEL_DISPLAY_NAME,
+        description=MODEL_DESCRIPTION,
+        serving_container_image_uri=PYTORCH_PREDICTION_IMAGE_URI,
+        artifact_uri=BUCKET_URI,
+    )
+
+Deploy the Vertex AI model onto an endpoint
+-------------------------------------------
+
+Once the model has been uploaded to the Vertex AI Model Registry, you can deploy
+it to a Vertex AI Endpoint. For this you can use the Console or the Vertex AI SDK. In this
+example, you will deploy the model on an NVIDIA Tesla P100 GPU with an n1-standard-8 machine,
+but you can specify a different machine type.
+
+
+.. code:: python
+
+    # Choose a display name for the endpoint
+    ENDPOINT_DISPLAY_NAME = "stable_diffusion_1_5-endpoint"
+
+    endpoint = aiplatform.Endpoint.create(display_name=ENDPOINT_DISPLAY_NAME)
+
+    model.deploy(
+        endpoint=endpoint,
+        deployed_model_display_name=MODEL_DISPLAY_NAME,
+        machine_type="n1-standard-8",
+        accelerator_type="NVIDIA_TESLA_P100",
+        accelerator_count=1,
+        traffic_percentage=100,
+        deploy_request_timeout=1200,
+        sync=True,
+    )
+
+If you follow this
+`notebook `__,
+you can also get online predictions using the Vertex AI SDK, as shown in the following snippet.
+
+
+.. code:: python
+
+    import base64
+    from IPython import display
+
+    instances = [{"prompt": "An examplePup dog with a baseball jersey."}]
+    response = endpoint.predict(instances=instances)
+
+    with open("img.jpg", "wb") as g:
+        g.write(base64.b64decode(response.predictions[0]))
+
+    display.Image("img.jpg")
+
+More resources
+--------------
+
+This tutorial was created using the vendor documentation.
+To refer to the original documentation on the vendor site, please see the
+`torchserve example `__.
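
For reference, the custom handler created in the first step could be sketched roughly as follows. This is a minimal illustration, not the handler shipped with the notebook: the class and method names follow TorchServe's `BaseHandler` convention, the pipeline loading and inference are elided, and the `try`/`except` fallback only exists so the sketch can be read outside a TorchServe environment.

```python
# Sketch of model_artifacts/handler.py (illustrative, assumptions noted above).
import base64

try:
    from ts.torch_handler.base_handler import BaseHandler
except ImportError:
    # Fallback so the sketch is importable outside a TorchServe image.
    class BaseHandler:
        pass


class DiffusionHandler(BaseHandler):
    def initialize(self, context):
        # In TorchServe, this is where the Stable Diffusion pipeline would be
        # loaded from the model artifacts (elided in this sketch).
        self.initialized = True

    def preprocess(self, requests):
        # Extract the "prompt" field sent via endpoint.predict(instances=...).
        return [req.get("body", req).get("prompt", "") for req in requests]

    def postprocess(self, images):
        # Encode each generated image (raw bytes) as base64, matching the
        # base64.b64decode(...) call in the prediction snippet.
        return [base64.b64encode(img).decode("utf-8") for img in images]
```

The base64-encoded response is what allows the prediction snippet to write `response.predictions[0]` straight to an image file after decoding.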