Running a Stable Diffusion model using Microsoft DeepSpeed-MII in TorchServe

This document describes how to serve the Hugging Face Stable Diffusion model with Microsoft DeepSpeed-MII in TorchServe. DeepSpeed-MII applies system optimizations to DL model inference that are intended to reduce both latency and cost.

Model Paper

Step 1: Download model

Log in to the Hugging Face Hub with a token by running the command below:

huggingface-cli login

Paste the token generated from the Hugging Face Hub when prompted, then download the model:

python Download_deepseed_mii_models.py --model_path downloaded_model --model_name CompVis/stable-diffusion-v1-4 --revision main

The script prints the path where the model was downloaded, for example:

downloaded_model/models--bert-base-uncased/snapshots/5546055f03398095e385d7dc625e636cc8910bf2/
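The printed path follows the Hugging Face cache layout, `models--<org>--<name>/snapshots/<commit hash>/`. If the script output is no longer at hand, the snapshot folder can be found again from the download directory. The sketch below is a minimal illustration that uses only the standard library; the helper names `cache_folder_name` and `find_snapshots` are hypothetical and are not part of this example's scripts.

```python
from pathlib import Path
from typing import List


def cache_folder_name(repo_id: str) -> str:
    """Map a Hub repo id to its cache folder name ('org/name' -> 'models--org--name')."""
    return "models--" + repo_id.replace("/", "--")


def find_snapshots(model_path: str, repo_id: str) -> List[Path]:
    """Return the snapshot directories under model_path for repo_id, sorted by name."""
    snapshots_dir = Path(model_path) / cache_folder_name(repo_id) / "snapshots"
    if not snapshots_dir.is_dir():
        return []  # nothing downloaded for this repo id
    return sorted(p for p in snapshots_dir.iterdir() if p.is_dir())


if __name__ == "__main__":
    # Assumed values, taken from the download command in Step 1.
    for snapshot in find_snapshots("downloaded_model", "CompVis/stable-diffusion-v1-4"):
        print(snapshot)
```

Each entry under `snapshots` is named after a commit hash, so one download usually yields a single directory.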

Run Stable Diffusion model with DeepSpeed-MII

python deepspeed_mii_stable_diffusion.py --model_path downloaded_model/models--bert-base-uncased/snapshots/5546055f03398095e385d7dc625e636cc8910bf2/ --prompt "a dog chasing a cat"

Step 2: Compress downloaded model

NOTE: Install the zip CLI tool.

Navigate to the path printed by the download script. In this example it is:

cd downloaded_model/models--bert-base-uncased/snapshots/5546055f03398095e385d7dc625e636cc8910bf2/
zip -r /serve/examples/deepspeed_mii/model.zip *
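Where the zip CLI tool cannot be installed, the same archive can be built with Python's standard `zipfile` module. This is a rough equivalent rather than an exact replacement: the function name `zip_directory` is hypothetical, and unlike the shell glob `*` it also includes hidden files at the top level. Symlinked files are stored by content, which matters because snapshot folders typically consist of symlinks.

```python
import os
import zipfile


def zip_directory(src_dir: str, zip_path: str) -> int:
    """Write every file under src_dir into zip_path and return the number of files added."""
    count = 0
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # followlinks=True so that symlinked directories are traversed as well.
        for root, _dirs, files in os.walk(src_dir, followlinks=True):
            for name in files:
                full = os.path.join(root, name)
                if os.path.abspath(full) == zip_abs:
                    continue  # never add the archive to itself
                # Store paths relative to src_dir, as `zip -r ... *` does.
                archive.write(full, arcname=os.path.relpath(full, src_dir))
                count += 1
    return count


if __name__ == "__main__":
    # Assumed paths, matching the commands above.
    snapshot = "downloaded_model/models--bert-base-uncased/snapshots/5546055f03398095e385d7dc625e636cc8910bf2/"
    print(zip_directory(snapshot, "/serve/examples/deepspeed_mii/model.zip"))
```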

Step 3: Generate MAR file

Navigate back up to the deepspeed_mii directory.

torch-model-archiver --model-name stable-diffusion --version 1.0 --handler DeepSpeed_mii_handler.py --extra-files model.zip -r requirements.txt

DeepSpeed-MII supports two kinds of deployment by default: AzureML and local. The model optimized by DeepSpeed-MII is served through an AzureML endpoint on Azure and through a gRPC endpoint for local deployment. For TorchServe, the internal gRPC server is bypassed and the optimized model is loaded directly in the handler.

NOTE: Refer to the deepspeed_mii_stable_diffusion.py file for an example of using DeepSpeed-MII without the gRPC server.
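Because model.zip is packaged through `--extra-files`, it ends up next to the handler in the model directory, and a handler has to unpack it before any weights can be loaded. The sketch below shows only that unpacking step, under stated assumptions: the function name `extract_model_archive` and the target folder name `model` are hypothetical, and this is not necessarily how DeepSpeed_mii_handler.py does it.

```python
import os
import zipfile


def extract_model_archive(model_dir: str, archive_name: str = "model.zip") -> str:
    """Unpack <model_dir>/<archive_name> into <model_dir>/model and return that path."""
    archive_path = os.path.join(model_dir, archive_name)
    target_dir = os.path.join(model_dir, "model")  # hypothetical folder name
    with zipfile.ZipFile(archive_path, "r") as archive:
        archive.extractall(target_dir)
    return target_dir


# Inside a TorchServe handler's initialize(context), model_dir is normally
# read from context.system_properties.get("model_dir").
```

The returned directory can then be passed to the model-loading code in place of the snapshot path used in Step 1.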

Huggingface Stable Diffusion

Step 4: Start torchserve

Update config.properties and start TorchServe.

Increase max_response_size so that the image response fits within the limit.

Refer: https://github.com/pytorch/serve/blob/master/docs/configuration.md#other-properties
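Before starting the server, it can help to confirm that config.properties really sets a large enough limit. The sketch below is a small, hypothetical checker and not part of TorchServe: `read_properties` and `response_limit_ok` are illustrative names, and the expected response size is an assumed figure that should be replaced with one that matches your images.

```python
def read_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def response_limit_ok(props: dict, expected_bytes: int) -> bool:
    """Return True when max_response_size is set and is at least expected_bytes."""
    limit = props.get("max_response_size")
    return limit is not None and int(limit) >= expected_bytes


if __name__ == "__main__":
    with open("config.properties") as handle:
        settings = read_properties(handle.read())
    # 10 MB is an assumed upper bound for one response, not a measured value.
    print(response_limit_ok(settings, 10 * 1024 * 1024))
```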

torchserve --start --ts-config config.properties --disable-token-auth  --enable-model-api

Step 5: Run inference

python query.py --url "http://localhost:8080/predictions/stable-diffusion" --prompt "a photo of an astronaut riding a horse on mars"
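For readers who want to call the endpoint from their own code, the request can be reproduced with the standard library alone. This is a minimal sketch under two assumptions that should be checked against query.py: the prompt is sent as the raw request body, and the caller decodes the response itself. The function names `build_request` and `post_prompt` are hypothetical.

```python
import urllib.request


def build_request(url: str, prompt: str) -> urllib.request.Request:
    """Build a POST request that carries the prompt as the UTF-8 request body."""
    return urllib.request.Request(url, data=prompt.encode("utf-8"), method="POST")


def post_prompt(url: str, prompt: str, timeout: float = 120.0) -> bytes:
    """Send the prompt to the predictions endpoint and return the raw response body."""
    with urllib.request.urlopen(build_request(url, prompt), timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    body = post_prompt(
        "http://localhost:8080/predictions/stable-diffusion",
        "a photo of an astronaut riding a horse on mars",
    )
    print(len(body), "bytes received")
```

How the body is turned into an image depends on what the handler returns, so that step is left to query.py.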

The generated image is written to a file with a name like output-20221027213010.jpg.
