A guide for loading models in TorchServe #2592

Merged · 6 commits · Sep 15, 2023
docs/README.md (1 change: 1 addition & 0 deletions)
@@ -7,6 +7,7 @@ TorchServe is a performant, flexible and easy to use tool for serving PyTorch ea
* [Serving Quick Start](https://github.com/pytorch/serve/blob/master/README.md#serve-a-model) - Basic server usage tutorial
* [Model Archive Quick Start](https://github.com/pytorch/serve/tree/master/model-archiver#creating-a-model-archive) - Tutorial that shows you how to package a model archive file.
* [Installation](https://github.com/pytorch/serve/blob/master/README.md#install-torchserve) - Installation procedures
* [Model loading](model_loading.md) - How to load a model in TorchServe
* [Serving Models](server.md) - Explains how to use TorchServe
* [REST API](rest_api.md) - Specification on the API endpoint for TorchServe
* [gRPC API](grpc_api.md) - TorchServe supports gRPC APIs for both inference and management calls
docs/model_loading.md (28 changes: 28 additions & 0 deletions)
@@ -0,0 +1,28 @@
# How to load a model in TorchServe

There are multiple ways to load a model in TorchServe. The flowchart below simplifies the decision process and shows the available options.

```mermaid
flowchart TD
id1[[How to load a model in TorchServe?]] --> id13{Handler has an initialize method?}
id13{Handler has an initialize method?} -- No, BaseHandler's initialize method is used --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
id3(PyTorch Eager) -- Required --> id7(Model File & weights file)
id4(TorchScripted) -- Required --> id8(TorchScripted weights ending in '.pt')
id5(ONNX) -- Required --> id9(Weights ending in '.onnx')
id6(TensorRT) -- Required --> id10(TensorRT weights ending in '.pt')
id7(Model File & weights file) & id8(TorchScripted weights ending in '.pt') & id9(Weights ending in '.onnx') & id10(TensorRT weights ending in '.pt') --> id11(Create a model archive .mar file)
id13{Handler has an initialize method?} -- Yes --> id11(Create a model archive .mar file)
id15["Pass the weights with --serialized-file option
- Completely packaged for production/reproducibility
- Model archiving and model loading can be slow for large models"]
id16["Pass the path to the weights in model-config.yaml
- Extremely fast to create model archive
- You can use deferred initialization for large models
- Model loading can be faster for large model
- Model management can be harder"]
id11(Create a model archive .mar file) --> id14{Self-contained package?} -- Yes --> id15
id14{Self-contained package?} -- No --> id16
id15 & id16 --> id17[Start TorchServe with the .mar file]
id15 & id16 --> id18[Start TorchServe] --> id19[Register the model with the .mar file]

```
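
To make the upper branch of the flowchart concrete, here is a minimal sketch of a custom handler that implements its own `initialize` method and reads the weights location from model-config.yaml (the non-self-contained path). The `weights_path` key is a hypothetical name chosen for this sketch, not a field TorchServe defines, and the config file is assumed to have been packaged with `torch-model-archiver --config-file model-config.yaml`:

```python
import os

import torch
from ts.torch_handler.base_handler import BaseHandler


class CustomHandler(BaseHandler):
    """Sketch of a handler that defers weight loading to initialize()."""

    def initialize(self, context):
        # model_dir is the directory TorchServe extracted the .mar file into
        model_dir = context.system_properties.get("model_dir")

        # model-config.yaml is exposed on the context when the archive was
        # built with --config-file; "weights_path" is a hypothetical key.
        yaml_config = getattr(context, "model_yaml_config", None) or {}
        weights_path = yaml_config.get(
            "weights_path", os.path.join(model_dir, "model.pt")
        )

        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.model = torch.jit.load(weights_path, map_location=self.device)
        self.model.eval()
        self.initialized = True
```

Once the .mar file is in the model store, the last two boxes of the flowchart correspond to either passing it at startup (`torchserve --start --model-store model_store --models my_model.mar`) or registering it at runtime through the management API. A sketch of runtime registration, assuming a running TorchServe instance with the default management port 8081 and a hypothetical `my_model.mar` in the model store:

```python
import requests

# Register the archive and ask TorchServe to spin up one worker for it.
response = requests.post(
    "http://localhost:8081/models",
    params={"url": "my_model.mar", "initial_workers": 1},
)
print(response.status_code, response.json())
```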