From c469292b0acdef392c83d8b6f6b575826010522c Mon Sep 17 00:00:00 2001
From: agunapal
Date: Wed, 13 Sep 2023 22:34:02 +0000
Subject: [PATCH 1/6] A guide for loading models in TorchServe

---
 docs/README.md        |  1 +
 docs/model_loading.md | 28 ++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+)
 create mode 100644 docs/model_loading.md

diff --git a/docs/README.md b/docs/README.md
index f44e6c56cc..cf47f95f81 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -7,6 +7,7 @@ TorchServe is a performant, flexible and easy to use tool for serving PyTorch ea
 * [Serving Quick Start](https://github.com/pytorch/serve/blob/master/README.md#serve-a-model) - Basic server usage tutorial
 * [Model Archive Quick Start](https://github.com/pytorch/serve/tree/master/model-archiver#creating-a-model-archive) - Tutorial that shows you how to package a model archive file.
 * [Installation](https://github.com/pytorch/serve/blob/master/README.md#install-torchserve) - Installation procedures
+* [Model loading](model_loading.md) - How to load a model in TorchServe
 * [Serving Models](server.md) - Explains how to use TorchServe
 * [REST API](rest_api.md) - Specification on the API endpoint for TorchServe
 * [gRPC API](grpc_api.md) - TorchServe supports gRPC APIs for both inference and management calls

diff --git a/docs/model_loading.md b/docs/model_loading.md
new file mode 100644
index 0000000000..e9b3f39712
--- /dev/null
+++ b/docs/model_loading.md
@@ -0,0 +1,28 @@
+# How to load a model in TorchServe
+
+There are multiple ways to load a model in TorchServe. The below flowchart tries to simplify the process and shows the various options.
+
+```mermaid
+flowchart TD
+    id1[[How to load a model in TorchServe?]] --> id13{Handler has an initialize method?}
+    id13{Handler has an initialize method?} -- No, using BaseHandler initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
+    id3(PyTorch Eager) --> id7(Model File & weights file)
+    id4(TorchScripted) --> id8(TorchScripted weights ending in '.pt')
+    id5(ONNX) --> id9(Weights ending in '.onnx')
+    id6(TensorRT) --> id10(TensorRT weights ending in '.pt')
+    id7(Model File & weights file) & id8(TorchScripted weights ending in '.pt') & id9(Weights ending in '.onnx') & id10(TensorRT weights ending in '.pt') --> id11(Create a model archive .mar file)
+    id13{Handler has an initialize method?} -- yes --> id11(Create a model archive .mar file)
+    id15["Pass the weights with --serialized-file option
+    - Completely packaged for production/reproducibility
+    - Model archiving and model loading can be slow for large models"]
+    id16["Pass the path to the weights in model-config.yaml
+    - Extremely fast to create model archive
+    - You can use defered initialization for large models
+    - Model loading can be faster for large model
+    - Model management can be harder"]
+    id11(Create a model archive .mar file) --> id14{Self-contained package} --Yes--> id15
+    id14{Self-contained package} --No--> id16
+    id15 & id16 --> id17[Start TorchServe with mar file]
+    id15 & id16 --> id18[Start TorchServe] --> id19[Register Model with mar file]
+
+```
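For the --serialized-file branch that this patch introduces, the packaging step is the standard torch-model-archiver call. A minimal sketch for an eager-mode model; the model.py, densenet161.pth, and model_store names are illustrative and not part of this patch:

```bash
# Package an eager-mode model into a self-contained .mar file.
# --model-file carries the model architecture, --serialized-file the weights;
# both are copied into the archive, which keeps it reproducible but makes
# archiving and loading slow for very large models.
torch-model-archiver --model-name densenet161 \
  --version 1.0 \
  --model-file model.py \
  --serialized-file densenet161.pth \
  --handler image_classifier \
  --export-path model_store
```

For the TorchScripted, ONNX, and TensorRT branches, --model-file is dropped and --serialized-file points directly at the '.pt' or '.onnx' weights.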
From 9379ae51c9c9a6928b4b2e1f8cea355400968846 Mon Sep 17 00:00:00 2001
From: agunapal
Date: Wed, 13 Sep 2023 22:36:26 +0000
Subject: [PATCH 2/6] A guide for loading models in TorchServe

---
 docs/model_loading.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/model_loading.md b/docs/model_loading.md
index e9b3f39712..045af44ff4 100644
--- a/docs/model_loading.md
+++ b/docs/model_loading.md
@@ -6,18 +6,18 @@ There are multiple ways to load a model in TorchServe. The below flowchart trie
 flowchart TD
     id1[[How to load a model in TorchServe?]] --> id13{Handler has an initialize method?}
     id13{Handler has an initialize method?} -- No, using BaseHandler initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
-    id3(PyTorch Eager) --> id7(Model File & weights file)
-    id4(TorchScripted) --> id8(TorchScripted weights ending in '.pt')
-    id5(ONNX) --> id9(Weights ending in '.onnx')
-    id6(TensorRT) --> id10(TensorRT weights ending in '.pt')
+    id3(PyTorch Eager) --Required--> id7(Model File & weights file)
+    id4(TorchScripted) --Required--> id8(TorchScripted weights ending in '.pt')
+    id5(ONNX) --Required--> id9(Weights ending in '.onnx')
+    id6(TensorRT) --Required--> id10(TensorRT weights ending in '.pt')
     id7(Model File & weights file) & id8(TorchScripted weights ending in '.pt') & id9(Weights ending in '.onnx') & id10(TensorRT weights ending in '.pt') --> id11(Create a model archive .mar file)
     id13{Handler has an initialize method?} -- yes --> id11(Create a model archive .mar file)
     id15["Pass the weights with --serialized-file option
     - Completely packaged for production/reproducibility
     - Model archiving and model loading can be slow for large models"]
     id16["Pass the path to the weights in model-config.yaml
     - Extremely fast to create model archive
-    - You can use defered initialization for large models
+    - You can use deferred initialization for large models
     - Model loading can be faster for large model
     - Model management can be harder"]
     id11(Create a model archive .mar file) --> id14{Self-contained package} --Yes--> id15

From 074a2530c47801673d062a102207d08d8bee448b Mon Sep 17 00:00:00 2001
From: agunapal
Date: Wed, 13 Sep 2023 23:02:06 +0000
Subject: [PATCH 3/6] Based on feedback

---
 docs/model_loading.md | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/docs/model_loading.md b/docs/model_loading.md
index 045af44ff4..91856cbff7 100644
--- a/docs/model_loading.md
+++ b/docs/model_loading.md
@@ -2,27 +2,34 @@
 
 There are multiple ways to load a model in TorchServe. The below flowchart tries to simplify the process and shows the various options.
 
+`
 ```mermaid
 flowchart TD
     id1[[How to load a model in TorchServe?]] --> id13{Handler has an initialize method?}
-    id13{Handler has an initialize method?} -- No, using BaseHandler initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
+    id13{"- Handler has an initialize method?
+    - Does the initialize method inherit from BaseHandler?"} -- No, using BaseHandler initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
     id3(PyTorch Eager) --Required--> id7(Model File & weights file)
     id4(TorchScripted) --Required--> id8(TorchScripted weights ending in '.pt')
-    id5(ONNX) --Required--> id9(Weights ending in '.onnx')
+    id5(ONNX) --Required --> id9(Weights ending in '.onnx')
     id6(TensorRT) --Required--> id10(TensorRT weights ending in '.pt')
     id7(Model File & weights file) & id8(TorchScripted weights ending in '.pt') & id9(Weights ending in '.onnx') & id10(TensorRT weights ending in '.pt') --> id11(Create a model archive .mar file)
-    id13{Handler has an initialize method?} -- yes --> id11(Create a model archive .mar file)
+    id13{"- Handler has an initialize method?
+    - Does the initialize method inherit from BaseHandler?"} -- yes to both --> id20(Create a custom method to load the model in the handler) --> id11(Create a model archive .mar file)
     id15["Pass the weights with --serialized-file option
     - Completely packaged for production/reproducibility
     - Model archiving and model loading can be slow for large models"]
     id16["Pass the path to the weights in model-config.yaml
     - Extremely fast to create model archive
-    - You can use deferred initialization for large models
-    - Model loading can be faster for large model
+    - You can use defered initialization for large models
+    - Model loading can be faster for large models
     - Model management can be harder"]
-    id11(Create a model archive .mar file) --> id14{Self-contained package} --Yes--> id15
-    id14{Self-contained package} --No--> id16
+    id11(Create a model archive .mar file) --> id14{"Is your model large?
+    Do you care about model archiving and loading time?"} --No--> id15
+    id14{"Is your model large?
+    Do you care about model archiving and loading time?"} --yes to either--> id16
     id15 & id16 --> id17[Start TorchServe with mar file]
     id15 & id16 --> id18[Start TorchServe] --> id19[Register Model with mar file]
+
+
 
 ```
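The model-config.yaml branch above defers weight loading to the handler. A sketch of what that looks like, assuming a handler that reads a model_path key; the key name and paths are illustrative, since the yaml schema is whatever the handler expects rather than something fixed by this patch:

```bash
# Hypothetical model-config.yaml: the handler section is free-form yaml
# that TorchServe hands to the handler at load time.
cat > model-config.yaml <<'EOF'
handler:
    model_path: /home/model-server/weights/model.pt
EOF

# No --serialized-file, so archive creation is fast even for large models;
# the weights stay outside the archive and must be managed separately.
torch-model-archiver --model-name my_model \
  --version 1.0 \
  --handler custom_handler.py \
  --config-file model-config.yaml \
  --export-path model_store
```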
From b666326e7021f033cf6d8c76366bf3c1fc425544 Mon Sep 17 00:00:00 2001
From: agunapal
Date: Wed, 13 Sep 2023 23:12:03 +0000
Subject: [PATCH 4/6] Based on feedback

---
 docs/model_loading.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/model_loading.md b/docs/model_loading.md
index 91856cbff7..1874cfb3a1 100644
--- a/docs/model_loading.md
+++ b/docs/model_loading.md
@@ -14,7 +14,8 @@ flowchart TD
     id6(TensorRT) --Required--> id10(TensorRT weights ending in '.pt')
     id7(Model File & weights file) & id8(TorchScripted weights ending in '.pt') & id9(Weights ending in '.onnx') & id10(TensorRT weights ending in '.pt') --> id11(Create a model archive .mar file)
     id13{"- Handler has an initialize method?
-    - Does the initialize method inherit from BaseHandler?"} -- yes to both --> id20(Create a custom method to load the model in the handler) --> id11(Create a model archive .mar file)
+    - Does the initialize method inherit from BaseHandler?"} -- yes to both --> id20("Create a custom method
+    to load the model in the handler") --> id11(Create a model archive .mar file)
     id15["Pass the weights with --serialized-file option
     - Completely packaged for production/reproducibility
     - Model archiving and model loading can be slow for large models"]

From 3a513f9f6210dfa65672865d5778c8c8c6745987 Mon Sep 17 00:00:00 2001
From: agunapal
Date: Wed, 13 Sep 2023 23:27:29 +0000
Subject: [PATCH 5/6] Based on feedback

---
 docs/model_loading.md | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/docs/model_loading.md b/docs/model_loading.md
index 1874cfb3a1..70941b16ac 100644
--- a/docs/model_loading.md
+++ b/docs/model_loading.md
@@ -2,35 +2,34 @@
 
 There are multiple ways to load a model in TorchServe. The below flowchart tries to simplify the process and shows the various options.
 
-`
+
 ```mermaid
 flowchart TD
     id1[[How to load a model in TorchServe?]] --> id13{Handler has an initialize method?}
-    id13{"- Handler has an initialize method?
-    - Does the initialize method inherit from BaseHandler?"} -- No, using BaseHandler initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
+    id13{Handler has an initialize method?} -- No, using BaseHandler initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
     id3(PyTorch Eager) --Required--> id7(Model File & weights file)
     id4(TorchScripted) --Required--> id8(TorchScripted weights ending in '.pt')
     id5(ONNX) --Required --> id9(Weights ending in '.onnx')
     id6(TensorRT) --Required--> id10(TensorRT weights ending in '.pt')
     id7(Model File & weights file) & id8(TorchScripted weights ending in '.pt') & id9(Weights ending in '.onnx') & id10(TensorRT weights ending in '.pt') --> id11(Create a model archive .mar file)
-    id13{"- Handler has an initialize method?
-    - Does the initialize method inherit from BaseHandler?"} -- yes to both --> id20("Create a custom method
-    to load the model in the handler") --> id11(Create a model archive .mar file)
+    id13{Handler has an initialize method?} --Yes--> id21{"Does the initialize method inherit from BaseHandler?"}
+    id21{"Does the initialize method inherit from BaseHandler?"} -- Yes --> id2{Model Type?}
+    id21{Does the initialize method inherit from BaseHandler?} -- No --> id20("Create a custom method to
+    load the model in the handler") --> id11(Create a model archive .mar file)
     id15["Pass the weights with --serialized-file option
     - Completely packaged for production/reproducibility
     - Model archiving and model loading can be slow for large models"]
     id16["Pass the path to the weights in model-config.yaml
     - Extremely fast to create model archive
     - You can use defered initialization for large models
     - Model loading can be faster for large models
     - Model management can be harder"]
     id11(Create a model archive .mar file) --> id14{"Is your model large?
     Do you care about model archiving and loading time?"} --No--> id15
     id14{"Is your model large?
     Do you care about model archiving and loading time?"} --yes to either--> id16
     id15 & id16 --> id17[Start TorchServe with mar file]
     id15 & id16 --> id18[Start TorchServe] --> id19[Register Model with mar file]
 
 
-
 ```
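The id20 branch ("create a custom method to load the model in the handler") typically means overriding initialize. A minimal sketch of such a handler, assuming the hypothetical model-config.yaml layout from the earlier example and TorchScripted weights:

```python
# custom_handler.py -- a sketch, not part of this patch. The
# handler/model_path keys mirror the illustrative yaml shown earlier.
import torch
from ts.torch_handler.base_handler import BaseHandler


class CustomHandler(BaseHandler):
    def initialize(self, context):
        # model-config.yaml (attached at archive time with --config-file)
        # is surfaced to the handler as context.model_yaml_config.
        model_path = context.model_yaml_config["handler"]["model_path"]
        self.device = torch.device(
            "cuda" if torch.cuda.is_available() else "cpu"
        )
        # Deferred loading: weights are read from disk only when the
        # worker starts, instead of being unpacked from the .mar file.
        self.model = torch.jit.load(model_path, map_location=self.device)
        self.model.eval()
        self.initialized = True
```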
From f9a53e8bc96fb50bd8974efd4ed48347465995f8 Mon Sep 17 00:00:00 2001
From: agunapal
Date: Thu, 14 Sep 2023 21:38:14 +0000
Subject: [PATCH 6/6] Based on feedback

---
 docs/model_loading.md | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/docs/model_loading.md b/docs/model_loading.md
index 70941b16ac..8881f53878 100644
--- a/docs/model_loading.md
+++ b/docs/model_loading.md
@@ -3,33 +3,33 @@
 There are multiple ways to load a model in TorchServe. The below flowchart tries to simplify the process and shows the various options.
 
 
+
 ```mermaid
 flowchart TD
     id1[[How to load a model in TorchServe?]] --> id13{Handler has an initialize method?}
     id13{Handler has an initialize method?} -- No, using BaseHandler initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
     id3(PyTorch Eager) --Required--> id7(Model File & weights file)
     id4(TorchScripted) --Required--> id8(TorchScripted weights ending in '.pt')
     id5(ONNX) --Required --> id9(Weights ending in '.onnx')
     id6(TensorRT) --Required--> id10(TensorRT weights ending in '.pt')
     id7(Model File & weights file) & id8(TorchScripted weights ending in '.pt') & id9(Weights ending in '.onnx') & id10(TensorRT weights ending in '.pt') --> id11(Create a model archive .mar file)
     id13{Handler has an initialize method?} --Yes--> id21{"Does the initialize method inherit from BaseHandler?"}
     id21{"Does the initialize method inherit from BaseHandler?"} -- Yes --> id2{Model Type?}
     id21{Does the initialize method inherit from BaseHandler?} -- No --> id20("Create a custom method to
     load the model in the handler") --> id11(Create a model archive .mar file)
-    id15["Pass the weights with --serialized-file option
-    - Completely packaged for production/reproducibility
-    - Model archiving and model loading can be slow for large models"]
-    id16["Pass the path to the weights in model-config.yaml
-    - Extremely fast to create model archive
-    - You can use defered initialization for large models
-    - Model loading can be faster for large models
-    - Model management can be harder"]
-    id11(Create a model archive .mar file) --> id14{"Is your model large?
-    Do you care about model archiving and loading time?"} --No--> id15
-    id14{"Is your model large?
-    Do you care about model archiving and loading time?"} --yes to either--> id16
-    id15 & id16 --> id17[Start TorchServe with mar file]
-    id15 & id16 --> id18[Start TorchServe] --> id19[Register Model with mar file]
+    id15["Create model archive by passing the
+    weights with --serialized-file option"]
+    id16["Specify path to the weights in model-config.yaml
+    Create model archive by specifying yaml file with --config-file"]
+    id11(Work on creating a model archive .mar file) --> id14{"Is your model large?"} --No--> id22{Do you want a self-contained model artifact?} --Yes--> id15
+    id14{"Is your model large?"} --Yes--> id16
+    id22{Do you want a self-contained model artifact?} --No, I want model archiving & loading to be faster--> id16
+    id15 & id16 --> id17["Start TorchServe.
+    Two ways of starting TorchServe
+    - Pass the mar file with --models
+    - Start TorchServe and call the register API with mar file"]
+
+
 
 
 ```
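The two start-up paths in the final flowchart map to roughly the following commands; the my_model.mar and model_store names are illustrative:

```bash
# Option 1: load the archive when the server starts.
torchserve --start --model-store model_store --models my_model.mar

# Option 2: start an empty server, then register via the management API.
torchserve --start --model-store model_store
curl -X POST "http://localhost:8081/models?url=my_model.mar&initial_workers=1"
```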