From c469292b0acdef392c83d8b6f6b575826010522c Mon Sep 17 00:00:00 2001
From: agunapal
Date: Wed, 13 Sep 2023 22:34:02 +0000
Subject: [PATCH 1/6] A guide for loading models in TorchServe

---
 docs/README.md        |  1 +
 docs/model_loading.md | 28 ++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+)
 create mode 100644 docs/model_loading.md

diff --git a/docs/README.md b/docs/README.md
index f44e6c56cc..cf47f95f81 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -7,6 +7,7 @@ TorchServe is a performant, flexible and easy to use tool for serving PyTorch ea
 * [Serving Quick Start](https://github.com/pytorch/serve/blob/master/README.md#serve-a-model) - Basic server usage tutorial
 * [Model Archive Quick Start](https://github.com/pytorch/serve/tree/master/model-archiver#creating-a-model-archive) - Tutorial that shows you how to package a model archive file.
 * [Installation](https://github.com/pytorch/serve/blob/master/README.md#install-torchserve) - Installation procedures
+* [Model loading](model_loading.md) - How to load a model in TorchServe
 * [Serving Models](server.md) - Explains how to use TorchServe
 * [REST API](rest_api.md) - Specification on the API endpoint for TorchServe
 * [gRPC API](grpc_api.md) - TorchServe supports gRPC APIs for both inference and management calls

diff --git a/docs/model_loading.md b/docs/model_loading.md
new file mode 100644
index 0000000000..e9b3f39712
--- /dev/null
+++ b/docs/model_loading.md
@@ -0,0 +1,28 @@
+# How to load a model in TorchServe
+
+There are multiple ways to load a model in TorchServe. The below flowchart tries to simplify the process and shows the various options.
+
+```mermaid
+flowchart TD
+    id1[[How to load a model in TorchServe?]] --> id13{Handler has an initialize method?}
+    id13{Handler has an initialize method?} -- No, using BaseHandler initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
+    id3(PyTorch Eager) --> id7(Model File & weights file)
+    id4(TorchScripted) --> id8(TorchScripted weights ending in '.pt')
+    id5(ONNX) --> id9(Weights ending in '.onnx')
+    id6(TensorRT) --> id10(TensorRT weights ending in '.pt')
+    id7(Model File & weights file) & id8(TorchScripted weights ending in '.pt') & id9(Weights ending in '.onnx') & id10(TensorRT weights ending in '.pt') --> id11(Create a model archive .mar file)
+    id13{Handler has an initialize method?} -- yes --> id11(Create a model archive .mar file)
+    id15["Pass the weights with --serialized-file option
+    - Completely packaged for production/reproducibility
+    - Model archiving and model loading can be slow for large models"]
+    id16["Pass the path to the weights in model-config.yaml
+    - Extremely fast to create model archive
+    - You can use defered initialization for large models
+    - Model loading can be faster for large model
+    - Model management can be harder"]
+    id11(Create a model archive .mar file) --> id14{Self-contained package} --Yes--> id15
+    id14{Self-contained package} --No--> id16
+    id15 & id16 --> id17[Start TorchServe with mar file]
+    id15 & id16 --> id18[Start TorchServe] --> id19[Register Model with mar file]
+
+```
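For the --serialized-file branch that this patch introduces, the packaging step is the standard torch-model-archiver call. A minimal sketch for an eager-mode model; the model.py, densenet161.pth, and model_store names are illustrative and not part of this patch:

```bash
# Package an eager-mode model into a self-contained .mar file.
# --model-file carries the model architecture, --serialized-file the weights;
# both are copied into the archive, which keeps it reproducible but makes
# archiving and loading slow for very large models.
torch-model-archiver --model-name densenet161 \
  --version 1.0 \
  --model-file model.py \
  --serialized-file densenet161.pth \
  --handler image_classifier \
  --export-path model_store
```

For the TorchScripted, ONNX, and TensorRT branches, --model-file is dropped and --serialized-file points directly at the '.pt' or '.onnx' weights.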
From 9379ae51c9c9a6928b4b2e1f8cea355400968846 Mon Sep 17 00:00:00 2001
From: agunapal
Date: Wed, 13 Sep 2023 22:36:26 +0000
Subject: [PATCH 2/6] A guide for loading models in TorchServe

---
 docs/model_loading.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/model_loading.md b/docs/model_loading.md
index e9b3f39712..045af44ff4 100644
--- a/docs/model_loading.md
+++ b/docs/model_loading.md
@@ -6,18 +6,18 @@ There are multiple ways to load a model in TorchServe. The below flowchart trie
 flowchart TD
     id1[[How to load a model in TorchServe?]] --> id13{Handler has an initialize method?}
     id13{Handler has an initialize method?} -- No, using BaseHandler initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
-    id3(PyTorch Eager) --> id7(Model File & weights file)
-    id4(TorchScripted) --> id8(TorchScripted weights ending in '.pt')
-    id5(ONNX) --> id9(Weights ending in '.onnx')
-    id6(TensorRT) --> id10(TensorRT weights ending in '.pt')
+    id3(PyTorch Eager) --Required--> id7(Model File & weights file)
+    id4(TorchScripted) --Required--> id8(TorchScripted weights ending in '.pt')
+    id5(ONNX) --Required--> id9(Weights ending in '.onnx')
+    id6(TensorRT) --Required--> id10(TensorRT weights ending in '.pt')
     id7(Model File & weights file) & id8(TorchScripted weights ending in '.pt') & id9(Weights ending in '.onnx') & id10(TensorRT weights ending in '.pt') --> id11(Create a model archive .mar file)
     id13{Handler has an initialize method?} -- yes --> id11(Create a model archive .mar file)
     id15["Pass the weights with --serialized-file option
     - Completely packaged for production/reproducibility
     - Model archiving and model loading can be slow for large models"]
     id16["Pass the path to the weights in model-config.yaml
     - Extremely fast to create model archive
-    - You can use defered initialization for large models
+    - You can use deferred initialization for large models
     - Model loading can be faster for large model
     - Model management can be harder"]
     id11(Create a model archive .mar file) --> id14{Self-contained package} --Yes--> id15

From 074a2530c47801673d062a102207d08d8bee448b Mon Sep 17 00:00:00 2001
From: agunapal
Date: Wed, 13 Sep 2023 23:02:06 +0000
Subject: [PATCH 3/6] Based on feedback

---
 docs/model_loading.md | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/docs/model_loading.md b/docs/model_loading.md
index 045af44ff4..91856cbff7 100644
--- a/docs/model_loading.md
+++ b/docs/model_loading.md
@@ -2,27 +2,34 @@
 
 There are multiple ways to load a model in TorchServe. The below flowchart tries to simplify the process and shows the various options.
 
+`
 ```mermaid
 flowchart TD
     id1[[How to load a model in TorchServe?]] --> id13{Handler has an initialize method?}
-    id13{Handler has an initialize method?} -- No, using BaseHandler initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
+    id13{"- Handler has an initialize method?
+    - Does the initialize method inherit from BaseHandler?"} -- No, using BaseHandler initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
     id3(PyTorch Eager) --Required--> id7(Model File & weights file)
     id4(TorchScripted) --Required--> id8(TorchScripted weights ending in '.pt')
-    id5(ONNX) --Required--> id9(Weights ending in '.onnx')
+    id5(ONNX) --Required --> id9(Weights ending in '.onnx')
     id6(TensorRT) --Required--> id10(TensorRT weights ending in '.pt')
     id7(Model File & weights file) & id8(TorchScripted weights ending in '.pt') & id9(Weights ending in '.onnx') & id10(TensorRT weights ending in '.pt') --> id11(Create a model archive .mar file)
-    id13{Handler has an initialize method?} -- yes --> id11(Create a model archive .mar file)
+    id13{"- Handler has an initialize method?
+    - Does the initialize method inherit from BaseHandler?"} -- yes to both --> id20(Create a custom method to load the model in the handler) --> id11(Create a model archive .mar file)
     id15["Pass the weights with --serialized-file option
     - Completely packaged for production/reproducibility
     - Model archiving and model loading can be slow for large models"]
     id16["Pass the path to the weights in model-config.yaml
     - Extremely fast to create model archive
-    - You can use deferred initialization for large models
-    - Model loading can be faster for large model
+    - You can use defered initialization for large models
+    - Model loading can be faster for large models
     - Model management can be harder"]
-    id11(Create a model archive .mar file) --> id14{Self-contained package} --Yes--> id15
-    id14{Self-contained package} --No--> id16
+    id11(Create a model archive .mar file) --> id14{"Is your model large?
+    Do you care about model archiving and loading time?"} --No--> id15
+    id14{"Is your model large?
+    Do you care about model archiving and loading time?"} --yes to either--> id16
     id15 & id16 --> id17[Start TorchServe with mar file]
     id15 & id16 --> id18[Start TorchServe] --> id19[Register Model with mar file]
+
+
 
 ```
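The model-config.yaml branch above defers weight loading to the handler. A sketch of what that looks like, assuming a handler that reads a model_path key; the key name and paths are illustrative, since the yaml schema is whatever the handler expects rather than something fixed by this patch:

```bash
# Hypothetical model-config.yaml: the handler section is free-form yaml
# that TorchServe hands to the handler at load time.
cat > model-config.yaml <<'EOF'
handler:
    model_path: /home/model-server/weights/model.pt
EOF

# No --serialized-file, so archive creation is fast even for large models;
# the weights stay outside the archive and must be managed separately.
torch-model-archiver --model-name my_model \
  --version 1.0 \
  --handler custom_handler.py \
  --config-file model-config.yaml \
  --export-path model_store
```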
From b666326e7021f033cf6d8c76366bf3c1fc425544 Mon Sep 17 00:00:00 2001
From: agunapal
Date: Wed, 13 Sep 2023 23:12:03 +0000
Subject: [PATCH 4/6] Based on feedback

---
 docs/model_loading.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/model_loading.md b/docs/model_loading.md
index 91856cbff7..1874cfb3a1 100644
--- a/docs/model_loading.md
+++ b/docs/model_loading.md
@@ -14,7 +14,8 @@ flowchart TD
     id6(TensorRT) --Required--> id10(TensorRT weights ending in '.pt')
     id7(Model File & weights file) & id8(TorchScripted weights ending in '.pt') & id9(Weights ending in '.onnx') & id10(TensorRT weights ending in '.pt') --> id11(Create a model archive .mar file)
     id13{"- Handler has an initialize method?
-    - Does the initialize method inherit from BaseHandler?"} -- yes to both --> id20(Create a custom method to load the model in the handler) --> id11(Create a model archive .mar file)
+    - Does the initialize method inherit from BaseHandler?"} -- yes to both --> id20("Create a custom method
+    to load the model in the handler") --> id11(Create a model archive .mar file)
     id15["Pass the weights with --serialized-file option
     - Completely packaged for production/reproducibility
     - Model archiving and model loading can be slow for large models"]

From 3a513f9f6210dfa65672865d5778c8c8c6745987 Mon Sep 17 00:00:00 2001
From: agunapal
Date: Wed, 13 Sep 2023 23:27:29 +0000
Subject: [PATCH 5/6] Based on feedback

---
 docs/model_loading.md | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/docs/model_loading.md b/docs/model_loading.md
index 1874cfb3a1..70941b16ac 100644
--- a/docs/model_loading.md
+++ b/docs/model_loading.md
@@ -2,35 +2,34 @@
 
 There are multiple ways to load a model in TorchServe. The below flowchart tries to simplify the process and shows the various options.
 
-`
+
 ```mermaid
 flowchart TD
     id1[[How to load a model in TorchServe?]] --> id13{Handler has an initialize method?}
-    id13{"- Handler has an initialize method?
-    - Does the initialize method inherit from BaseHandler?"} -- No, using BaseHandler initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
+    id13{Handler has an initialize method?} -- No, using BaseHandler initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
     id3(PyTorch Eager) --Required--> id7(Model File & weights file)
     id4(TorchScripted) --Required--> id8(TorchScripted weights ending in '.pt')
     id5(ONNX) --Required --> id9(Weights ending in '.onnx')
     id6(TensorRT) --Required--> id10(TensorRT weights ending in '.pt')
     id7(Model File & weights file) & id8(TorchScripted weights ending in '.pt') & id9(Weights ending in '.onnx') & id10(TensorRT weights ending in '.pt') --> id11(Create a model archive .mar file)
-    id13{"- Handler has an initialize method?
-    - Does the initialize method inherit from BaseHandler?"} -- yes to both --> id20("Create a custom method
-    to load the model in the handler") --> id11(Create a model archive .mar file)
+    id13{Handler has an initialize method?} --Yes--> id21{"Does the initialize method inherit from BaseHandler?"}
+    id21{"Does the initialize method inherit from BaseHandler?"} -- Yes --> id2{Model Type?}
+    id21{Does the initialize method inherit from BaseHandler?} -- No --> id20("Create a custom method to
+    load the model in the handler") --> id11(Create a model archive .mar file)
     id15["Pass the weights with --serialized-file option
     - Completely packaged for production/reproducibility
     - Model archiving and model loading can be slow for large models"]
     id16["Pass the path to the weights in model-config.yaml
     - Extremely fast to create model archive
     - You can use defered initialization for large models
     - Model loading can be faster for large models
     - Model management can be harder"]
     id11(Create a model archive .mar file) --> id14{"Is your model large?
     Do you care about model archiving and loading time?"} --No--> id15
     id14{"Is your model large?
     Do you care about model archiving and loading time?"} --yes to either--> id16
     id15 & id16 --> id17[Start TorchServe with mar file]
     id15 & id16 --> id18[Start TorchServe] --> id19[Register Model with mar file]
 
 
-
 ```
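The id20 branch ("create a custom method to load the model in the handler") typically means overriding initialize. A minimal sketch of such a handler, assuming the hypothetical model-config.yaml layout from the earlier example and TorchScripted weights:

```python
# custom_handler.py -- a sketch, not part of this patch. The
# handler/model_path keys mirror the illustrative yaml shown earlier.
import torch
from ts.torch_handler.base_handler import BaseHandler


class CustomHandler(BaseHandler):
    def initialize(self, context):
        # model-config.yaml (attached at archive time with --config-file)
        # is surfaced to the handler as context.model_yaml_config.
        model_path = context.model_yaml_config["handler"]["model_path"]
        self.device = torch.device(
            "cuda" if torch.cuda.is_available() else "cpu"
        )
        # Deferred loading: weights are read from disk only when the
        # worker starts, instead of being unpacked from the .mar file.
        self.model = torch.jit.load(model_path, map_location=self.device)
        self.model.eval()
        self.initialized = True
```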
From f9a53e8bc96fb50bd8974efd4ed48347465995f8 Mon Sep 17 00:00:00 2001
From: agunapal
Date: Thu, 14 Sep 2023 21:38:14 +0000
Subject: [PATCH 6/6] Based on feedback

---
 docs/model_loading.md | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/docs/model_loading.md b/docs/model_loading.md
index 70941b16ac..8881f53878 100644
--- a/docs/model_loading.md
+++ b/docs/model_loading.md
@@ -3,33 +3,33 @@
 There are multiple ways to load a model in TorchServe. The below flowchart tries to simplify the process and shows the various options.
 
 
+
 ```mermaid
 flowchart TD
     id1[[How to load a model in TorchServe?]] --> id13{Handler has an initialize method?}
     id13{Handler has an initialize method?} -- No, using BaseHandler initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
     id3(PyTorch Eager) --Required--> id7(Model File & weights file)
     id4(TorchScripted) --Required--> id8(TorchScripted weights ending in '.pt')
     id5(ONNX) --Required --> id9(Weights ending in '.onnx')
     id6(TensorRT) --Required--> id10(TensorRT weights ending in '.pt')
     id7(Model File & weights file) & id8(TorchScripted weights ending in '.pt') & id9(Weights ending in '.onnx') & id10(TensorRT weights ending in '.pt') --> id11(Create a model archive .mar file)
     id13{Handler has an initialize method?} --Yes--> id21{"Does the initialize method inherit from BaseHandler?"}
     id21{"Does the initialize method inherit from BaseHandler?"} -- Yes --> id2{Model Type?}
     id21{Does the initialize method inherit from BaseHandler?} -- No --> id20("Create a custom method to
     load the model in the handler") --> id11(Create a model archive .mar file)
-    id15["Pass the weights with --serialized-file option
-    - Completely packaged for production/reproducibility
-    - Model archiving and model loading can be slow for large models"]
-    id16["Pass the path to the weights in model-config.yaml
-    - Extremely fast to create model archive
-    - You can use defered initialization for large models
-    - Model loading can be faster for large models
-    - Model management can be harder"]
-    id11(Create a model archive .mar file) --> id14{"Is your model large?
-    Do you care about model archiving and loading time?"} --No--> id15
-    id14{"Is your model large?
-    Do you care about model archiving and loading time?"} --yes to either--> id16
-    id15 & id16 --> id17[Start TorchServe with mar file]
-    id15 & id16 --> id18[Start TorchServe] --> id19[Register Model with mar file]
+    id15["Create model archive by passing the
+    weights with --serialized-file option"]
+    id16["Specify path to the weights in model-config.yaml
+    Create model archive by specifying yaml file with --config-file"]
+    id11(Work on creating a model archive .mar file) --> id14{"Is your model large?"} --No--> id22{Do you want a self-contained model artifact?} --Yes--> id15
+    id14{"Is your model large?"} --Yes--> id16
+    id22{Do you want a self-contained model artifact?} --No, I want model archiving & loading to be faster--> id16
+    id15 & id16 --> id17["Start TorchServe.
+    Two ways of starting TorchServe
+    - Pass the mar file with --models
+    - Start TorchServe and call the register API with mar file"]
+
+
 
 
 ```
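The two start-up paths in the final flowchart map to roughly the following commands; the my_model.mar and model_store names are illustrative:

```bash
# Option 1: load the archive when the server starts.
torchserve --start --model-store model_store --models my_model.mar

# Option 2: start an empty server, then register via the management API.
torchserve --start --model-store model_store
curl -X POST "http://localhost:8081/models?url=my_model.mar&initial_workers=1"
```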