Torch Tensor RT example #2483

agunapal · 2023-07-19T21:26:06Z

Description

This PR:

Enables Torch TensorRT to work without a custom handler
Adds an example showing how to use Torch TensorRT with TorchServe
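
A minimal sketch of how the first item could work, assuming the handler uses an optional import guard: importing `torch_tensorrt` registers the TensorRT custom classes with TorchScript, so a later `torch.jit.load` can resolve `__torch__.torch.classes.tensorrt.Engine`. This is an illustration of the idea, not the code merged in this PR; the helper name `try_enable_optional_runtime` and the log wording are hypothetical.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def try_enable_optional_runtime(module_name: str) -> bool:
    """Import an optional module purely for its import-time side effects.

    Hypothetical helper: returns True if the module imported, False if it
    is not installed. Only ImportError is handled here; a broken install
    that raises something else would still propagate.
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        logger.warning("%s not enabled", module_name)
        return False
    logger.info("%s enabled", module_name)
    return True


# Run once at handler import time, before any TorchScript model is loaded.
TRT_ENABLED = try_enable_optional_runtime("torch_tensorrt")
```

With a guard like this in a shared base handler, the built-in handlers can load a Torch TensorRT compiled TorchScript file when the package is installed, and behave as before when it is not.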

Fixes #(issue)

Type of change

Bug fix (non-breaking change which fixes an issue)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
New feature (non-breaking change which adds functionality)
This change requires a documentation update

Feature/Issue validation/testing

Before the fix

2023-07-19T21:15:42,182 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
2023-07-19T21:15:42,234 [INFO ] main org.pytorch.serve.metrics.configuration.MetricConfiguration - Successfully loaded metrics configuration from /home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/configs/metrics.yaml
2023-07-19T21:15:42,309 [INFO ] main org.pytorch.serve.ModelServer - 
Torchserve version: 0.8.1
TS Home: /home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages
Current directory: /home/ubuntu/serve
Temp directory: /tmp
Metrics config path: /home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/configs/metrics.yaml
Number of GPUs: 1
Number of CPUs: 8
Max heap size: 7936 M
Python executable: /home/ubuntu/anaconda3/envs/torchserve/bin/python
Config file: N/A
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8081
Metrics address: http://127.0.0.1:8082
Model Store: /home/ubuntu/serve/model_store
Initial Models: res50-trt-fp16=res50-trt-fp16.mar
Log dir: /home/ubuntu/serve/logs
Metrics dir: /home/ubuntu/serve/logs
Netty threads: 0
Netty client threads: 0
Default workers per model: 1
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: false
Enable metrics API: true
Metrics mode: log
Disable system metrics: false
Workflow Store: /home/ubuntu/serve/model_store
Model config: N/A
2023-07-19T21:15:42,314 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager -  Loading snapshot serializer plugin...
2023-07-19T21:15:42,328 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: res50-trt-fp16.mar
2023-07-19T21:15:43,023 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model res50-trt-fp16
2023-07-19T21:15:43,023 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model res50-trt-fp16
2023-07-19T21:15:43,024 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model res50-trt-fp16 loaded.
2023-07-19T21:15:43,024 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: res50-trt-fp16, count: 1
2023-07-19T21:15:43,029 [DEBUG] W-9000-res50-trt-fp16_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/ubuntu/anaconda3/envs/torchserve/bin/python, /home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /tmp/.ts.sock.9000, --metrics-config, /home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/configs/metrics.yaml]
2023-07-19T21:15:43,030 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2023-07-19T21:15:43,083 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://127.0.0.1:8080
2023-07-19T21:15:43,084 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2023-07-19T21:15:43,085 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://127.0.0.1:8081
2023-07-19T21:15:43,085 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2023-07-19T21:15:43,085 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082
Model server started.
2023-07-19T21:15:43,254 [WARN ] pool-3-thread-1 org.pytorch.serve.metrics.MetricCollector - worker pid is not available yet.
2023-07-19T21:15:43,909 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:ip-172-31-7-107,timestamp:1689801343
2023-07-19T21:15:43,910 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:65.70399475097656|#Level:Host|#hostname:ip-172-31-7-107,timestamp:1689801343
2023-07-19T21:15:43,910 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:127.93020629882812|#Level:Host|#hostname:ip-172-31-7-107,timestamp:1689801343
2023-07-19T21:15:43,910 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:66.1|#Level:Host|#hostname:ip-172-31-7-107,timestamp:1689801343
2023-07-19T21:15:43,910 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,DeviceId:0|#hostname:ip-172-31-7-107,timestamp:1689801343
2023-07-19T21:15:43,911 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0.0|#Level:Host,DeviceId:0|#hostname:ip-172-31-7-107,timestamp:1689801343
2023-07-19T21:15:43,911 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0.0|#Level:Host,DeviceId:0|#hostname:ip-172-31-7-107,timestamp:1689801343
2023-07-19T21:15:43,911 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:29600.74609375|#Level:Host|#hostname:ip-172-31-7-107,timestamp:1689801343
2023-07-19T21:15:43,912 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:1702.80078125|#Level:Host|#hostname:ip-172-31-7-107,timestamp:1689801343
2023-07-19T21:15:43,912 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:6.7|#Level:Host|#hostname:ip-172-31-7-107,timestamp:1689801343
2023-07-19T21:15:44,246 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - s_name_part0=/tmp/.ts.sock, s_name_part1=9000, pid=37899
2023-07-19T21:15:44,247 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - Listening on port: /tmp/.ts.sock.9000
2023-07-19T21:15:44,255 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - Successfully loaded /home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/configs/metrics.yaml.
2023-07-19T21:15:44,256 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - [PID]37899
2023-07-19T21:15:44,256 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - Torch worker started.
2023-07-19T21:15:44,256 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - Python runtime: 3.10.0
2023-07-19T21:15:44,256 [DEBUG] W-9000-res50-trt-fp16_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-res50-trt-fp16_1.0 State change null -> WORKER_STARTED
2023-07-19T21:15:44,259 [INFO ] W-9000-res50-trt-fp16_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /tmp/.ts.sock.9000
2023-07-19T21:15:44,265 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - Connection accepted: /tmp/.ts.sock.9000.
2023-07-19T21:15:44,266 [INFO ] W-9000-res50-trt-fp16_1.0 org.pytorch.serve.wlm.WorkerThread - Flushing req.cmd LOAD to backend at: 1689801344266
2023-07-19T21:15:44,290 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - model_name: res50-trt-fp16, batchSize: 1
2023-07-19T21:15:44,893 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - Enabled tensor cores
2023-07-19T21:15:44,907 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - ONNX enabled
2023-07-19T21:15:44,941 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - Failed to load model res50-trt-fp16, exception 
2023-07-19T21:15:44,942 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - Unknown type name '__torch__.torch.classes.tensorrt.Engine':
2023-07-19T21:15:44,942 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -   File "code/__torch__/torchvision/models/resnet.py", line 4
2023-07-19T21:15:44,942 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -   __parameters__ = []
2023-07-19T21:15:44,942 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -   __buffers__ = []
2023-07-19T21:15:44,943 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -   __torch___torchvision_models_resnet_ResNet_trt_engine_ : __torch__.torch.classes.tensorrt.Engine
2023-07-19T21:15:44,943 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -                                                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
2023-07-19T21:15:44,943 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -   def forward(self_1: __torch__.torchvision.models.resnet.ResNet_trt,
2023-07-19T21:15:44,943 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -     input_0: Tensor) -> Tensor:
2023-07-19T21:15:44,943 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2023-07-19T21:15:44,944 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -   File "/home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/model_service_worker.py", line 131, in load_model
2023-07-19T21:15:44,944 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -     service = model_loader.load(
2023-07-19T21:15:44,944 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -   File "/home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/model_loader.py", line 135, in load
2023-07-19T21:15:44,944 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -     initialize_fn(service.context)
2023-07-19T21:15:44,944 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -   File "/home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/torch_handler/vision_handler.py", line 23, in initialize
2023-07-19T21:15:44,945 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -     super().initialize(context)
2023-07-19T21:15:44,945 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -   File "/home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/torch_handler/base_handler.py", line 172, in initialize
2023-07-19T21:15:44,945 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -     self.model = self._load_torchscript_model(self.model_pt_path)
2023-07-19T21:15:44,945 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -   File "/home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/torch_handler/base_handler.py", line 226, in _load_torchscript_model
2023-07-19T21:15:44,945 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -     return torch.jit.load(model_pt_path, map_location=self.device)
2023-07-19T21:15:44,946 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -   File "/home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/torch/jit/_serialization.py", line 162, in load
2023-07-19T21:15:44,946 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -     cpp_module = torch._C.import_ir_module(cu, str(f), map_location, _extra_files, _restore_shapes)  # type: ignore[call-arg]
2023-07-19T21:15:44,946 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - RuntimeError: 
2023-07-19T21:15:44,946 [DEBUG] W-9000-res50-trt-fp16_1.0 org.pytorch.serve.wlm.WorkerThread - sent a reply, jobdone: true
2023-07-19T21:15:44,946 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - Unknown type name '__torch__.torch.classes.tensorrt.Engine':
2023-07-19T21:15:44,946 [INFO ] W-9000-res50-trt-fp16_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 656
2023-07-19T21:15:44,946 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -   File "code/__torch__/torchvision/models/resnet.py", line 4
2023-07-19T21:15:44,947 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG -   __parameters__ = []
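
The failure above comes down to one message, `Unknown type name '__torch__.torch.classes.tensorrt.Engine'`, repeated inside a long traceback. As a rough aid for spotting it in worker logs, the sketch below extracts the unresolved type names; the function name `unresolved_custom_types` is hypothetical, and the sample lines are copied from the log above.

```python
import re

# Matches the TorchScript loader error seen in the worker log.
UNKNOWN_TYPE = re.compile(r"Unknown type name '([^']+)'")


def unresolved_custom_types(log_text: str) -> list[str]:
    """Return the distinct type names TorchScript could not resolve, sorted."""
    return sorted(set(UNKNOWN_TYPE.findall(log_text)))


sample = "\n".join(
    [
        "2023-07-19T21:15:44,942 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - "
        "Unknown type name '__torch__.torch.classes.tensorrt.Engine':",
        "2023-07-19T21:15:44,946 [INFO ] W-9000-res50-trt-fp16_1.0-stdout MODEL_LOG - "
        "Unknown type name '__torch__.torch.classes.tensorrt.Engine':",
    ]
)

# The message appears twice in the log, but the type is reported once.
for name in unresolved_custom_types(sample):
    print(name)
```

A name under `__torch__.torch.classes` suggests the library that registers that custom class was not imported in the worker process before the model was loaded.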

After the fix

2023-07-19T21:13:58,627 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
2023-07-19T21:13:58,680 [INFO ] main org.pytorch.serve.metrics.configuration.MetricConfiguration - Successfully loaded metrics configuration from /home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/configs/metrics.yaml
2023-07-19T21:13:58,757 [INFO ] main org.pytorch.serve.ModelServer - 
Torchserve version: 0.8.1
TS Home: /home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages
Current directory: /home/ubuntu/serve
Temp directory: /tmp
Metrics config path: /home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/configs/metrics.yaml
Number of GPUs: 1
Number of CPUs: 8
Max heap size: 7936 M
Python executable: /home/ubuntu/anaconda3/envs/torchserve/bin/python
Config file: N/A
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8081
Metrics address: http://127.0.0.1:8082
Model Store: /home/ubuntu/serve/model_store
Initial Models: res50=res50-trt-fp16.mar
Log dir: /home/ubuntu/serve/logs
Metrics dir: /home/ubuntu/serve/logs
Netty threads: 0
Netty client threads: 0
Default workers per model: 1
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: false
Enable metrics API: true
Metrics mode: log
Disable system metrics: false
Workflow Store: /home/ubuntu/serve/model_store
Model config: N/A
2023-07-19T21:13:58,762 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager -  Loading snapshot serializer plugin...
2023-07-19T21:13:58,776 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: res50-trt-fp16.mar
2023-07-19T21:13:59,464 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model res50
2023-07-19T21:13:59,464 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model res50
2023-07-19T21:13:59,465 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model res50 loaded.
2023-07-19T21:13:59,465 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: res50, count: 1
2023-07-19T21:13:59,470 [DEBUG] W-9000-res50_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/ubuntu/anaconda3/envs/torchserve/bin/python, /home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /tmp/.ts.sock.9000, --metrics-config, /home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/configs/metrics.yaml]
2023-07-19T21:13:59,472 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2023-07-19T21:13:59,525 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://127.0.0.1:8080
2023-07-19T21:13:59,526 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2023-07-19T21:13:59,527 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://127.0.0.1:8081
2023-07-19T21:13:59,527 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2023-07-19T21:13:59,528 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082
Model server started.
2023-07-19T21:13:59,676 [WARN ] pool-3-thread-1 org.pytorch.serve.metrics.MetricCollector - worker pid is not available yet.
2023-07-19T21:14:00,283 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:ip-172-31-7-107,timestamp:1689801240
2023-07-19T21:14:00,284 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:65.70406341552734|#Level:Host|#hostname:ip-172-31-7-107,timestamp:1689801240
2023-07-19T21:14:00,285 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:127.93013763427734|#Level:Host|#hostname:ip-172-31-7-107,timestamp:1689801240
2023-07-19T21:14:00,285 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:66.1|#Level:Host|#hostname:ip-172-31-7-107,timestamp:1689801240
2023-07-19T21:14:00,285 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,DeviceId:0|#hostname:ip-172-31-7-107,timestamp:1689801240
2023-07-19T21:14:00,285 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0.0|#Level:Host,DeviceId:0|#hostname:ip-172-31-7-107,timestamp:1689801240
2023-07-19T21:14:00,286 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0.0|#Level:Host,DeviceId:0|#hostname:ip-172-31-7-107,timestamp:1689801240
2023-07-19T21:14:00,286 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:29626.125|#Level:Host|#hostname:ip-172-31-7-107,timestamp:1689801240
2023-07-19T21:14:00,286 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:1677.9296875|#Level:Host|#hostname:ip-172-31-7-107,timestamp:1689801240
2023-07-19T21:14:00,287 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:6.7|#Level:Host|#hostname:ip-172-31-7-107,timestamp:1689801240
2023-07-19T21:14:00,710 [INFO ] W-9000-res50_1.0-stdout MODEL_LOG - s_name_part0=/tmp/.ts.sock, s_name_part1=9000, pid=37148
2023-07-19T21:14:00,711 [INFO ] W-9000-res50_1.0-stdout MODEL_LOG - Listening on port: /tmp/.ts.sock.9000
2023-07-19T21:14:00,719 [INFO ] W-9000-res50_1.0-stdout MODEL_LOG - Successfully loaded /home/ubuntu/anaconda3/envs/torchserve/lib/python3.10/site-packages/ts/configs/metrics.yaml.
2023-07-19T21:14:00,719 [INFO ] W-9000-res50_1.0-stdout MODEL_LOG - [PID]37148
2023-07-19T21:14:00,719 [INFO ] W-9000-res50_1.0-stdout MODEL_LOG - Torch worker started.
2023-07-19T21:14:00,719 [INFO ] W-9000-res50_1.0-stdout MODEL_LOG - Python runtime: 3.10.0
2023-07-19T21:14:00,720 [DEBUG] W-9000-res50_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-res50_1.0 State change null -> WORKER_STARTED
2023-07-19T21:14:00,723 [INFO ] W-9000-res50_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /tmp/.ts.sock.9000
2023-07-19T21:14:00,728 [INFO ] W-9000-res50_1.0-stdout MODEL_LOG - Connection accepted: /tmp/.ts.sock.9000.
2023-07-19T21:14:00,730 [INFO ] W-9000-res50_1.0 org.pytorch.serve.wlm.WorkerThread - Flushing req.cmd LOAD to backend at: 1689801240730
2023-07-19T21:14:00,753 [INFO ] W-9000-res50_1.0-stdout MODEL_LOG - model_name: res50, batchSize: 1
2023-07-19T21:14:01,361 [INFO ] W-9000-res50_1.0-stdout MODEL_LOG - Enabled tensor cores
2023-07-19T21:14:01,375 [INFO ] W-9000-res50_1.0-stdout MODEL_LOG - ONNX enabled
2023-07-19T21:14:01,634 [INFO ] W-9000-res50_1.0-stdout MODEL_LOG - Torch TensorRT enabled
2023-07-19T21:14:03,138 [WARN ] W-9000-res50_1.0-stderr MODEL_LOG - WARNING: [Torch-TensorRT] - TensorRT was linked against cuDNN 8.6.0 but loaded cuDNN 8.5.0
2023-07-19T21:14:03,144 [WARN ] W-9000-res50_1.0-stderr MODEL_LOG - WARNING: [Torch-TensorRT] - TensorRT was linked against cuDNN 8.6.0 but loaded cuDNN 8.5.0
2023-07-19T21:14:03,144 [WARN ] W-9000-res50_1.0-stderr MODEL_LOG - WARNING: [Torch-TensorRT] - CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
2023-07-19T21:14:03,158 [DEBUG] W-9000-res50_1.0 org.pytorch.serve.wlm.WorkerThread - sent a reply, jobdone: true
2023-07-19T21:14:03,158 [INFO ] W-9000-res50_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 2405
2023-07-19T21:14:03,159 [DEBUG] W-9000-res50_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-res50_1.0 State change WORKER_STARTED -> WORKER_MODEL_LOADED

Checklist:

Did you have fun?
Have you added tests that prove your fix is effective or that this feature works?
Has code been commented, particularly in hard-to-understand areas?
Have you made corresponding changes to the documentation?

codecov · 2023-07-19T21:45:18Z

Codecov Report

Merging #2483 (f877efc) into master (7e5857f) will increase coverage by 0.01%.
The diff coverage is 80.00%.

❗ Current head f877efc differs from pull request most recent head 05ed115. Consider uploading reports for the commit 05ed115 to get more accurate results

@@            Coverage Diff             @@
##           master    #2483      +/-   ##
==========================================
+ Coverage   71.89%   71.90%   +0.01%     
==========================================
  Files          78       78              
  Lines        3654     3659       +5     
  Branches       58       58              
==========================================
+ Hits         2627     2631       +4     
- Misses       1023     1024       +1     
  Partials        4        4

Impacted Files	Coverage Δ
ts/torch_handler/base_handler.py	56.54% <80.00%> (+0.56%) ⬆️
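
The percentages in the coverage diff can be reproduced from the raw counts, assuming coverage is computed as hits divided by hits plus misses plus partials (the three counts sum to the reported line totals, 3654 and 3659). The sketch below is only a check of that arithmetic; treating the 80.00% diff coverage as 4 newly hit lines out of 5 added lines is an assumption read off the `+4` and `+5` deltas.

```python
def coverage_pct(hits: int, misses: int, partials: int) -> float:
    """Coverage as a percentage of all tracked lines."""
    total = hits + misses + partials
    return 100.0 * hits / total


base = coverage_pct(2627, 1023, 4)  # master: 3654 lines
head = coverage_pct(2631, 1024, 4)  # this PR: 3659 lines

# Assumed reading of diff coverage: newly hit lines over newly added lines.
diff = 100.0 * (2631 - 2627) / (3659 - 3654)

print(f"{base:.2f} {head:.2f} {diff:.2f}")  # prints "71.89 71.90 80.00"
```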

examples/torch_tensorrt/resnet_tensorrt.py

Working example with Torch Tensor RT

11d29d8

agunapal requested review from msaroufim and lxning July 19, 2023 21:27

msaroufim reviewed Jul 19, 2023

examples/torch_tensorrt/resnet_tensorrt.py

lxning approved these changes Jul 19, 2023

lint

b32cd26

msaroufim self-requested a review July 20, 2023 00:20

msaroufim approved these changes Jul 20, 2023

spellcheck

05ed115

agunapal merged commit b998f8c into master Jul 21, 2023

agunapal deleted the issues/import_trt branch July 21, 2023 19:20

sachanub mentioned this pull request Jul 25, 2023

Import error in torch-tensorrt #2500

Closed
