TorchServe v0.9.0 Release Notes
This is the release of TorchServe v0.9.0.
Security
Our security process is documented here.
We rely heavily on automation to improve the security of TorchServe, namely by:
- Updating our Gradle and pip dependencies on a monthly basis
- Docker scanning via Snyk
- Code analysis via CodeQL
A key point to remember is that TorchServe will allow you to configure things in an insecure way, so make sure to read our security docs and the relevant security warnings to keep your deployment secure in production. In general, we do not encourage downloading untrusted .mar files from the internet; running a .mar file is effectively running arbitrary Python code, so unzip .mar files and validate that they are not doing anything suspicious.
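As a minimal sketch of such an inspection (the file name my_model.mar is hypothetical; a .mar file is just a zip archive), the standard library is enough to list an archive's contents and flag obvious red flags before registering a model:

```python
# Minimal sketch: inspect a model archive before serving it.
# "my_model.mar" is a hypothetical file name; a .mar file is a zip archive.
import zipfile

def inspect_mar(path: str) -> None:
    with zipfile.ZipFile(path) as mar:
        for name in mar.namelist():
            # Flag entries that would escape the extraction directory (zip slip).
            if name.startswith("/") or ".." in name:
                print(f"SUSPICIOUS PATH: {name}")
            # Handler code is arbitrary Python; read it before trusting it.
            elif name.endswith(".py"):
                print(f"Python source to review: {name}")
            else:
                print(f"  {name}")

inspect_mar("my_model.mar")
```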
Code scanning fixes
- Used SHA-256 in ZipUtils #2629 @msaroufim
- Verified default hostname in Test #2631 @msaroufim
- Fixed zip slip error #2634 @msaroufim
- Used string array as Process arguments input #2632 #2635 @msaroufim
- Enabled Netty HTTP header validation as default #2630 @msaroufim
- Verified 3rd party package installation path #2687 @lxning
- Allowed URL validation #2685 @lxning (a sketch of the allowlist idea follows this list), including:
  - Disabled loading TS_ALLOWED_URLS from env by default.
  - Moved the model URL validation to the last step.
  - Sanity-checked the model archive name to guard against uncontrolled data used in a path expression.
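For context, allowed_urls in config.properties takes a comma-separated list of regular expressions describing trusted model sources. The sketch below illustrates the allowlist idea only; it is not TorchServe's internal code, and the patterns are made up:

```python
# Illustrative allowlist-style URL validation; not TorchServe's internal code.
import re

ALLOWED_URL_PATTERNS = [
    r"https://s3\.amazonaws\.com/.*",        # hypothetical trusted bucket
    r"https://torchserve\.pytorch\.org/.*",  # hypothetical trusted host
]

def is_allowed(url: str) -> bool:
    # The URL must fully match at least one trusted pattern.
    return any(re.fullmatch(p, url) for p in ALLOWED_URL_PATTERNS)

print(is_allowed("https://torchserve.pytorch.org/mar_files/mnist.mar"))  # True
print(is_allowed("http://attacker.example.com/evil.mar"))                # False
```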
Address configuration updates
- Updated default address from 0.0.0.0 to 127.0.0.1 #2624 #2704 @namannandan @agunapal (the sketch after this list shows why the default matters)
- Bind container ports to localhost ports #2646 @namannandan
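To see why the default matters, here is a small sketch using the standard socket module (the ports are arbitrary examples): a socket bound to 127.0.0.1 only accepts connections from the same host, while 0.0.0.0 listens on every network interface.

```python
# Loopback vs. all-interfaces binding; ports are arbitrary examples.
import socket

loopback = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
loopback.bind(("127.0.0.1", 8080))   # reachable from this host only

all_ifaces = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
all_ifaces.bind(("0.0.0.0", 8081))   # reachable from the network

for sock in (loopback, all_ifaces):
    print(sock.getsockname())
    sock.close()
```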
Documentation improvements
- Updated security readme #2643 #2690 @msaroufim @agunapal
- Updated security guidance in docker readme #2669 @agunapal
Dependency improvements
- Created dependabot.yml #2642 #2675 @msaroufim
- Bumped packaging from 23.1 to 23.2
- Bumped pygit2 to 1.13.1
- Bumped com.github.spotbugs from 4.0.2 to 5.1.3
- Bumped ONNX from 1.14.0 to 1.14.1
- Bumped Pillow from 9.3.0 to 10.0.1
- Bumped com.amazonaws:DynamoDBLocal from 1.13.2 to 2.0.0
- Upgraded node to version 18 #2663 @agunapal
Blogs
- High performance Llama 2 deployments with AWS Inferentia2 using TorchServe
- ML Model Server Resource Saving - Transition From High-Cost GPUs to Intel CPUs and oneAPI powered Software with performance
- Run multiple generative AI models on GPU using Amazon SageMaker multi-model endpoints with TorchServe and save up to 75% in inference costs
New Features
- Supported PyTorch 2.1.0 and Python 3.11 #2621 #2691 #2697 @agunapal
- Supported continuous batching for LLM inference #2628 @mreso @lxning (a toy sketch follows this list)
- Supported dynamically loading 3rd party package on SageMaker Multi-Model Endpoint #2535 @lxning
- Added a DALI handler for preprocessing and updated the NVIDIA DALI example #2485 @jagadeeshi2i
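For readers new to the idea, here is a toy sketch of continuous (in-flight) batching, not TorchServe's implementation: finished requests leave the batch and queued requests join immediately, instead of the whole batch draining before new work is admitted.

```python
# Toy continuous-batching loop; request IDs and step counts are made up.
from collections import deque

MAX_BATCH = 4
queue = deque([("req1", 3), ("req2", 5), ("req3", 2), ("req4", 4), ("req5", 1)])
in_flight = {}  # request id -> remaining decode steps

step = 0
while queue or in_flight:
    # Admit new requests as soon as a slot frees up.
    while queue and len(in_flight) < MAX_BATCH:
        rid, steps = queue.popleft()
        in_flight[rid] = steps
    # Run one decode step for every in-flight request.
    for rid in list(in_flight):
        in_flight[rid] -= 1
        if in_flight[rid] == 0:
            del in_flight[rid]
            print(f"step {step}: {rid} finished")
    step += 1
```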
New Examples
- Deploy Llama2 on Inferentia2 #2458 @namannandan
- Using TorchServe on SageMaker Inf2.24xlarge with Llama2-13B @lxning
- PyTorch tensor parallel on Llama2 example #2623 #2689 @HamidShojanazeri
- Enabled Better Transformer (i.e., Flash Attention 2) on Llama2 #2700 @HamidShojanazeri @lxning (see the sketch after this list)
- Llama2 Chatbot on Mac #2618 @agunapal
- Automatic speech recognition (ASR) example #2047 @husenzhang
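Better Transformer builds on PyTorch's fused scaled-dot-product attention. A minimal sketch (shapes are illustrative; PyTorch dispatches to a fused kernel such as FlashAttention when hardware and dtypes allow, and otherwise falls back to a math kernel):

```python
# Fused scaled-dot-product attention (PyTorch >= 2.0); shapes are illustrative:
# (batch, heads, sequence, head_dim).
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 128, 64)
k = torch.randn(1, 8, 128, 64)
v = torch.randn(1, 8, 128, 64)

out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([1, 8, 128, 64])
```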
Improvements
- Fixed typo in BaseHandler #2547 @a-ys
- Created merge_queue workflow for CI #2548 @msaroufim
- Fixed typo in artifact terminology unification #2551 @park12sj
- Added env hints in model_service_worker #2540 @ZachOBrien
- Refactored conda build scripts to publish all binaries #2561 @agunapal
- Fixed response return type in KServe #2566 @jagadeeshi2i
- Added torchserve-kfs nightly build #2574 @jagadeeshi2i
- Added regression for all CPU binaries #2562 @agunapal
- Updated CI/CD runners #2586 #2597 #2636 #2627 #2677 #2710 #2696 @agunapal @msaroufim
- Upgraded newman version to 5.3.2 #2598 #2603 @agunapal
- Updated opt benchmark config for inf2 #2617 @namannandan
- Added ModelRequestEncoderTest #2580 @abergmeier
- Added manually dispatch workflow #2686 @msaroufim
- Updated test wheels with PyTorch 2.1.0 #2684 @agunapal
- Allowed parallel level = 1 to run in torchrun mode #2608 @lxning
- Fixed metric unit assignment backward compatibility #2693 @namannandan
Documentation
- Updated MPS readme #2543 @sekyondaMeta
- Updated large model inference readme #2542 @sekyondaMeta
- Fixed bash snippets in examples/image_classifier/mnist/Docker.md #2345 @dmitsf
- Fixed typo in kubernetes/autoscale.md #2393 @CandiedCode
- Fixed path in examples/image_classifier/resnet_18/README.md #2568 @udaij12
- Model Loading Guidance #2592 @agunapal
- Updated Metrics readme #2560 @sekyondaMeta
- Display nightly workflow status badge in README #2619 #2666 @agunapal @msaroufim
- Updated torch.compile information in examples/pt2/README.md #2706 @agunapal
- Tutorial: deploy a model using TorchServe on SageMaker @lxning
Platform Support
Ubuntu 16.04, Ubuntu 18.04, Ubuntu 20.04; macOS 10.14+; Windows 10 Pro, Windows Server 2019; Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe now requires Python 3.8 and above, and JDK 17.
GPU Support
Torch 2.1.0 + CUDA 11.8, 12.1
Torch 2.0.1 + CUDA 11.7
Torch 2.0.0 + CUDA 11.7
Torch 1.13 + CUDA 11.7
Torch 1.11 + CUDA 10.2, 11.3, 11.6
Torch 1.9.0 + CUDA 11.1
Torch 1.8.1 + CUDA 9.2