TorchServe v0.6.1 Release Notes
This is the release of TorchServe v0.6.1.
New Features
- Metrics Caching in Python backend - #1954 @maaquib @joshuaan7
- ONNX models served via ONNX Runtime (ORT), plus docs for TensorRT #1857 @msaroufim
- IPEX launcher core pinning #1401 @min-jean-cho. To learn more, see https://pytorch.org/tutorials/intermediate/torchserve_with_ipex.html
New Examples
- DLRM example via torchrec #1648 @mreso
- Scriptable tokenizer example for text classification #1691 @mreso
- Loading large Huggingface models by using accelerate #1933 @jagadeeshi2i
- Stable diffusion Deepspeed MII example #1920 @jagadeeshi2i
- HuggingFace diffuser example #1904 @jagadeeshi2i
- On-premise near real-time video inference #1867 @agunapal
- fsspec for large scale batch inference from cloud buckets #1927 @kirkpa
- Torchdata example for unified training and inference preprocessing pipelines #1940 @PratsBhatt
- Wav2Vec2 SpeechToText from Huggingface #1939 @altre
Dependency Upgrades
- Added support for PyTorch 1.12 and CUDA 11.6 #1767 @lxning
- Upgraded to JDK17 - #1619 @rohithkrn
- Bumped gson version for security #1650 @lxning
Improvements
- Optimized gRPC workflow performance #1854 @lxning
- Fixed worker shown as ready in DescribeModel endpoint before model is loaded #1679. @lxning
- Gracefully handle decoding exceptions in python backend #1789 @msaroufim
- Added handling of OPTIONS requests in management API #1774 @xyang16
- Fixed model status API in KServe #1773 @jagadeeshi2i
- Fixed process verification in pid file - #1866 @rohithkrn
- Updated Nvidia Waveglow/Tacotron2 #1905 @kbumsik
- Added dev mode in install_from_src.py #1856 @msaroufim
- Added the PV creation for K8 setup #1751 @jagadeeshi2i
- Fixed volume permission in kubernetes setup #1747 @jagadeeshi2i
- Upgraded HPA to the v2beta2 API version #1760 @jagadeeshi2i
- Fixed use of a deprecated Gradle method #1936 @lxning
- Updated plugins/gradle.properties #1791 @liyaodev
- Fixed pynvml import failure #1882 @lxning
- Added pynvml exception management #1809 @lromor
- Fixed an erroneous logging format string and pylint pragma #1630 @bradlarsen
- Fixed broken path joins and unclosed files #1709 @DPeled
Build and CI
- Added ubuntu 20.04 GPU in docker build - #1773 @msaroufim
- Added spellchecking and link checking automation #1855 @sadra-barikbin
- Added full release automation #1739 @msaroufim
- Added workflow for pushing Conda nightly binaries #1685 @agunapal
- Added code coverage in CI build #1665 @msaroufim
- Unified documentation build dependencies #1759 @msaroufim
- Skipped spellcheck when no files have changed #1919 @maaquib
- Skipped flaky Java Windows test cases #1746 @msaroufim
- Added alarm on failed GitHub Action #1781 @msaroufim
Documentation
- Updated FAQ on how to decode international languages #1393 @lxning
- Improved KServe documentation #1807 @jagadeeshi2i
- Updated examples/intel_extension_for_pytorch/README.md #1816 @min-jean-cho
- Fixed typos and dead links in doc.
Deprecations
- Deprecated old ci/benchmark/buildspec.yml #1658 @lxning
- Deprecated old docker/Dockerfile.neuron.dev #1775 in favor of AWS SageMaker DLC. @rohithkrn
- Deprecated redundant LICENSE.txt #1801 @msaroufim
Platform Support
Ubuntu 16.04, Ubuntu 18.04, macOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe now requires Python 3.8 or above and JDK 17.
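A quick way to confirm the Python requirement before installing is a small check like the one below. This is a minimal sketch, not part of TorchServe: the names MIN_PYTHON and python_is_supported are hypothetical, and the script does not check the JDK version.

```python
import sys

# Minimum Python version stated in these release notes.
# MIN_PYTHON and python_is_supported are illustrative names, not TorchServe APIs.
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=None):
    """Return True if a (major, minor, ...) version tuple meets the minimum."""
    if version_info is None:
        # Default to the interpreter running this script.
        version_info = sys.version_info
    # Compare only major and minor, so (3, 8, 0) and (3, 8) behave the same.
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    if python_is_supported():
        print("Python version OK")
    else:
        print("Python 3.8 or newer is required")
```

Running the script with the interpreter you plan to use for TorchServe reports whether that interpreter meets the stated minimum.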
GPU Support
Torch 1.11+ with CUDA 10.2, 11.3, 11.6
Torch 1.9.0 with CUDA 11.1
Torch 1.8.1 with CUDA 9.2