Release v14.test: latest TensorRT library · AmusementClub/vs-mlrt

This is a preview release for TensorRT 8.6.1.

It requires Pascal GPUs or later (10 series+) and driver version >= 525. Support for Kepler 2.0 and Maxwell GPUs is dropped.
Add parameters builder_optimization_level and max_aux_streams to the TRT backend.
- builder_optimization_level: "adjust how long TensorRT should spend searching for tactics with potentially better performance"
- max_aux_streams: Within-inference multi-streaming, "if enabled, TensorRT will run some layers on the auxiliary streams in parallel to the layers running on the main stream, ..., may increase the memory consumption, ..."
  - It is advised to lower max_aux_streams to 0 on heavy models like Waifu2xModel.swin_unet_art to reduce memory usage. Check the benchmark data at the bottom.
Following TensorRT 8.6.1, cudnn tactic source of the TRT backend is disabled by default. tf32 is also disabled by default in vsmlrt.py.
Add parameter short_path to the TRT backend, which shortens engine path and is enabled on Windows by default.
Model Waifu2xModel.swin_unet_art does not seem to work with builder_optimization_level=5 in the TRT backend before TRT 9.0. Use builder_optimization_level=4 or lower instead.
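
The advice above for Waifu2xModel.swin_unet_art can be sketched as a small helper that adjusts the two new parameters. This is only an illustration: `suggest_trt_kwargs` and `HEAVY_MODELS` are hypothetical names that are not part of vsmlrt.py, and the TensorRT version is passed in by hand rather than detected.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical set of models these notes single out as heavy.
HEAVY_MODELS = {"Waifu2xModel.swin_unet_art"}


def suggest_trt_kwargs(
    model_name: str,
    requested_level: int,
    trt_version: Tuple[int, int] = (8, 6),
    max_aux_streams: Optional[int] = None,
) -> Dict[str, Any]:
    """Return keyword arguments following the guidance in these notes.

    Hypothetical helper: it only encodes the two recommendations above
    and does not call TensorRT or vsmlrt.py.
    """
    level = requested_level
    if model_name in HEAVY_MODELS:
        # builder_optimization_level=5 is reported not to work before TRT 9.0.
        if trt_version < (9, 0):
            level = min(level, 4)
        # Lowering max_aux_streams to 0 is advised to reduce memory usage.
        max_aux_streams = 0

    kwargs: Dict[str, Any] = {"builder_optimization_level": level}
    if max_aux_streams is not None:
        kwargs["max_aux_streams"] = max_aux_streams
    return kwargs


if __name__ == "__main__":
    print(suggest_trt_kwargs("Waifu2xModel.swin_unet_art", requested_level=5))
```

The returned dictionary is meant to be passed on to the TRT backend as keyword arguments, assuming the backend accepts the parameter names listed in these notes.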

Compared to v13.1/v13.2, performance improves by less than 5% across the built-in models, while device memory usage drops by 24% on DPIR and 35% on RealESRGAN.

Version information:

The v13.2 release uses TensorRT 8.5.1 + CUDA 11.8.0, which can run on driver >= 450 and 900 series and later GPUs.
The v14.test pre-release uses TensorRT 8.6.1 + CUDA 12.1.1, which can only run on driver >= 525 and 10 series and later GPUs, with no significant performance improvement measured.
vsmlrt.py in both branches can be used interchangeably.
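
The requirements above can be turned into a quick compatibility check. The sketch below is illustrative only: `usable_releases` and `RELEASES` are hypothetical names, the driver is compared by its major version number, and GPU generations are expressed as CUDA compute capability under the assumption that "900 series" corresponds to 5.2 and "10 series" (Pascal) to 6.0 or higher.

```python
from typing import List, Tuple

# (release name, minimum driver major version, minimum compute capability)
# The compute capability thresholds are an assumed mapping of the
# "900 series" and "10 series" wording in these notes.
RELEASES = [
    ("v14.test", 525, (6, 0)),
    ("v13.2", 450, (5, 2)),
]


def usable_releases(driver_major: int, compute_capability: Tuple[int, int]) -> List[str]:
    """List the releases whose stated requirements the system meets."""
    return [
        name
        for name, min_driver, min_cc in RELEASES
        if driver_major >= min_driver and compute_capability >= min_cc
    ]


if __name__ == "__main__":
    # Example: a Pascal card (compute capability 6.1) on a 470-series driver.
    print(usable_releases(470, (6, 1)))
```

A system that meets only the v13.2 requirements, such as a 900 series GPU or a driver older than 525, would stay on v13.2.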

Added support for RIFE v4.7 model ("optimized for anime scenes"), which is also available for previous vs-mlrt releases (simply download the new model file here and update vsmlrt.py). It is more computationally intensive than v4.6.

This pre-release is now feature complete. Development now switches to the trt-latest branch and the v14.test2 pre-release.
