
v14.test: latest TensorRT library

Pre-release
@github-actions released this 13 Mar 12:20
· 190 commits to master since this release

This is a preview release for TensorRT 8.6.1.

  • It requires a Pascal GPU or later (10 series+) and driver version >= 525. Support for Kepler 2.0 and Maxwell GPUs has been dropped.

  • Add parameters builder_optimization_level and max_aux_streams to the TRT backend.

    • builder_optimization_level: "adjust how long TensorRT should spend searching for tactics with potentially better performance" link
    • max_aux_streams: Within-inference multi-streaming, "if enabled, TensorRT will run some layers on the auxiliary streams in parallel to the layers running on the main stream, ..., may increase the memory consumption, ..." link
      • It is advised to set max_aux_streams to 0 on heavy models like Waifu2xModel.swin_unet_art to reduce memory usage. Check the benchmark data at the bottom.
  • Following TensorRT 8.6.1, the cuDNN tactic source of the TRT backend is disabled by default; TF32 is also disabled by default in vsmlrt.py.

  • Add parameter short_path to the TRT backend, which shortens the engine path; it is enabled by default on Windows.

  • Model Waifu2xModel.swin_unet_art does not seem to work with builder_optimization_level=5 in the TRT backend before TRT 9.0. Use builder_optimization_level=4 or lower instead.
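The short_path parameter above is motivated by Windows path-length limits: a TensorRT engine cache filename that encodes the model name and all build options can exceed the MAX_PATH limit. The sketch below illustrates the general idea of replacing an over-long name with a short, stable hash-derived one; it is an assumption-laden illustration, not vsmlrt's actual implementation, and the helper name, hash choice, and length threshold are all hypothetical.

```python
import hashlib

def shorten_engine_path(long_name: str, max_len: int = 64) -> str:
    """Illustrative only (not vsmlrt's real logic): if an engine cache
    filename is over-long, replace it with a short hash-based name that
    is stable across runs for the same inputs."""
    if len(long_name) <= max_len:
        return long_name
    # blake2b with a 16-byte digest gives a 32-hex-char stable identifier.
    digest = hashlib.blake2b(long_name.encode("utf-8"), digest_size=16).hexdigest()
    return f"{digest}.engine"

# A descriptive engine name encoding model + build options gets long quickly.
long_name = (
    "waifu2x_swin_unet_art.onnx.min1x1_opt1920x1080_max1920x1080"
    "_fp16_trt-8.6.1_cuda-12.1.1_opt-level-4_aux-streams-0.engine"
)
print(shorten_engine_path(long_name))
```

Hashing rather than truncating keeps distinct build configurations from colliding on the same cache file, which is the property an engine cache needs.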

Compared to v13.1/13.2, built-in models show less than a 5% performance improvement, but device memory usage is reduced by 24% on DPIR and 35% on RealESRGAN.


Version information:

  • The v13.2 release uses TRT 8.5.1 + CUDA 11.8.0, which can run on driver >= 450 and 900-series and later GPUs.
  • The v14.test pre-release uses TRT 8.6.1 + CUDA 12.1.1, which can only run on driver >= 525 and 10-series and later GPUs, with no significant performance improvement measured.
  • vsmlrt.py in both branches can be used interchangeably.

  • Added support for the RIFE v4.7 model ("optimized for anime scenes"), which is also available for previous vs-mlrt releases (simply download the new model file here and update vsmlrt.py). It is more computationally intensive than v4.6.

  • This pre-release is now feature complete. Development has moved to the trt-latest branch and the v14.test2 pre-release.