v6: further performance optimizations of vs-trt, and vs-ov & vs-ort bugfixes

@WolframRhodium WolframRhodium released this 20 Jan 14:31
· 390 commits to master since this release

This release contains several performance optimizations for the vs-trt plugin. The general takeaway is that vs-trt beats all benchmarked solutions on the DPIR, waifu2x and RealESRGANv2 models. Specific highlights are as follows:

  • waifu2x: on CPU, vs-ov beats waifu2x-w2xc by 2.7x (Intel 32C64T); on GPU, vs-ort/vs-trt beat vulkan-ncnn by ~4x.
  • DPIR: vs-trt beats existing implementations on both Volta (Tesla V100) and Ampere (A10) platforms (by up to 1.5x), and vs-ort uses significantly less GPU memory (up to 3.7x less) than its counterpart.
  • RealESRGANv2: vs-trt, being the only backend that utilizes TensorRT, is up to 3.3x faster than the reference implementation

Please see the detailed benchmark results in the wiki.

This release also fixes the following two bugs:

  • vs-ov: some OpenVINO error messages were sent to stdout, which interfered with vspipe | x265 usage.
  • vs-ort/vs-ov: an error occurred when converting the RealESRGANv2 model to fp16 format.
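
To illustrate why the first bug matters for piped workflows, here is a minimal Python sketch. It is not the plugin's code: the child process, the `FRAMEDATA` payload, the `[warn]` message, and the `run_producer` helper are all hypothetical stand-ins. It only shows the general principle that any diagnostic written to stdout ends up inside the byte stream a downstream consumer reads, whereas a diagnostic written to stderr leaves that stream intact.

```python
import subprocess
import sys


def run_producer(log_stream: str) -> tuple[bytes, bytes]:
    """Run a child process that emits a fake payload on stdout.

    The child also writes one diagnostic line to `log_stream`, which must be
    either "stdout" or "stderr". Returns (stdout_bytes, stderr_bytes).
    Everything here is a hypothetical stand-in for a real producer.
    """
    child_code = (
        "import sys\n"
        # Diagnostic message, routed to the chosen stream.
        f"sys.{log_stream}.write('[warn] hypothetical backend message\\n')\n"
        f"sys.{log_stream}.flush()\n"
        # The "video data" that a downstream consumer expects on stdout.
        "sys.stdout.buffer.write(b'FRAMEDATA')\n"
        "sys.stdout.buffer.flush()\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,
        check=True,
    )
    return result.stdout, result.stderr


if __name__ == "__main__":
    polluted, _ = run_producer("stdout")
    clean, diagnostics = run_producer("stderr")
    print("payload with diagnostics on stdout:", polluted)
    print("payload with diagnostics on stderr:", clean)
    print("diagnostics captured separately:", diagnostics)
```

When the diagnostic goes to stdout, the consumer receives the message bytes ahead of the payload; when it goes to stderr, the consumer receives only the payload.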