waifu2x
Waifu2x is a well-known image super-resolution neural network for anime-style art.
Links:
- (stable) https://github.com/AmusementClub/vs-mlrt/releases/download/model-20211209/waifu2x_v3.7z
- (swin_unet) https://github.com/AmusementClub/vs-mlrt/releases/download/external-models/waifu2x_swin_unet_v4.7z
Includes all known publicly available waifu2x models:
- anime_style_art: requires pre-scaled input for the scale2.0x variant
  - noise1 noise2 noise3 scale2.0x
- anime_style_art_rgb: requires pre-scaled input for the scale2.0x variant
  - noise0 noise1 noise2 noise3 scale2.0x
- photo: requires pre-scaled input for the scale2.0x variant
  - noise0 noise1 noise2 noise3 scale2.0x
- ukbench: requires pre-scaled input
  - scale2.0x
- upconv_7_anime_style_art_rgb
  - scale2.0x noise3_scale2.0x noise2_scale2.0x noise1_scale2.0x noise0_scale2.0x
- upconv_7_photo
  - scale2.0x noise0_scale2.0x noise1_scale2.0x noise2_scale2.0x noise3_scale2.0x
- cunet: tile size (block_w and block_h) must be multiples of 4.
  - noise0 noise1 noise2 noise3
  - scale2.0x
  - noise0_scale2.0x noise1_scale2.0x noise2_scale2.0x noise3_scale2.0x
- upresnet10
  - scale2.0x
  - noise0_scale2.0x noise1_scale2.0x noise2_scale2.0x noise3_scale2.0x
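The variant names in the list above follow a regular pattern (`noiseN`, `scale2.0x`, `noiseN_scale2.0x`). The following is a minimal sketch of how a noise level and scale factor could be mapped to such a name. The helper `waifu2x_variant` is hypothetical and not part of vsmlrt, and treating `noise=-1` as "no denoising" is an assumption. Not every model family ships every variant, so the result still has to be checked against the list.

```python
def waifu2x_variant(noise: int = -1, scale: int = 2) -> str:
    """Build a variant name such as 'noise1_scale2.0x'.

    Hypothetical helper for illustration only; assumes noise=-1 means
    "no denoising" and that only 1x and 2x scales exist.
    """
    if noise not in (-1, 0, 1, 2, 3):
        raise ValueError("noise must be -1 (off) or in 0..3")
    if scale not in (1, 2):
        raise ValueError("scale must be 1 or 2")
    if noise == -1 and scale == 1:
        raise ValueError("nothing to do: no denoising and no upscaling")
    if noise == -1:
        return "scale2.0x"           # upscale only
    if scale == 1:
        return f"noise{noise}"       # denoise only
    return f"noise{noise}_scale2.0x"  # denoise and upscale


if __name__ == "__main__":
    print(waifu2x_variant())          # upscale-only variant
    print(waifu2x_variant(noise=3))   # denoise level 3 plus upscale
```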
To simplify usage, we provide a Python wrapper module, vsmlrt, that offers the full functionality of waifu2x-caffe through a more Pythonic interface:
import vapoursynth as vs
from vapoursynth import core
from vsmlrt import Waifu2x, Waifu2xModel, Backend
src = core.std.BlankClip(format=vs.RGBS)
# backend could be:
# - CPU Backend.OV_CPU(): the recommended CPU backend; generally faster than ORT-CPU.
# - CPU Backend.ORT_CPU(num_streams=1, verbosity=2): vs-ort cpu backend.
# - GPU Backend.ORT_CUDA(device_id=0, cudnn_benchmark=True, num_streams=1, verbosity=2)
# - use device_id to select device
# - set cudnn_benchmark=False to reduce script reload latency when debugging, at the cost of slightly lower throughput.
# - GPU Backend.TRT(fp16=True, device_id=0, num_streams=1): TensorRT runtime, the fastest NV GPU runtime.
flt = Waifu2x(src, noise=-1, scale=2, model=Waifu2xModel.upconv_7_anime_style_art_rgb, backend=Backend.ORT_CUDA())
This section is mostly for reference; the recommended approach is to use the vsmlrt.py wrapper. The raw models can also be loaded directly:
src = core.std.BlankClip(width=1920, height=1080, format=vs.RGBS)
flt = core.ov.Model(src, "upconv_7_anime_style_art_rgb_scale2.0x.onnx")
The anime_style_art, anime_style_art_rgb, photo and ukbench models do not include built-in upscaling. You therefore need to upscale the image 2x using Catmull-Rom (bicubic(b=0, c=0.5)) before feeding it to these models:
src = core.std.BlankClip(width=1920, height=1080, format=vs.RGBS)
flt = core.ov.Model(src.fmtc.resample(scale=2, kernel="bicubic", a1=0, a2=0.5), "anime_style_art_rgb_scale2.0x.onnx")
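For readers unfamiliar with the bicubic(b=0, c=0.5) notation used above, here is a small, dependency-free sketch of the standard Mitchell-Netravali kernel, of which Catmull-Rom is the b=0, c=0.5 case. It only illustrates the filter weights; it is not how fmtc implements resampling, and the function names are hypothetical. Assuming centre-aligned 2x upscaling, output samples fall at phases 0.25 and 0.75 between source samples.

```python
from typing import List


def bicubic_weight(x: float, b: float = 0.0, c: float = 0.5) -> float:
    """Mitchell-Netravali kernel; the defaults b=0, c=0.5 give Catmull-Rom."""
    x = abs(x)
    if x < 1.0:
        return ((12 - 9 * b - 6 * c) * x**3
                + (-18 + 12 * b + 6 * c) * x**2
                + (6 - 2 * b)) / 6
    if x < 2.0:
        return ((-b - 6 * c) * x**3
                + (6 * b + 30 * c) * x**2
                + (-12 * b - 48 * c) * x
                + (8 * b + 24 * c)) / 6
    return 0.0  # the kernel has support [-2, 2]


def tap_weights(phase: float) -> List[float]:
    """Weights of the four source samples at offsets -1, 0, 1, 2
    for an output position `phase` (0 <= phase < 1) past sample 0."""
    return [bicubic_weight(phase - offset) for offset in (-1, 0, 1, 2)]


if __name__ == "__main__":
    for phase in (0.25, 0.75):
        weights = tap_weights(phase)
        # The weights always sum to 1, so flat areas keep their brightness.
        print(phase, weights, sum(weights))
```

The negative outer weights are what give Catmull-Rom its mild sharpening compared with a purely positive kernel.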
- cunet networks work best when the tile size (block_w / block_h) is in the range 60 to 150 and a multiple of 4.
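As a rough aid for picking cunet tile sizes, the sketch below lists values that satisfy the two constraints stated above (a multiple of 4, within 60 to 150). Additionally requiring the tile to divide the frame dimension evenly is an assumption made here for convenience, not a requirement stated by vs-mlrt, and the sketch ignores any tile overlap. The function name is hypothetical.

```python
from typing import List


def cunet_tile_candidates(dimension: int, low: int = 60, high: int = 150) -> List[int]:
    """List tile sizes in [low, high] that are multiples of 4 and
    (by assumption, for convenience) divide `dimension` evenly."""
    first = -(-low // 4) * 4  # round `low` up to the next multiple of 4
    return [t for t in range(first, high + 1, 4) if dimension % t == 0]


if __name__ == "__main__":
    print("block_w candidates for width 1920:", cunet_tile_candidates(1920))
    print("block_h candidates for height 1080:", cunet_tile_candidates(1080))
```

Which candidate is fastest depends on the device and backend, so it is worth benchmarking a few of them.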
Measurements: FPS / Device Memory (MB)
Device memory:
- CPU: private memory including VapourSynth
- GPU: device memory including context
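Every table cell below uses the `FPS / Device Memory (MB)` form, sometimes followed by a note such as `(540p patch)`, or `N/A` when no measurement exists. The following minimal sketch shows one way to parse such cells and compare two runtimes. The helper names are hypothetical, and the sample values are taken from the first upconv7 row below.

```python
from typing import Optional, Tuple


def parse_cell(cell: str) -> Optional[Tuple[float, float]]:
    """Parse 'FPS / MB' into (fps, memory_mb); return None for 'N/A'.
    A trailing note in parentheses, e.g. '(540p patch)', is ignored."""
    cell = cell.strip()
    if cell.upper() == "N/A":
        return None
    fps_text, mem_text = cell.split("/", 1)
    mem_text = mem_text.split("(", 1)[0]  # drop any trailing note
    return float(fps_text), float(mem_text)


def speedup(baseline: str, candidate: str) -> float:
    """FPS ratio candidate / baseline; both cells must hold measurements."""
    base, cand = parse_cell(baseline), parse_cell(candidate)
    if base is None or cand is None:
        raise ValueError("cannot compare cells marked N/A")
    return cand[0] / base[0]


if __name__ == "__main__":
    ort_cuda, trt = "6.12 / 6592", "7.22 / 5694"
    print(f"trt vs ort-cuda: {speedup(ort_cuda, trt):.2f}x")
```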
Software: VapourSynth R57, Windows 10 LTSC 2021, Graphics Driver 511.23.
Input size: 1920x1080
1. vs-mlrt v6
2. vapoursynth-waifu2x-ncnn-vulkan r4
3. vs-mlrt v8 (driver 511.79)
Model | [1] ort-cuda | [1] trt | [2] vulkan (540p patch) | [3] ort-cuda | [3] trt | [3] trt (no tf32) |
---|---|---|---|---|---|---|
upconv7 | 6.12 / 6592 | 7.22 / 5694 | 2.83 / 10578 | 7.24 / 6408 | 7.99 / 5761 | 7.86 / 5785 |
upresnet10 | 4.72 / 5820 | N/A | N/A | 5.79 / 5634 | N/A | N/A |
cunet | 2.70 / 18624 | N/A | 0.71 / 15082 | 3.28 / 18435 | N/A | N/A |
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) | [2] vulkan | [3] ort-cuda | [3] trt | [3] trt (2 streams) |
---|---|---|---|---|---|---|---|
upconv7 | 7.64 / 6204 | 13.4 / 4652 | 25.4 / 7852 | 4.20 / 20750 | 10.6 / 5764 | 16.2 / 2385 | 30.1 / 4096 |
upresnet10 | 6.38 / 5818 | N/A | N/A | N/A | 8.15 / 5632 | N/A | N/A |
cunet | 3.55 / 10172 | N/A | N/A | 0.91 / 7696 (540p patch) | 4.53 / 9983 | N/A | N/A |
Software: VapourSynth R57, Windows 10 LTSC 2021, Graphics Driver 511.23.
Input size: 1920x1080
1. vs-mlrt v6
2. VapourSynth-Waifu2x-caffe r14
3. vapoursynth-waifu2x-ncnn-vulkan r4
Model | [1] ort-cuda | [1] trt | [2] caffe (540p patch) | [3] vulkan (540p patch) |
---|---|---|---|---|
upconv7 | 4.36 / 5922 | 4.73 / 5072 | 1.08 / 3159 | 1.40 / 10568 |
upresnet10 | 3.31 / 5150 | N/A | 1.03 / 7280 | N/A |
cunet | 1.77 / 5170 (540p patch) | N/A | 0.73 / 6957 (360p patch) | 0.60 / 6992 (360p patch) |
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) | [3] vulkan (540p patch) |
---|---|---|---|---|
upconv7 | 5.84 / 5278 | 11.9 / 3055 | 19.2 / 5263 | 2.60 / 5438 |
upresnet10 | 5.14 / 5148 | N/A | N/A | N/A |
cunet | 1.64 / 9502 | N/A | N/A | 0.88 / 7686 |
Software: VapourSynth R57, Windows Server 2019, Graphics Driver 511.23.
Input size: 1920x1080
1. vs-mlrt v6
2. VapourSynth-Waifu2x-caffe r14
3. vapoursynth-waifu2x-ncnn-vulkan r4, Graphics Driver 471.68
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) | [2] caffe (540p patch) | [3] vulkan (540p patch) |
---|---|---|---|---|---|
upconv7 | 5.98 / 5065 | 6.60 / 5033 | 8.43 / 9253 | 1.63 / 3248 | 1.67 / 11197 |
upresnet10 | 4.36 / 5061 | N/A | N/A | 1.54 / 7232 | N/A |
cunet | 2.58 / 9155 | N/A | N/A | 1.11 / 11657 | 0.53 / 15705 |
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) | [3] vulkan |
---|---|---|---|---|
upconv7 | 10.4 / 5189 | 13.8 / 3041 | 26.2 / 5253 | 3.97 / 21369 |
upresnet10 | 6.43 / 5059 | N/A | N/A | N/A |
cunet | 4.10 / 9535 | N/A | N/A | 0.86 / 29848 |
Software: VapourSynth R57, Windows Server 2019, Graphics Driver 511.23, GPU clocks locked at max frequency.
Input size: 1920x1080
1. vs-mlrt v6
2. vapoursynth-waifu2x-ncnn-vulkan r4, Graphics Driver 471.68
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) | [2] vulkan (540p patch) |
---|---|---|---|---|
upconv7 | 6.94 / 9765 | 7.83 / 5511 | 8.61 / 9731 | 1.63 / 10892 |
upresnet10 | 3.90 / 5665 | N/A | N/A | N/A |
cunet | 2.20 / 18469 | N/A | N/A | 0.53 / 15397 |
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) | [2] vulkan |
---|---|---|---|---|
upconv7 | 9.66 / 6049 | 16.1 / 3501 | 19.9 / 5701 | 3.03 / 21075 |
upresnet10 | 6.53 / 5663 | N/A | N/A | N/A |
cunet | 3.26 / 10017 | N/A | N/A | 0.78 / 8011 (540p patch) |
Software: VapourSynth R58, Windows Server 2022, Graphics Driver 511.65, GPU clocks locked at max frequency.
Input size: 1920x1080
1. vs-mlrt v8
Model | [1] trt |
---|---|
upconv7 | 7.20 / 5668 |
Model | [1] trt | [1] trt (2 streams) |
---|---|---|
upconv7 | 16.4 / 2255 | 22.2 / 3981 |
Software: VapourSynth R57, Windows Server 2019, Graphics Driver 511.23.
Input size: 1920x1080
1. vs-mlrt v6
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) |
---|---|---|---|
upconv7 | 17.3 / 9827 | 20.0 / 5713 | 27.2 / 10051 |
upresnet10 | N/A | N/A | N/A |
cunet | N/A | N/A | N/A |
Model | [1] ort-cuda | [1] trt | [1] trt (2 streams) |
---|---|---|---|
upconv7 | 18.3 / 6111 | 32.8 / 4539 | 57.3 / 7719 |
upresnet10 | N/A | N/A | N/A |
cunet | N/A | N/A | N/A |
Software: VapourSynth R57-A4, Windows Server 2022, Graphics Driver 516.94.
Input size: 1920x1080
1. vs-mlrt v9
Model | [1] trt | [1] trt (2 streams) |
---|---|---|
upconv7 | 30.4 / 2359 | 57.4 / 4037 |
cunet | 19.4 / 4647 | 26.9 / 8558 |
- vsmlrt v14.test2
- driver 545.84
- Windows Server 2022
- VapourSynth-classic R57.A8
Input size: 1920x1080 (RGBS)
Measurements: FPS / Device Memory (MB)
precision | TRT 1 stream | TRT 2 streams | TRT 3 streams |
---|---|---|---|
fp16 | 5.43 / 7623.5 | 5.77 / 14742 | 5.84 / 21857 |
bf16 | 4.69 / 8058.3 | 4.92 / 15591 | 4.98 / 23124 |
Hardware: Xeon Icelake Server 32C64T @2.90 GHz
Software: VapourSynth R57, Windows Server 2019.
Input size: 1920x1080
1. vs-mlrt v6
2. VapourSynth-Waifu2x-w2xc r8
Model | [1] ov-cpu | [2] w2xc |
---|---|---|
upconv7 | 1.22 / 18750 | N/A |
upresnet10 | 1.40 / 18278 | N/A |
cunet | 0.65 / 22447 | N/A |
anime rgb | 0.69 / 34619 | 0.26 / 7895 |
Hardware: EPYC Milan 32C64T @2.55 GHz
Software: VapourSynth R57, Windows Server 2019.
Input size: 1920x1080
1. vs-mlrt v6
2. VapourSynth-Waifu2x-w2xc r8
Model | [1] ov-cpu | [2] w2xc |
---|---|---|
upconv7 | 0.36 / 19583 | N/A |
upresnet10 | 0.35 / 18694 | N/A |
cunet | 0.20 / 21644 | N/A |
anime rgb | 0.20 / 34619 | 0.28 / 5398 |