Using other SDXL turbo models to optimize the generation speed #45
It is possible, and it's simple: in the file `SUPIR_v0.yaml` in the options folder, edit the line at the end to name your model (either a Turbo or a Lightning one), then run a test. It worked for me with 24 steps and the model juggernautXL_v9Rdphoto2Lightning.safetensors.
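The edit described above amounts to pointing the SDXL checkpoint path in `options/SUPIR_v0.yaml` at the Lightning model. A minimal sketch, assuming the key is named `SDXL_CKPT` as in a typical SUPIR checkout (the exact key name and path may differ in yours):

```yaml
# options/SUPIR_v0.yaml (excerpt) -- key name and path are assumptions,
# adjust to match your checkout and model location
SDXL_CKPT: /path/to/models/juggernautXL_v9Rdphoto2Lightning.safetensors
```

After changing the path, lower the step count (e.g. to 8-24 steps as reported in this thread) to match the distilled model's expected schedule.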
Have you tried comparing generation between SDXL and the SDXL-Lightning model at 24 steps? I can get decent upscaling results with SDXL in 24 steps too, and I think the point of using SDXL-Lightning should be getting 1-5 steps with the same quality as a 30-step SDXL model.
https://imgsli.com/MjQzNzI3 New modifications: I changed the code so that it can be done in 8 steps with the juggernautXL_v9Rdphoto2Lightning.safetensors model.
The computation time, with acceptable quality, is 24 seconds on a 3090 FE.
https://imgsli.com/MjQzOTM5 This example uses the same model with 8 steps and upscaling at a factor of 3x; time: 60 seconds.
https://imgsli.com/MjQzOTQy Factor of 4x; time: 150 seconds.
I tried to replace the SDXL UNet with Juggernaut_RunDiffusionPhoto2_Lightning_4Steps.safetensors.
Were you able to make any progress with the DPM++ SDE Karras integration? SDXL-Lightning seems like a fantastic option for SUPIR. I noticed in the paper that the sampler was modified from EDM to be better at restoration, so I'm guessing it's not as simple as swapping the existing sampler for the default DPM++ SDE Karras.
I was also thinking that supporting fp8 for SDXL could help with speed by lowering VRAM consumption, since tiling slows down the process a lot. For example, right now on a 3090 the highest resolution I can go without tiling is around 3 million pixels. Supporting fp8 would let me reach 6 million pixels without tiling, and 12 million pixels with just two tiles (assuming the right aspect ratio) rather than four. I'm not sure how difficult this is to implement, though. Here's the pull request for the fp8 implementation in Auto1111: AUTOMATIC1111/stable-diffusion-webui#14031
Although, I guess to really lower VRAM consumption that much you'd need fp8 support for both SDXL and the SUPIR model itself.
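The pixel budgets quoted above follow from a simple back-of-envelope scaling: if memory use grows roughly linearly with pixel count and bytes per element, halving the element size doubles the untiled resolution. A sketch of that arithmetic (the 3-megapixel fp16 baseline is taken from the comment; the linear-scaling assumption is mine):

```python
# Back-of-envelope VRAM scaling: assumes memory use is proportional to
# pixel_count * bytes_per_element, which is a simplification.
BYTES_FP16 = 2
BYTES_FP8 = 1

def max_untiled_pixels(fp16_limit_px: int, bytes_per_elem: int) -> float:
    """Scale the fp16 untiled pixel budget to another element size."""
    return fp16_limit_px * BYTES_FP16 / bytes_per_elem

fp16_limit = 3_000_000  # ~3 MP untiled on a 24 GB 3090, per the comment
fp8_limit = max_untiled_pixels(fp16_limit, BYTES_FP8)  # doubles to ~6 MP
two_tile_budget = fp8_limit * 2  # ~12 MP split across two tiles
```

This ignores fixed costs (weights, the SUPIR adapter, attention overhead), so treat it as an upper bound rather than a guarantee.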
Inference with SUPIR seems very slow. Is there any reason not to use something like diffusers' UNet implementation as the backend, so people can leverage the existing tooling around it for optimization?
If I run only 8 steps, the picture has some artifacts (mosaic noise and color noise), which suggests the number of steps is insufficient.
Is it possible to implement support for SDXL Turbo models, SDXL Lightning, or TensorRT?