Alternative refiner implementation #12377
Conversation
SD Unet not working after switching the model. |
Would you like to share specifics? |
[08/07/2023-12:53:14] [TRT] [W] TensorRT was linked against cuDNN 8.9.0 but loaded cuDNN 8.7.0. From the log, it does not load the second TRT model; the second model is SD my_mix46.safetensors. |
pushed a possible solution |
Thank you so much. It fixes the first model's problem of not using SD Unet; the second still can't. I tried modifying the code at line 695, load_model(checkpoint_info, already_loaded_state_dict=state_dict). This forces the use of TRT for inference, which fixes it for me, but is not a good idea for other users. Afterwards the log shows: Activating unet: [TRT] gf4 |
When we tested again, we found that sd_unet.apply_unet() needs to be called before every return in def reload_model_weights(sd_model=None, info=None):, so I modified it into def reload_model_weights_k(sd_model=None, info=None): |
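The workaround described in this comment can be sketched as follows (a minimal illustration with hypothetical names and a stubbed loader, not the actual modules/sd_models.py code): the key point is that apply_unet() runs before every return path, including the early return taken when the checkpoint is already loaded.

```python
def reload_model_weights_k(current, target, load_model, apply_unet):
    """Simplified sketch: re-apply the SD-Unet replacement (e.g. a
    TensorRT unet) before *every* return of the reload function."""
    if current is not None and current == target:
        apply_unet()  # previously skipped on this early-return path
        return current
    model = load_model(target)  # stubbed loader for illustration
    apply_unet()
    return model
```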
There are a few problems I've found, and unfortunately this approach will need some modifications to get the best quality possible. First, comparing the results from this PR to what I was getting with #12328 (click the arrows on the left to show the images): this PR produces an image with less fine detail. This is because this PR still adds noise, whereas mine does not. That said, if the noise is zeroed, the image is still not quite right. Looking further, I found an off-by-one error in the denoising strength calculation: the current self.denoising_strength = 1.0 - stopped_at / self.steps should be self.denoising_strength = 1.0 - (stopped_at + 1) / self.steps. Unfortunately, even with that change, this is still not at the quality my PR had. So what is different from my PR? If I try this PR but prune the sigmas for the first pass by the number of steps to stop early: this PR without adding noise, with the correct denoising strength, and with sigmas pruned for the first pass. The official SD XL repo also prunes sigmas, so that may be a requirement for this to work correctly. My testing of this approach was also done by pruning sigmas for the first pass to stop early and running the highres pass normally, but without added noise. |
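The off-by-one fix and the sigma pruning described above can be sketched like this (hypothetical helper functions for illustration, not the PR's actual code):

```python
def corrected_denoising_strength(stopped_at, steps):
    # Step indices are 0-based, so when we stop at index `stopped_at`,
    # `stopped_at + 1` steps have already been completed.
    return 1.0 - (stopped_at + 1) / steps

def split_sigmas(sigmas, stopped_at):
    # Prune the sigma schedule so the first pass genuinely stops early
    # (the official SD XL repo prunes sigmas as well); the second pass
    # resumes from the same noise level the first pass ended on.
    first = sigmas[: stopped_at + 1]
    second = sigmas[stopped_at:]
    return first, second
```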
Oh, you're right, you're right: adding noise is wrong, and not adding it is wrong too. I need to recover the noisy image from the sampler rather than the denoised one. |
Changed it to work with the noisy latent from kdiffusion, without adding any noise. Looks fine for SDXL->SDXL and SD1->SD1, but pretty bad for SD1->SDXL. Maybe I'm doing something wrong, but if not, we will have to revert to adding noise for SD1<->SDXL. |
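Capturing the noisy latent at the switch point can be illustrated with a k-diffusion-style step callback (a sketch with hypothetical names; it assumes the sampler invokes the callback each step with a dict carrying 'i', 'x', and 'denoised', which is k-diffusion's convention):

```python
class InterruptedSampling(Exception):
    """Raised to stop the first sampling pass at the switch step."""

def make_switch_callback(switch_step, store):
    # k-diffusion samplers call the callback each step with a dict:
    # 'i' is the step index, 'x' the current *noisy* latent, and
    # 'denoised' the model's clean prediction for that step.
    def callback(d):
        if d["i"] == switch_step:
            # Keep the noisy latent, not the denoised prediction:
            # the refiner pass continues sampling from this state.
            store["latent"] = d["x"]
            raise InterruptedSampling
    return callback
```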
The first run works; the second run, on reaching the refiner, reports an error. Detail log: Reusing loaded model sd_xl_base_1.0.safetensors [31e35c80fc] to load sd_xl_refiner_1.0.safetensors [7440042bbd] |
@bosima AUTOMATIC1111/stable-diffusion-webui-tensorrt#58 |
I still believe we may want a branch like if a_is_sdxl != b_is_sdxl:. I know that we probably do not want two behaviors, but after the latest changes to this pull request, we already seem to have two behaviors. Feel free to correct me if refreshing with the img2img sampler is better for XL, but in my opinion we should preserve the sampler's valuable history for as long as we can. |
I am experimenting with these behaviors. https://github.com/lllyasviel/Stable-Diffusion-FixedUI |
Update: OK, my experiments are finished: |
Do you think it is a problem with the implementation in this PR, or with the general concept? SD's official repo uses something like this.
I think #12371 is better than re-initializing samplers. |
Besides, results are better when … Also, XL VAE decode + 1.5 VAE encode tends to produce some slight ghosting or color-overflow problems; not sure why. |
Also, although not super recommended, it seems possible to add comfyui's git to |
that is not going to happen |
If you don't let the base image be noisy, the refiner doesn't seem to do much, but it can actually do a lot of work. If you set the refiner to up to 50% and let it work on a fuzzy base image, say steps 17 to 33, it is much faster and makes nice images. The settings aren't right out of the box in the template. I would also suggest a tiled VAE (example images). This could help with your 1.5 / SDXL issues. |
Description
Adds two new options: Refiner checkpoint and Refiner switch at. Sampling goes for Refiner switch at * total steps steps, then switches the model to Refiner checkpoint, and then finishes sampling in img2img mode, using the remaining number of steps and a denoising strength equal to 1 - Refiner switch at. For example, with a total of 20 steps and Refiner switch at = 0.25, the first sampling will go for 5 steps and the second for 15.
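The step split described above can be sketched as follows (a hypothetical helper for illustration, not the PR's actual code):

```python
def plan_refiner_steps(total_steps, switch_at):
    """Split a sampling run at the refiner switch point: the base model
    does switch_at * total_steps steps, and the refiner finishes the
    rest in img2img mode with denoising strength 1 - switch_at."""
    first = int(total_steps * switch_at)
    second = total_steps - first
    denoising_strength = 1.0 - switch_at
    return first, second, denoising_strength
```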