Corrupted images on Vulkan backend #439

Open
wbruna opened this issue Oct 19, 2024 · 7 comments · Fixed by ggerganov/llama.cpp#10496 · May be fixed by #509

Comments

@wbruna

wbruna commented Oct 19, 2024

I'm getting either corrupted or inconsistent images with the Vulkan backend, for any resolution other than 512x512.

My system is a Linux PC with a Ryzen 3400G, running an almost-vanilla Debian 12 (with the distro's graphics stack). All of the following tests use these options:

--type f16 --lora-model-dir ./LoRA --model ./SD/dreamshaper_8.safetensors --prompt 'a fantasy character, detailed background, colorful<lora:lcm-lora-sdv1-5:1>' --cfg-scale 1.0 --sampling-method lcm --steps 4 --rng cuda --seed 42 -b 1 --color

plus a script that alternates the resolution and the compiled binary (Vulkan or CPU backend).
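
A minimal sketch of such an alternation script (the ./sd-vulkan and ./sd-cpu binary names are placeholders for the two builds; the real script may differ):

#!/bin/sh
# ./sd-vulkan and ./sd-cpu are placeholder names for the Vulkan and CPU builds.
# Each resolution is rendered twice with Vulkan and once on the CPU, with
# identical options and seed, so the outputs can be compared directly.
run_sd() {  # $1 = binary, $2 = width, $3 = height, $4 = output file
    "$1" --type f16 --lora-model-dir ./LoRA --model ./SD/dreamshaper_8.safetensors \
        --prompt 'a fantasy character, detailed background, colorful<lora:lcm-lora-sdv1-5:1>' \
        --cfg-scale 1.0 --sampling-method lcm --steps 4 --rng cuda --seed 42 -b 1 --color \
        -W "$2" -H "$3" -o "$4"
}
for res in 320x512 384x384 448x448 512x512; do
    w=${res%x*}; h=${res#*x}
    run_sd ./sd-vulkan "$w" "$h" "${res}_vulkan1.png"
    run_sd ./sd-vulkan "$w" "$h" "${res}_vulkan2.png"
    run_sd ./sd-cpu    "$w" "$h" "${res}_cpu.png"
done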

320x512:

[images: 320x512_vulkan1 | 320x512_vulkan2 | 320x512_cpu]

The Vulkan images look OK-ish (for such a small resolution, anyway), but the same seed should produce the same image, and these two runs differ. The CPU render also looks quite different.

Second test, 384x384; similar behavior (changes between Vulkan 1 and 2 may not be apparent on the thumbnail):

[images: 384x384_vulkan1 | 384x384_vulkan2 | 384x384_cpu]

The third test, 448x448, gets weird:

[images: 448x448_vulkan1 | 448x448_vulkan2 | 448x448_cpu]

At first, I blamed my PC drivers. But then, the 512x512 test:

[images: 512x512_vulkan1 | 512x512_vulkan2 | 512x512_cpu]

Looks absolutely fine, and identical between Vulkan and CPU.

In summary:

  • 512x512 works fine;
  • any other resolution produces inconsistent images between runs;
  • some resolutions introduce artifacts.

(related: #122 )

@stduhpf
Contributor

stduhpf commented Oct 19, 2024

Good catch. Using the same prompt, I get similar behavior to yours, though I don't get anything as dramatic as your 448x448 results (I only get variations of your "Vulkan 1" images, no matter how many times I try).

I might try to investigate what's going on, but I'm not confident I'll figure it out.

@stduhpf
Contributor

stduhpf commented Oct 19, 2024

@0cc4m Do you have a clue?

@stduhpf
Contributor

stduhpf commented Oct 19, 2024

This also happens with images bigger than 512x512 if the resolution isn't a multiple of 128...

@0cc4m

0cc4m commented Oct 25, 2024

Thank you for the detailed report, I'll look into it when I find some time.

jeffbolznv added a commit to jeffbolznv/llama.cpp that referenced this issue Nov 25, 2024
Fix bad calculation of the end of the range. Add a backend test that
covers the bad case (taken from stable diffusion).

Fixes leejet/stable-diffusion.cpp#439.
@jeffbolznv

I used GGML_VULKAN_CHECK_RESULTS to narrow down that group_norm was failing, reproduced it with a backend test, and pushed a fix to ggerganov/llama.cpp#10496.

BTW, GGML_VULKAN_CHECK_RESULTS is really helpful, but it looks like it may get harder to build with the recent backend split.
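
For anyone who wants to reproduce this kind of debugging, a sketch of how the check can be enabled at build time (this assumes the vendored ggml exposes the same GGML_VULKAN_CHECK_RESULTS CMake option as llama.cpp, which compares each Vulkan op result against the CPU backend at runtime):

# Assumption: stable-diffusion.cpp's bundled ggml honors GGML_VULKAN_CHECK_RESULTS
# the same way llama.cpp does; SD_VULKAN enables the Vulkan backend.
cmake -B build -DSD_VULKAN=ON -DGGML_VULKAN_CHECK_RESULTS=ON
cmake --build build --config Release

Comparing every op against the CPU backend slows generation down a lot, so it is mainly useful for pinpointing which op diverges, as done here.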

0cc4m pushed a commit to ggerganov/llama.cpp that referenced this issue Nov 26, 2024
Fix bad calculation of the end of the range. Add a backend test that
covers the bad case (taken from stable diffusion).

Fixes leejet/stable-diffusion.cpp#439.
@0cc4m

0cc4m commented Nov 26, 2024

> BTW, GGML_VULKAN_CHECK_RESULTS is really helpful, but it looks like it may get harder to build with the recent backend split.

Yeah, I built it to narrow down these kinds of model issues. I hope we can keep it around.

ggerganov pushed a commit to ggerganov/ggml that referenced this issue Dec 3, 2024
Fix bad calculation of the end of the range. Add a backend test that
covers the bad case (taken from stable diffusion).

Fixes leejet/stable-diffusion.cpp#439.
@wbruna
Author

wbruna commented Dec 3, 2024

I manually applied 56d8a95 on a local build, and it seems to fix this issue. Thanks!

stduhpf linked a pull request Dec 4, 2024 that will close this issue
ggerganov pushed commits referencing this issue to ggerganov/whisper.cpp on Dec 5 and Dec 8, 2024; arthw pushed one to arthw/llama.cpp on Dec 20, 2024; and a github-actions bot pushed one to martin-steinegger/ProstT5-llama on Dec 30, 2024. All carry the same fix ("Fix bad calculation of the end of the range", plus a backend test covering the bad case taken from stable diffusion).