
AMD Segmentation Fault #1288

Open
CobeyH opened this issue Dec 8, 2023 · 25 comments
Labels
bug (AMD) Something isn't working (AMD specific)

Comments

@CobeyH commented Dec 8, 2023

Describe the problem
I am running Ubuntu with an AMD GPU. I configured my environment variables and set up rocminfo as suggested in issue #1079.

The web page now launches successfully and it no longer shows an error that the GPU isn't detected. However, when I enter a text or image prompt and click the "Generate" button, a segmentation fault occurs.

System Info
System: Ubuntu 22.04.3
CPU: AMD Ryzen 5 3600
GPU: AMD RX 6750XT
Python: 3.10.13
Environment: Venv

HCC_AMDGPU_TARGET=gfx1031
HSA_OVERRIDE_GFX_VERSION=10.3.2
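
For reference, the gfx target the override should correspond to can be read straight from rocminfo (a one-line sketch; the agent name varies by card):

rocminfo | grep -m1 "gfx"    # e.g. prints "Name: gfx1031" for this RX 6750 XT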

Full Console Log
Update failed.
authentication required but no callback set
Update succeeded.
[System ARGV] ['entry_with_update.py']
Python 3.10.13 (main, Aug 25 2023, 13:20:03) [GCC 9.4.0]
Fooocus version: 2.1.824
Running on local URL: http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
Total VRAM 12272 MB, total RAM 15903 MB
Set vram state to: NORMAL_VRAM
Disabling smart memory management
Device: cuda:0 AMD Radeon RX 6750 XT : native
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
Refiner unloaded.
model_type EPS
adm 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra keys {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale'}
Base model loaded: /home/cobey/repos/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [/home/cobey/repos/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [/home/cobey/repos/Fooocus/models/loras/sd_xl_offset_example-lora_1.0.safetensors] for UNet [/home/cobey/repos/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 1.79 seconds
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 4950368496917309143
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[1] 10757 segmentation fault (core dumped) python entry_with_update.py

@NL-TCH commented Dec 9, 2023

I got exactly the same on an RX 5700 XT.

python entry_with_update.py
Already up-to-date
Update succeeded.
[System ARGV] ['entry_with_update.py']
Python 3.11.6 (main, Oct  3 2023, 00:00:00) [GCC 13.2.1 20230728 (Red Hat 13.2.1-1)]
Fooocus version: 2.1.824
Running on local URL:  http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
Total VRAM 8176 MB, total RAM 31833 MB
Set vram state to: NORMAL_VRAM
Disabling smart memory management
Device: cuda:0 AMD Radeon RX 5700 XT : native
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
Refiner unloaded.
model_type EPS
adm 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra keys {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale'}
Base model loaded: /home/user/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [/home/user/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [/home/user/Fooocus/models/loras/sd_xl_offset_example-lora_1.0.safetensors] for UNet [/home/user/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 1.57 seconds
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 7295514245041223923
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
Segmentation fault (core dumped)

@Khoraji commented Dec 10, 2023

Same fault, core dumped, on a 5700 XT.

@L226 commented Dec 10, 2023

Same here, followed #1079 successfully.

However, I'm using the integrated Radeon graphics of my Ryzen 7 Pro 5850U. I tried with and without --use-split-cross-attention.

Ubuntu 22.04.3, kernel 6.1.66
AMD Ryzen 7 Pro 5850U
AMD Radeon Graphics
48 GB RAM

python entry_with_update.py --preset realistic --use-split-cross-attention
Update failed.
authentication required but no callback set
Update succeeded.
[System ARGV] ['entry_with_update.py', '--preset', 'realistic', '--use-split-cross-attention']
Loaded preset: /home/user/genai/Fooocus/presets/realistic.json
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
Fooocus version: 2.1.824
Running on local URL:  http://127.0.0.1:7866

To create a public link, set `share=True` in `launch()`.
Total VRAM 4096 MB, total RAM 43960 MB
Trying to enable lowvram mode because your GPU seems to have 4GB or less. If you don't want this use: --normalvram
Set vram state to: LOW_VRAM
Disabling smart memory management
Device: cuda:0 AMD Radeon Graphics : native
VAE dtype: torch.float32
Using split optimization for cross attention
Refiner unloaded.
model_type EPS
adm 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra keys {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Base model loaded: /home/user/genai/Fooocus/models/checkpoints/realisticStockPhoto_v10.safetensors
Request to load LoRAs [['SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors', 0.25], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [/home/user/genai/Fooocus/models/checkpoints/realisticStockPhoto_v10.safetensors].
Loaded LoRA [/home/user/genai/Fooocus/models/loras/SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for UNet [/home/user/genai/Fooocus/models/checkpoints/realisticStockPhoto_v10.safetensors] with 788 keys at weight 0.25.
Loaded LoRA [/home/user/genai/Fooocus/models/loras/SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for CLIP [/home/user/genai/Fooocus/models/checkpoints/realisticStockPhoto_v10.safetensors] with 264 keys at weight 0.25.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cpu, use_fp16 = False.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 2.21 seconds
App started successful. Use the app with http://127.0.0.1:7866/ or 127.0.0.1:7866
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 3.0
[Parameters] Seed = 6293613909801716834
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] ship on fire, dramatic, intricate, elegant, highly detailed, extremely new, professional, cinematic, artistic, sharp focus, color light, winning, romantic, smart, cute, epic, creative, cool, loving, attractive, pretty, charming, complex, amazing, passionate, charismatic, colorful, coherent, iconic, fine, vibrant, incredible, beautiful, awesome, pure
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] ship on fire, full color, cinematic, stunning, highly detailed, formal, serious, determined, elegant, professional, artistic, emotional, pretty, attractive, smart, charming, best, dramatic, sharp focus, beautiful, cute, modern, futuristic, surreal, iconic, fine detail, colorful, ambient light, dynamic, amazing, symmetry, intricate, elite, magical
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (1152, 896)
Preparation time: 8.59 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Segmentation fault (core dumped)

@Khoraji commented Dec 10, 2023 via email

@galvani4987

I'm running Linux Mint, fully updated, on an AMD Ryzen 5600G + RX 5600 XT 6 GB + 32 GB DDR4, and I get the exact same segmentation fault (core dumped).
I have tested a bunch of arguments and environment variables, but no luck. I installed ROCm 5.7, but every test gives a different error message and ends in failure, so I went back to the start and to this thread.
I hope someone figures it out. Thanks a lot everyone; this is great, and we are pretty close to making it work... I hope.

@galvani4987 commented Dec 11, 2023

This has been published by lllyasviel: #1327
I enlarged my swapfile to 64G using this tutorial: https://linuxhandbook.com/increase-swap-ubuntu/
Then I reinstalled Fooocus from scratch and ran it. About a minute after I hit Generate, it gets stuck at "[Fooocus] Preparing Fooocus text #1 ..." and then segfaults.
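
For anyone following along, the tutorial's steps boil down to roughly this sketch (assuming an Ubuntu-style system and a swap file at /swapfile; the size and path are just examples):

sudo swapoff -a                    # disable existing swap
sudo fallocate -l 64G /swapfile    # allocate the new swap file
sudo chmod 600 /swapfile           # restrict permissions
sudo mkswap /swapfile              # format it as swap
sudo swapon /swapfile              # enable it
# to persist across reboots, append this line to /etc/fstab:
# /swapfile none swap sw 0 0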

@Robin-qwerty

I have the same issue. I'm running Arch with an RX 6750 XT, 32 GB RAM, and 40 GB swap.

(fooocus_env) [root@ArchLinuxRobin Fooocus]# python entry_with_update.py
Already up-to-date
Update succeeded.
[System ARGV] ['entry_with_update.py']
Python 3.10.10 (main, Mar  5 2023, 22:26:53) [GCC 12.2.1 20230201]
Fooocus version: 2.1.835
Running on local URL:  http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
Total VRAM 12272 MB, total RAM 31955 MB
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: cuda:0 AMD Radeon RX 6750 XT : native
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --attention-split
Refiner unloaded.
model_type EPS
UNet ADM Dimension 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
left over keys: dict_keys(['cond_stage_model.clip_l.transformer.text_model.embeddings.position_ids'])
Base model loaded: /root/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [/root/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [/root/Fooocus/models/loras/sd_xl_offset_example-lora_1.0.safetensors] for UNet [/root/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 0.24 seconds
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 7930202201705363266
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
Segmentation fault (core dumped)
(fooocus_env) [root@ArchLinuxRobin Fooocus]#

All I get to see in the browser is 'Waiting for task to start ...'

And my memory is barely used

@wnm210 commented Dec 18, 2023

This has been published by lllyasviel: #1327 I did enlarge my swapfile to 64G using this tutorial: https://linuxhandbook.com/increase-swap-ubuntu/ Reinstalled Fooocus from scratch and ran it. About a minute or so after i hit Generate it gets stuck in "[Fooocus] Preparing Fooocus text #1 ..." Then it segfaults.

Same here, and it's stuck.

@L226 commented Dec 19, 2023

Tried increasing swap (in my case, disabling the existing 1G swap partition and creating/activating a new 40G swap file, with cache pressure = 100 and swappiness = 60); it still segfaults:

...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (1152, 896)
Preparation time: 10.88 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Segmentation fault (core dumped)

Looking at swap usage, it didn't really use anything; RAM utilization also looked pretty low.

Running strace on the process showed some odd lookups, so I guess the AMD integration still needs work or I need to reinstall some packages, e.g.:

[pid ****] access("/usr/local/games/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
[pid ****] access("/snap/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)
[pid ****] access("/snap/bin/amdgcn-amd-amdhsa-ld.lld", R_OK|X_OK) = -1 ENOENT (No such file or directory)

I will try to look more deeply into it after the break.
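
As a quick sanity check, it may be worth verifying whether that linker binary exists anywhere (a sketch assuming a standard /opt/rocm layout; the name is exactly the one the strace output above was probing for):

command -v amdgcn-amd-amdhsa-ld.lld          # is it anywhere on PATH?
ls /opt/rocm*/llvm/bin/ld.lld 2>/dev/null    # usual location of the lld linker in a ROCm install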

@eVen-gits

Getting segfault as well. I don't think it's a RAM issue (128GB).

Kernel: 6.6.7-4-MANJARO 
Uptime: 1 day, 22 hours, 59 mins 
Packages: 1184 (pacman), 11 (flatpak) 
Shell: bash 5.2.21 
Resolution: 3840x1600 
DE: Plasma 5.27.10 
WM: kwin 
Theme: [Plasma], Breeze [GTK2/3] 
Icons: [Plasma], breeze [GTK2/3] 
Terminal: konsole 
CPU: AMD Ryzen 5 5600X (12) @ 3.700GHz 
GPU: AMD ATI Radeon RX 5600 OEM/5600 XT / 5700/5700 XT 
Memory: 19460MiB / 128710MiB

@WYOhellboy

Also getting segmentation fault:
CPU: AMD Ryzen 7 2700x
RAM: 48GB
GPU: AMD Radeon RX 7800xt
Swap: 55GB
Using Manjaro with Gnome as DE.

@mashb1t added the "bug" and "help wanted" labels on Dec 29, 2023
@klassiker

Got segfaults as well, but managed to fix it. Here is what I found:

With whl/rocm5.6, I got a plain segfault with no information. Excerpt from strace right before the segfault:

strace -ff python entry_with_update.py --preset realistic
.....
[pid  XXXX] ioctl(6, AMDKFD_IOC_MAP_MEMORY_TO_GPU, ...) = 0
[pid  XXXX] ioctl(6, AMDKFD_IOC_CREATE_QUEUE, ...) = 0
[pid  XXXX] ioctl(6, AMDKFD_IOC_CREATE_EVENT, ...) = 0
[pid  XXXX] ioctl(6, AMDKFD_IOC_CREATE_EVENT, ...) = 0
[pid  XXXX] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x40} ---

After trying whl/nightly/rocm5.7, I got a little more error information:

[pid  XXXX] ioctl(6, AMDKFD_IOC_MAP_MEMORY_TO_GPU, ...) = 0
[pid  XXXX] ioctl(6, AMDKFD_IOC_CREATE_QUEUE, ...) = 0
[pid  XXXX] ioctl(6, AMDKFD_IOC_CREATE_EVENT, ...) = 0
[pid  XXXX] ioctl(6, AMDKFD_IOC_CREATE_EVENT, ...) = 0
[pid  XXXX] futex(..., FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  XXXX] write(2, "Exception in thread Thread-2 (wo"..., 39Exception in thread Thread-2 (worker):
...
RuntimeError: HIP error: invalid device function

After finding ROCm/ROCm#2536 and trying strace -ff python -c 'import torch; torch.rand(3,3).to(torch.device("cuda"))', the same error appeared.

Using export HSA_OVERRIDE_GFX_VERSION=11.0.0 (for gfx1100, as reported by rocminfo), both the simple test and entry_with_update.py run successfully. The segfault happened for me at the same locations, either on startup using the realistic preset or when clicking Generate without a preset, so I guess it's the same issue as here.

For debugging, the output of rocminfo | grep Name might help. Also try all of whl/rocm5.6, whl/nightly/rocm5.6, and whl/nightly/rocm5.7 with the simple PyTorch command in a clean environment (env -i bash), exporting HSA_OVERRIDE_GFX_VERSION to the appropriate value for your GPU. Verify you are using the correct GPU if you have an iGPU, and check whether strace shows the error at the same location.
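
Put together, that debugging recipe looks roughly like this sketch (the override value 11.0.0 matches gfx1100 and is only an example; substitute the value for your own GPU as reported by rocminfo):

rocminfo | grep -i "Name"                 # find your gfx target
export HSA_OVERRIDE_GFX_VERSION=11.0.0    # example for gfx1100; adjust for your GPU
python -c 'import torch; print(torch.rand(3,3).to(torch.device("cuda")))'
# if it still crashes, trace the failing call:
strace -ff python -c 'import torch; torch.rand(3,3).to(torch.device("cuda"))'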

I guess #627 is related.

Hope this helps.

@merlinblack

After reinstalling the dependencies today, I can run this without needing any env vars to override anything.
python -c 'import torch; torch.rand(3,3).to(torch.device("cuda"))'

However, I still get a segfault after clicking 'Generate':

#> python entry_with_update.py
Already up-to-date
Update succeeded.
[System ARGV] ['entry_with_update.py']
Python 3.11.7 (main, Dec 18 2023, 00:00:00) [GCC 13.2.1 20231205 (Red Hat 13.2.1-6)]
Fooocus version: 2.1.862
Running on local URL:  http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
Total VRAM 12272 MB, total RAM 32035 MB
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: cuda:0 AMD Radeon RX 6700 XT : native
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --attention-split
Refiner unloaded.
model_type EPS
UNet ADM Dimension 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Base model loaded: /home/nigel/prog/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [/home/nigel/prog/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [/home/nigel/prog/Fooocus/models/loras/sd_xl_offset_example-lora_1.0.safetensors] for UNet [/home/nigel/prog/Fooocus/models/checkpoints/juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 0.68 seconds
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 8321946732629474494
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
Segmentation fault (core dumped)

It does take a moment to crash. Watching my memory usage, both VRAM and RAM go up a little on startup, but no higher after clicking Generate.

@AstroJMo

I have a 7950X and a 7900 XTX. I disabled integrated graphics in my BIOS and no longer get the segmentation fault. Running test-rocm.py showed that I had two ROCm devices, and I read on another forum that this can cause problems. It seems that was true, for me at least.
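
For reference, a quick way to see how many devices PyTorch picks up (a sketch; HIP_VISIBLE_DEVICES is the ROCm-side way to hide the iGPU without touching the BIOS):

python -c 'import torch; print([torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())])'
# if an iGPU is listed alongside the discrete card, it can be masked with e.g.:
# export HIP_VISIBLE_DEVICES=0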

@PiotrCe commented Jan 14, 2024

I'm using:
Ubuntu 22.04.3 LTS
RX 5700 XT

my rocminfo output:

Agent 2
  Name: gfx1010
  Uuid: GPU-XX
  Marketing Name: AMD Radeon RX 5700 XT
  Vendor Name: AMD
  Feature: KERNEL_DISPATCH
  Profile: BASE_PROFILE
  Float Round Mode: NEAR
  Max Queue Number: 128(0x80)
  Queue Min Size: 64(0x40)
  Queue Max Size: 131072(0x20000)
  Queue Type: MULTI
  Node: 1
  Device Type: GPU

I had the Segmentation fault (core dumped) while using Miniconda3. After switching to Anaconda, this error never appeared again. Now when I run HSA_OVERRIDE_GFX_VERSION=10.3.0 python entry_with_update.py, the app starts, and after clicking "Generate" I get:

[Fooocus Model Management] Moving model(s) has taken 1.49 seconds
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 3497165507932006909
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
:0:rocdevice.cpp :2692: 2014655231 us: [pid:6187 tid:0x7fac53fff640] Callback: Queue 0x7fa9bdf00000 aborting with error : HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION: The agent attempted to access memory beyond the largest legal address. code: 0x29
Aborted (core dumped)

@Laurent-VueJS commented Jan 17, 2024

I have a 7950x and 7900 xtx. I disabled integrated graphics in my bios and I no longer get the segmentation fault. Running the test-rocm.py was showing that I had two rocm devices. I read on another forum that this might cause problems. Seems it was true for me at least.

Just my 2 cents: for info, I use the iGPU of the Ryzen 9 7900X, and I always get a segmentation fault (or other errors) even though this (i)GPU is the only one. So multiple GPUs might not be the problem, but the iGPU might well be. I have seen in AMD's specs that iGPUs are not officially supported by ROCm :-( NB: on Windows (with DirectML) I can sometimes generate one picture on the iGPU, but only in "Extreme Speed" mode, which uses about 40 GB of VRAM (my limit). Other settings use more than 40 GB, and the process stops when I reach that limit (probably due to a memory leak?).

@Schweeeeeeeeeeeeeeee

Same problem

@ttio2tech (Contributor)

My 5700 XT can run Fooocus without issue, although it's slow (2 minutes per image in Extreme Speed mode, 3 minutes per image in Speed mode). I also made a video: https://youtu.be/HgGZyNRA1Ns

@mashb1t (Collaborator) commented Feb 22, 2024

@CobeyH is this issue still present for you using the latest version of Fooocus or can it be closed?

@mashb1t added the "question" label and removed the "help wanted" label on Feb 22, 2024
@Schweeeeeeeeeeeeeeee

Still present:
$ python entry_with_update.py --preset realistic
Already up-to-date
Update succeeded.
[System ARGV] ['entry_with_update.py', '--preset', 'realistic']
Python 3.11.7 (main, Jan 29 2024, 16:03:57) [GCC 13.2.1 20230801]
Fooocus version: 2.1.865
Loaded preset: /home/boobs/Fooocus/presets/realistic.json
Running on local URL: http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
Total VRAM 12272 MB, total RAM 31235 MB
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: cuda:0 AMD Radeon RX 6700 XT : native
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --attention-split
Refiner unloaded.
model_type EPS
UNet ADM Dimension 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra {'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_l.text_projection'}
Base model loaded: /home/boobs/Fooocus/models/checkpoints/realisticStockPhoto_v20.safetensors
Request to load LoRAs [['SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors', 0.25], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [/home/boobs/Fooocus/models/checkpoints/realisticStockPhoto_v20.safetensors].
Loaded LoRA [/home/boobs/Fooocus/models/loras/SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for UNet [/home/boobs/Fooocus/models/checkpoints/realisticStockPhoto_v20.safetensors] with 788 keys at weight 0.25.
Loaded LoRA [/home/boobs/Fooocus/models/loras/SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors] for CLIP [/home/boobs/Fooocus/models/checkpoints/realisticStockPhoto_v20.safetensors] with 264 keys at weight 0.25.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
Segmentation fault (core dumped)

@mashb1t added the "bug (AMD)" label and removed the "question" and "bug" labels on Feb 23, 2024
@hqnicolas

Running here without any problem:
https://gist.github.com/hqnicolas/5fbb9c37dcfc29c9a0ffe50fbcb35bdd
For RX 6000 cards, use:
HSA_OVERRIDE_GFX_VERSION=10.3.0

@Schweeeeeeeeeeeeeeee

Running here without any problem: https://gist.github.com/hqnicolas/5fbb9c37dcfc29c9a0ffe50fbcb35bdd For RX 6000 cards, use: HSA_OVERRIDE_GFX_VERSION=10.3.0

How would I use HSA_OVERRIDE_GFX_VERSION=10.3.0?

@Laurent-VueJS

HSA_OVERRIDE_GFX_VERSION=xxxx must be placed before the command every time, on a single line (or you can make it permanent in your environment variables; Google can tell you how :-) ). Note that the number depends on your card model; the most common values are 10.3.0 and 11.0.0. Look up your card on the internet to be sure (or just try the two most common settings; there's a 99% chance one will work). NB: for me, I tried the correct value and it still fails. Apparently ROCm does not support some older or integrated AMD GPUs like mine (see the list of supported models on the ROCm page). But CPU works very well, and my other PC with an Nvidia GPU also works very well. I love Fooocus :-)
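
Concretely, the two options look like this (10.3.0 is just the example value from above; substitute the version matching your card):

# one-off, per invocation:
HSA_OVERRIDE_GFX_VERSION=10.3.0 python entry_with_update.py

# or persistent for the current shell (add the export line to ~/.bashrc to make it permanent):
export HSA_OVERRIDE_GFX_VERSION=10.3.0
python entry_with_update.py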

@Tedris commented May 2, 2024

I am getting the same on Ubuntu with an RX 5700 and 40 GB of swap; it gets stuck on "Preparing Fooocus text #1" before coming back with a segfault.

It works fine on Windows, but I wanted to see if it would run faster on Linux.

@mashb1t mentioned this issue on May 4, 2024
@mikwee commented Jul 9, 2024

I'm on Fedora; my GPU is a Radeon RX 6600, my CPU an Intel(R) Core(TM) i5-4690, and I have 16GB of RAM. After I click "Generate", it takes a long time and then segfaults. I increased my swap size to 40GB (a 32GB file added to an 8GB partition), restarted, and nothing changed. My console output is pretty much identical, but I'll copy-paste it anyway:

Already up-to-date
Update succeeded.
[System ARGV] ['entry_with_update.py']
Python 3.10.14 (main, Jun  3 2024, 17:19:22) [GCC 14.1.1 20240522 (Red Hat 14.1.1-4)]
Fooocus version: 2.4.3
[Cleanup] Attempting to delete content of temp dir /tmp/fooocus
[Cleanup] Cleanup successful
Total VRAM 8176 MB, total RAM 15917 MB
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: cuda:0 AMD Radeon RX 6600 : native
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --attention-split
Refiner unloaded.
IMPORTANT: You are using gradio version 3.41.2, however version 4.29.0 is available, please upgrade.
--------
Running on local URL:  http://127.0.0.1:7865

To create a public link, set `share=True` in `launch()`.
model_type EPS
UNet ADM Dimension 2816
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
extra {'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.text_projection'}
Base model loaded: /home/testuser/Fooocus/models/checkpoints/juggernautXL_v8Rundiffusion.safetensors
VAE loaded: None
Request to load LoRAs [('sd_xl_offset_example-lora_1.0.safetensors', 0.1)] for model [/home/testuser/Fooocus/models/checkpoints/juggernautXL_v8Rundiffusion.safetensors].
Loaded LoRA [/home/testuser/Fooocus/models/loras/sd_xl_offset_example-lora_1.0.safetensors] for UNet [/home/testuser/Fooocus/models/checkpoints/juggernautXL_v8Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 1.66 seconds
Started worker with PID 4277
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] CLIP Skip = 2
[Parameters] Sharpness = 2
[Parameters] ControlNet Softness = 0.25
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 7406799653888165672
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
Segmentation fault (core dumped)

Hope this gets solved soon!
