[Feature Request]: Support for Intel Oneapi/Vulkan versions of pytorch as well #6417

Vidyut · 2023-01-06T08:29:59Z

Is there an existing issue for this?

I have searched the existing issues and checked the recent builds/commits

What would your feature do ?

This is a brilliant project and I like that it supports most versions of pytorch.

A large group of users on unsupported machines, intel, windows, etc get excluded from the performance options (which are basically cuda and wannabe-cuda) . Many of these machines have fairly decent hardware, just that it doesn't run cuda/rocm. There are pytorch versions like oneapi or vulkan, etc that would really take the reach of this project out to those with lesser machines, so to say. https://pytorch.org/tutorials/recipes/recipes/intel_extension_for_pytorch.html

I'm not a coder, but they have a pytorch version in the works similar to cuda/rocm, but it seems to support a lot of intel CPUs and GPUs, including discrete GPUs and older ones abandoned by ROCm https://github.com/intel/intel-extension-for-pytorch/tree/xpu-master

and adapting the code doesn't seem to be excessively complicated.
https://intel.github.io/intel-extension-for-pytorch/xpu/1.10.200+gpu/tutorials/examples.html
https://intel.github.io/intel-extension-for-pytorch/xpu/1.10.200+gpu/tutorials/api_doc.html

It would make the project accessible to those with simpler laptops/desktops.

https://towardsdatascience.com/pytorch-stable-diffusion-using-hugging-face-and-intel-arc-77010e9eead6

Proposed workflow

Go to ....
Press ....
...

Additional information

No response

uxdesignerhector · 2023-01-06T19:48:43Z

Yes, it would be nice to squeeze those 16gb from the intel arc 770. It seems that the problem resides at PyTorch pytorch/pytorch#30029. PyTorch will need to support One API. But it seems it's possible to run PyTorch with intel GPUs through extensions as is stated in https://github.com/intel/intel-extension-for-pytorch/tree/xpu-master

This thread from Reddit has useful information about the possible ways Intel could approach Stable Diffusion /PyTorch problem https://www.reddit.com/r/intel/comments/xvbmif/will_intel_arc_support_stable_diffusion/

Vidyut · 2023-01-06T23:52:08Z

Hi @uxdesignerhector,

Pytorch HAS the intel extension, though unlike ROCm, it requires code changes as it stands. It is just a couple of lines - this project already seems to do similar to integrate mps, which is why I suggested here. But it can accelerate CPUs and the unreleased version runs on older GPUs and what not, which is great! I wouldn't be surprised if such an integration made this version of Stable Diffusion the staple implementation.

Stable Diffusion runs on TesorFlow, I think, which supports OneAPI - so this is less an Intel issue and more of one for those who love this project, with its well designed implementation, but would like to not wait ages while their hardware twiddles its thumbs. Almost nothing (that wouldn't crash at the task) would be left out, since it would also automatically support OpenCL I think.

Not to mention I am fed up of these elitist projects refusing to recognise anything not CUDA as not GPU!!! (this includes Intel's openvino-gpu runtime - which is basically for cuda/rocm!!!) This repository with its inclusion of everything it can lay its hands on is literally the only reason I bother with pytorch. (that said, not a coder, so not like I'm using all sorts of other technologies overwhelmingly)

V

Vidyut · 2023-01-09T15:24:26Z

The intel extension for gpu now supports pytorch 1.13.10 https://github.com/intel/intel-extension-for-pytorch/releases/tag/v1.13.10%2Bxpu

rahulunair · 2023-01-12T23:18:03Z

For anyone looking for working code for Stable Diffusion on Intel dGPUs (Arc Alchemist ) and iGPUs with PyTorch and TensorFlow, please check this out: https://github.com/rahulunair/stable_diffusion_arc or my blog: https://blog.rahul.onl/posts/2022-09-06-arc-dgpu-stable-diffusion.html

For context, oneAPI is already part of PyTorch and TensorFlow as oneDNN, which is a oneAPI library is the default accelerator for CPUs that both the frameworks uses. And Intel extensions for PyTorch (ipex) are kernels that support further optimizations and Intel GPU backend. Eventually most of the code from ipex would be merged into mainline PyTorch.

uxdesignerhector · 2023-01-13T20:12:00Z

For anyone looking for working code for Stable Diffusion on Intel dGPUs (Arc Alchemist ) and iGPUs with PyTorch and TensorFlow, please check this out: https://github.com/rahulunair/stable_diffusion_arc or my blog: https://blog.rahul.onl/posts/2022-09-06-arc-dgpu-stable-diffusion.html

For context, oneAPI is already part of PyTorch and TensorFlow as oneDNN, which is a oneAPI library is the default accelerator for CPUs that both the frameworks uses. And Intel extensions for PyTorch (ipex) are kernels that support further optimizations and Intel GPU backend. Eventually most of the code from ipex would be merged into mainline PyTorch.

Thank you for your clarification.

jbaboval · 2023-01-17T18:18:24Z

#4690

jbaboval · 2023-01-17T18:18:58Z

I'm going to take a stab at putting together a PR for this.....

jbaboval · 2023-01-18T02:07:57Z

Unfortunately it's more than a few lines of code. And getting the intel libraries and drivers setup isn't well integrated with distributions.

This is a work in progress, but it shows signs of life:
https://github.com/jbaboval/stable-diffusion-webui/tree/oneapi

jbaboval · 2023-01-21T01:48:02Z

I'm still having some issues. One is seeding. I can't get reproducible output. I thought it might be the seeding in pytorch_lightning, but at this point I have implemented full support in pytorch_lightning and instrumented the seeding code there - it never gets called. All the seeding happens in sd-webui. I've also instrumented sd-webui to validate repeatability of noise and subnoise, and it's fully repeatable. Not sure what gives yet.

The other issue is that batches always have junk for the second image.

On the plus side, it's really fast. Especially compared to my old GTX1660.

jbaboval · 2023-01-22T16:32:12Z

intel/intel-extension-for-pytorch#252

Vidyut · 2023-01-22T17:23:22Z

I'm not a coder. I can't even begin to figure this out, but I'd happy to test if you've uploaded what you have to github.

jbaboval · 2023-01-22T21:12:27Z

It's linked above. I made some notes in ArcNotes.txt that might help get you set up.

jbaboval · 2023-01-22T21:51:33Z

If you're going to try the branch above:

It might not work without my pytorch_lightning branch. I think it will, but if not let me know. I can test later and fix it.
Turn up the batch size
Pass --use-intel-oneapi to launch.py
Pass --config configs/v1-inference-xpu.yaml to launch.py

Vidyut · 2023-01-23T10:40:55Z

Saw your comment just now and tried it.

I had everything installed and the preparation was fine as per your test.

--use-intel-oneapi wasn't recognised. So I probably did something wrong.

The command to make the Intel version of python a system default is problematic and I almost broke other python stuff going on. Better to use the setup vars in a launcher for use only for this or add to .bashrc (and comment out when not needed...).

Something like a small script:
#!/bin/bash
#That way you can comment out code options to comment or uncomment quickly to test also (and if you're like me, not forget commands)
. /opt/intel/oneapi/setvars.sh
TORCH_COMMAND='pip install torch torchvision' python launch.py --medvram --precision full --no-half --skip-torch-cuda-test

That said:

Stable diffusion started without trouble, loaded webpage. The code didn't break.
Takes a long time to draw a single image
But I'm not sure it is using the XPU
Will need more investigation and tweaking. Work in progress. Will update.

For reference (the conspicuous lack of xpu, etc words suggests I missed a trick somewhere)

`:: oneAPI environment initialized ::

Python 3.9.15 (main, Nov 11 2022, 13:58:57)
[GCC 11.2.0]
Commit hash: 3a0d6b7
Installing requirements for Web UI
Launching Web UI with arguments: --medvram --precision full --no-half --ckpt /home/[stuff]/TEST/stable-diffusion-webui/models/Stable-diffusion/sd-v1-4.ckpt
No module 'xformers'. Proceeding without it.
Warning: caught exception 'Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx', memory monitor disabled
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading weights [fe4efff1e1] from /home/[stuff]/TEST/stable-diffusion-webui/models/Stable-diffusion/sd-v1-4.ckpt
Applying cross attention optimization (InvokeAI).
Textual inversion embeddings loaded(0):
Model loaded in 119.4s (1.8s create model, 109.9s load weights).
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
100%|███████████████████████████████████████████| 20/20 [09:44<00:00, 29.21s/it]
Total progress: 100%|███████████████████████████| 20/20 [09:18<00:00, 32.14s/it]
`
Will be able to spend time properly on this in a day or so and update any enlightenment that follows.

Vidyut · 2023-01-23T11:14:47Z

Arrrgh. Never mind. The torch version was wrong (I accidentally installed it in the regular python, so the script installed the regular torch in intel's python...). Now sorted. And now I have problems with Intel's torch and torchvision playing nice with each other... trying with intel's torch and regular torchvision. Sigh.

Update: Fails with Intel's torchvision, but works with Intel's torch and regular torchvision. But still takes too long. Probably because I can't convince it to use the parameters you said to pass. The xpu test returns true, so requirements are installed. But I don't think it is using the xpu still.

This is currently slower than untampered CPU.
100%|███████████████████████████████████████████| 20/20 [10:11<00:00, 30.56s/it]
Total progress: 100%|███████████████████████████| 20/20 [10:27<00:00, 31.38s/it]

jbaboval · 2023-01-23T13:57:53Z

Can you check what branch of the fork you're on? It should be oneapi.

If it's not recognizing the command line option, it's definitely not running the right code.

Vidyut · 2023-01-23T16:12:18Z

Okay, you were right. It was the wrong branch. facepalm. And I downloaded the zip and I still think it is the master. Not sure how to get the oneapi branch (I'm a champion copy-paster, but don't actually know a lot). Figuring it out.

jbaboval · 2023-01-23T17:33:59Z

I'm not sure how you get the branch with the zip download. I just grabbed the zip and it doesn't include the .git directory.

Try git clone -b oneapi https://github.com/jbaboval/stable-diffusion-webui.git

Nathan-dm · 2023-01-24T08:22:17Z

Unfortunately it's more than a few lines of code. And getting the intel libraries and drivers setup isn't well integrated with distributions.

This is a work in progress, but it shows signs of life: https://github.com/jbaboval/stable-diffusion-webui/tree/oneapi

will it work using intel igpu?

Vidyut · 2023-01-24T11:34:17Z

Try git clone -b oneapi https://github.com/jbaboval/stable-diffusion-webui.git

I'm fairly certain I have the right branch now. It has the Arcnotes.txt - so what am I doing wrong?

Launching Web UI with arguments: --medvram --precision full --no-half --ckpt /home/vidyut/AI/TEST/stable-diffusion-webui/models/Stable-diffusion/sd-v1-4.ckpt --config configs/v1-inference-xpu.yaml
/home/vidyut/.local/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
'NoneType' object has no attribute 'enable_tf32': str
Traceback (most recent call last):
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/errors.py", line 29, in run
code()
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/accelerator.py", line 58, in enable_tf32
return impl.enable_tf32()
AttributeError: 'NoneType' object has no attribute 'enable_tf32'

2023-01-24 16:59:58,437 - torch.distributed.nn.jit.instantiator - INFO - Created a temporary directory at /tmp/tmpsg3pgt5w
2023-01-24 16:59:58,438 - torch.distributed.nn.jit.instantiator - INFO - Writing /tmp/tmpsg3pgt5w/_remote_module_non_scriptable.py
2023-01-24 16:59:58,634 - root - WARNING - Pytorch pre-release version 1.13.0a0+gitb1dde16 - assuming intent to test it
2023-01-24 16:59:58,648 - root - WARNING - Pytorch pre-release version 1.13.0a0+gitb1dde16 - assuming intent to test it
No module 'xformers'. Proceeding without it.
Traceback (most recent call last):
File "/home/vidyut/AI/TEST/stable-diffusion-webui/launch.py", line 315, in
start()
File "/home/vidyut/AI/TEST/stable-diffusion-webui/launch.py", line 306, in start
import webui
File "/home/vidyut/AI/TEST/stable-diffusion-webui/webui.py", line 13, in
from modules.call_queue import wrap_queued_call, queue_lock, wrap_gradio_gpu_call
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/call_queue.py", line 7, in
from modules import shared
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/shared.py", line 131, in
devices.device, devices.device_interrogate, devices.device_gfpgan, devices.device_esrgan, devices.device_codeformer =
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/shared.py", line 132, in
(devices.cpu if any(y in cmd_opts.use_cpu for y in [x, 'all']) else devices.get_optimal_device() for x in ['sd', 'interrogate', 'gfpgan', 'esrgan', 'codeformer'])
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/devices.py", line 29, in get_optimal_device
accelerator_device = accelerator.get_device()
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/accelerator.py", line 25, in get_device
return impl.get_device()
AttributeError: 'NoneType' object has no attribute 'get_device'

Vidyut · 2023-01-24T12:11:47Z

Reinstalled everything. Different error.

Launching Web UI with arguments: --medvram --precision full --no-half --ckpt /home/vidyut/AI/TEST/stable-diffusion-webui/models/Stable-diffusion/sd-v1-4.ckpt --config configs/v1-inference-xpu.yaml
'NoneType' object has no attribute 'enable_tf32': str
Traceback (most recent call last):
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/errors.py", line 29, in run
code()
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/accelerator.py", line 58, in enable_tf32
return impl.enable_tf32()
AttributeError: 'NoneType' object has no attribute 'enable_tf32'

2023-01-24 17:38:30,868 - torch.distributed.nn.jit.instantiator - INFO - Created a temporary directory at /tmp/tmpnapm8vcl
2023-01-24 17:38:30,869 - torch.distributed.nn.jit.instantiator - INFO - Writing /tmp/tmpnapm8vcl/_remote_module_non_scriptable.py
2023-01-24 17:38:31,054 - root - WARNING - Pytorch pre-release version 1.13.0a0+gitb1dde16 - assuming intent to test it
2023-01-24 17:38:31,075 - root - WARNING - Pytorch pre-release version 1.13.0a0+gitb1dde16 - assuming intent to test it
No module 'xformers'. Proceeding without it.
Traceback (most recent call last):
File "/home/vidyut/AI/TEST/stable-diffusion-webui/launch.py", line 315, in
start()
File "/home/vidyut/AI/TEST/stable-diffusion-webui/launch.py", line 306, in start
import webui
File "/home/vidyut/AI/TEST/stable-diffusion-webui/webui.py", line 13, in
from modules.call_queue import wrap_queued_call, queue_lock, wrap_gradio_gpu_call
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/call_queue.py", line 7, in
from modules import shared
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/shared.py", line 131, in
devices.device, devices.device_interrogate, devices.device_gfpgan, devices.device_esrgan, devices.device_codeformer =
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/shared.py", line 132, in
(devices.cpu if any(y in cmd_opts.use_cpu for y in [x, 'all']) else devices.get_optimal_device() for x in ['sd', 'interrogate', 'gfpgan', 'esrgan', 'codeformer'])
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/devices.py", line 29, in get_optimal_device
accelerator_device = accelerator.get_device()
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/accelerator.py", line 25, in get_device
return impl.get_device()
AttributeError: 'NoneType' object has no attribute 'get_device'

At this point I'm not sure this is within my ability.

jbaboval · 2023-01-24T13:06:08Z

It should be telling you right at the beginning that it's using OneAPI:

Launching Web UI with arguments: --config configs/v1-inference-xpu.yaml --listen
OneAPI is available
2023-01-24 08:03:28,418 - torch.distributed.nn.jit.instantiator - INFO - Created a temporary directory at /tmp/tmpsh22b93t
2023-01-24 08:03:28,419 - torch.distributed.nn.jit.instantiator - INFO - Writing /tmp/tmpsh22b93t/_remote_module_non_scriptable.py
2023-01-24 08:03:28,468 - root - WARNING - Pytorch pre-release version 1.13.0a0+gitb1dde16 - assuming intent to test it
2023-01-24 08:03:28,501 - root - WARNING - Pytorch pre-release version 1.13.0a0+gitb1dde16 - assuming intent to test it
No module 'xformers'. Proceeding without it.
Device is xpu

However it shouldn't crash out with an exception if it's not working. I'll have to fix that.

In the meantime you'll have to figure out how to get your OneAPI environment working before I can help with the webui. There's a section in the notes about how to validate

> python3
Python 3.9.15 (main, Nov 11 2022, 13:58:57) 
[GCC 11.2.0] :: Intel Corporation on linux
Type "help", "copyright", "credits" or "license" for more information.
Intel(R) Distribution for Python is brought to you by Intel Corporation.
Please check out: https://software.intel.com/en-us/python-distribution
>>> import torch
>>> import intel_extension_for_pytorch
[W OperatorEntry.cpp:150] Warning: Overriding a previously registered kernel for the same operator and the same dispatch key
  operator: torchvision::nms
    no debug info
  dispatch key: CPU
  previous kernel: registered at /build/intel-pytorch-extension/csrc/cpu/aten/TorchVisionNms.cpp:47
       new kernel: registered at /opt/workspace/vision/torchvision/csrc/ops/cpu/nms_kernel.cpp:112 (function registerKernel)
>>> torch.xpu.is_available()
True

jbaboval · 2023-01-24T13:17:51Z

There's a new branch: rebase. It has a fix for the above exception (your GPU still won't work if you don't get the "OneAPI is Available" message). And it includes the latest upstream changes

Vidyut · 2023-01-24T14:25:11Z

Contents of test.sh

#!/bin/bash
. /opt/intel/oneapi/setvars.sh
sycl-ls
pip list|grep torch
python -c 'import torch; import intel_extension_for_pytorch; print(torch.xpu.is_available())'

Result:

vidyut@saaki:~/AI/TEST/stable-diffusion-webui$ sh test.sh

:: initializing oneAPI environment ...
test.sh: SH_VERSION = unknown
args: Using "$@" for setvars.sh arguments:
:: advisor -- latest
:: ccl -- latest
:: compiler -- latest
:: dal -- latest
:: debugger -- latest
:: dev-utilities -- latest
:: dnnl -- latest
:: dpcpp-ct -- latest
:: dpl -- latest
:: embree -- latest
:: inspector -- latest
:: intelpython -- latest
:: ipp -- latest
:: ippcp -- latest
:: ipp -- latest
:: ispc -- latest
:: mkl -- latest
:: modelzoo -- latest
:: modin -- latest
:: mpi -- latest
:: neural-compressor -- latest
:: oidn -- latest
:: openvkl -- latest
:: ospray -- latest
:: ospray_studio -- latest
:: pytorch -- latest
:: rkcommon -- latest
:: rkutil -- latest
:: tbb -- latest
:: tensorflow -- latest
:: vpl -- latest
:: vtune -- latest
:: oneAPI environment initialized ::

[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device 1.2 [2022.15.12.0.01_081451]
[opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz 3.0 [2022.15.12.0.01_081451]
[opencl:gpu:2] Intel(R) OpenCL HD Graphics, Intel(R) HD Graphics 520 [0x1916] 3.0 [22.43.24595.35]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) HD Graphics 520 [0x1916] 1.3 [1.3.24595]
intel-extension-for-pytorch 1.13.10+xpu
open-clip-torch 2.7.0
pytorch-lightning 1.7.6
torch 1.13.0a0+gitb1dde16
torchdiffeq 0.2.3
torchmetrics 0.11.0
torchsde 0.2.5
torchvision 0.14.1a0+0504df5
[W OperatorEntry.cpp:150] Warning: Overriding a previously registered kernel for the same operator and the same dispatch key
operator: torchvision::nms
no debug info
dispatch key: CPU
previous kernel: registered at /build/intel-pytorch-extension/csrc/cpu/aten/TorchVisionNms.cpp:47
new kernel: registered at /opt/workspace/vision/torchvision/csrc/ops/cpu/nms_kernel.cpp:112 (function registerKernel)
True

Are you passing the original suggested arguments to launch.py? Because I am, but they aren't showing in your example. Maybe that's the issue? Update: Not working without them and just passing the two you said either. The "OneAPI is available" message doesn't show.

I'm able to set up the environment as far as I can tell, but I can't get the code to run. Maybe there's a missing dependency...

Vidyut · 2023-01-24T14:33:06Z

I've got other work I'm doing now. Will test more when I get time.

jbaboval · 2023-01-25T01:23:24Z

Can you try running source ./venv/bin/activate in your webui tree to activate the virtual environment and then run your test script again?

I think that the problem now might be a difference between your system python environment and the venv environment.

Vidyut · 2023-01-25T05:44:34Z

nope :(

jbaboval · 2023-01-25T13:46:10Z

Sorry I couldn't get you working. I'm going to try and tidy this stuff up and submit it back. So hopefully you'll have better luck when it's properly integrated.

Nathan-dm · 2023-01-26T12:35:11Z

could you provide installation tutorial on windows os? i would like to try it on my laptop, coz im sick of waiting my cpu to generate images. my laptop spec is i5 1135g7,16gb ddr4 ram, intel xe graphic (80cu),and intel xe max graphic (DG1)

jbaboval · 2023-02-26T17:24:56Z

There's still a lot of stuff broken, but at this point it's hard to tell the difference between bugs here, and driver issues, and pytorch extension issues. I'm also unsure that my SD2.1 fix is the right fix, though it works. I wish I had a CUDA system along side my A770 to compare. Is batch matrix multiplication on cuda just more automatically adaptable? Because otherwise I don't understand what is OneAPI specific about the change I made, or why it works elsewhere without it.

I agree about the messy setup, however it's also pretty typical for Intel's experimental GPU work. It's a mess in early days, but once everything matures they get it all upstream and things are easier. By then, though, somebody else did all the fun hacking.

genewitch · 2023-03-01T05:32:50Z

@jbaboval
i guess a git pull in my directory with your repo will get me up to date?

i have half a mind to wipe all drives and start from scratch. If you want to have access to a decent platform message me. I'm this username most places, including gmail.

edit: i have a system with only cuda - 3060 and 3070. I also have a system with a 1050ti and an arc 770 or whatever. the $350 one.

genewitch · 2023-03-01T05:37:07Z

edit: after updating the intel drivers the first 2 images of any batch size (including 1 and 2) are blank or garbage. Good times.

Yes, this seems to have broken with the 1.13 extensions release and the corresponding driver. (Though I only get image 2 in batches as corrupt, not 1 & 2).

is there a way to roll back packages in ubuntu? i actually had a decently working system and decided to update the graphics drivers. (i use gentoo, and i know how to do this there.)

I'll google it if i don't have a reply by tomorrow.

jbaboval · 2023-03-04T00:06:57Z

@jbaboval i guess a git pull in my directory with your repo will get me up to date?

Yeah, but make sure you're on the oneapi branch. The master branch is the same as upstream.

If you want to run it on your cuda system and tell me what I broke I'll fix that up too. I'll need to make it work everywhere if it's ever going to get merged.

genewitch · 2023-03-04T01:40:33Z

unfortunately i returned the A770. The performance was fine, 7.12it/s after the latest intel drivers; but it just started returning garbled, kaleidoscope, paint-spray looking images, and nothing but. Not even a hint that it was doing stable diffusion things.

I told the retailer (and intel, as this was an intel branded card) that i suspected a memory issue. I'm searching for a replacement as we speak, and i still have your oneapi branch checked out on the drive, so when something comes in i'll test again.

heopfully my "how to install" guide above comes in handy, and someone else has a better experience than i did.

jbaboval · 2023-03-04T16:55:04Z

Probably is a memory issue, but I think driver, not hardware.

I'm hoping this means progress soon: intel/intel-extension-for-pytorch#302 (comment)

neggles · 2023-03-05T07:33:48Z

FWIW, I have a number of nVidia systems available, a ROCm system, and now an A770 16GB (though the system for that is currently lashed together on a bench, it does work, so whatever). Will be attempting to have another go at it sometime in the next day-ish.

Glubb · 2023-03-08T19:52:42Z

update, it works on gentoo using kernel 6.2.2, so we don't have to use ubuntu with intel's kernel for those on other operating systems
also I have the same problems as genewitch so I can confirm the hardware is fine it must be the memory issue
also I noticed at least with my configuration and images I'm producing, blitter has gone down from 70%-100% to 30%-55% compared to the original ubuntu 22.04 install, so something has improved somewhere, but this is with --no-half --no-half-vae, without these arguments it's not stable

neggles · 2023-03-09T12:52:09Z

It looks like something is just straight broken in the intel pytorch extension - I can't get the damn thing to build, even following their instructions, using their python distribution and build from conda, or using the script in the repo that it looks they used to build the release artifacts. And based on the comments on some of the issues in the repo (see the one @jbaboval linked above) it seems like they know there are Problems and they're working on it.

Think I've bashed my head against this wall enough for a little while... going to wait for another release from Intel. But yes, it does work on latest mainline - I'm on 6.3.0-rc1 on fedora 38 prelease atm, but in an ubuntu 22.04 docker container - just... for a given value of "work". With no-half and no-half-vae i get all kinds of fun casting problems, fp32 stuff seems to just eat itself... something is broken here :(

genewitch · 2023-03-09T21:50:27Z

It looks like something is just straight broken in the intel pytorch extension -
...
With no-half and no-half-vae i get all kinds of fun casting problems, fp32 stuff seems to just eat itself... something is broken here :(

i think it's actually the latest intel driver, because i had SD working "fine" before i tried to do a driver update. By "fine" i mean it would only garble 1/12 or 1/15 images, usually because "a tensor has produced all NaNs, try no-half/no-half-vae" - but setting those flags made no difference. So i updated, and now it was 10/10 images garbled, blank, or kaleidoscope.

I also couldn't get intel pytorch to build, but i did get the intel tensorflow to build. I think their automated build system is missing some configuration file, because i was getting errors about "missing prerequisites" - that's the whole point of a build system, isn't it? >.<

genewitch · 2023-03-13T06:32:06Z

for the record:

paid for, received and installed a 12gig 3060, went into my ssd mountpoint and did a git clone automatic1111 or whatever, and i synced the embeddings and models folders, and everything just works.

intel arc is just broken.

neggles · 2023-03-19T09:51:22Z

update: have managed to build intel's patched torch and the IPEX extension from source, with much pain. It still can't actually run a generation - something goes screwy in the scheduler and it hard-locks the GPU - but I suspect half of the problem there is that I've been building on GCC 13 which changed a whole bunch of stuff and throws errors all over the place because of missing headers.

Will be making another attempt with some older GCC versions (probably just admit defeat and use Ubuntu 22.04) at some point soonish, or possibly just waiting for intel to give us some slightly newer versions.

jbaboval · 2023-03-23T18:51:40Z

New IPEX release today, but it's CPU only again. I wonder if they can't find the bug?

whchan05 · 2023-04-04T08:38:59Z

Came across SD2.0 OpenVINO implementation by Intel. Any chance of it being integrated in Windows version of Web UI?

jbaboval · 2023-04-29T21:00:16Z

New IPEX release for xpu today! No wheels though... have to wait for it to build from source.

jbaboval · 2023-05-07T22:32:06Z

Intel finally published wheels.

It looks like the new version fixes the major issues. It also introduces some new ones. I've worked around enough to get SD1.5 working and pushed it (with updated instructions) to my fork.

I'll try to rebase closer to AUTOMATIC1111's tip soon, but since this project has moved on to torch 2.0 and the IPEX repo is still on 1.13.x, there will be yet more waiting for releases...

Nathan-dm · 2023-05-10T09:10:50Z

Intel finally published wheels.

It looks like the new version fixes the major issues. It also introduces some new ones. I've worked around enough to get SD1.5 working and pushed it (with updated instructions) to my fork.

I'll try to rebase closer to AUTOMATIC1111's tip soon, but since this project has moved on to torch 2.0 and the IPEX repo is still on 1.13.x, there will be yet more waiting for releases...

good news, but IPEX doesn't support my gpu yet (Intel xe graphics), im still waiting intel to give support to they IGPU lineup/pre ARC gpu

RigoLigoRLC · 2023-05-30T21:52:02Z

@jbaboval Your branch has no issue section so I had to try my luck here. Installed the oneAPI kit as in ArcNotes.txt, verified xpu is available. But I would keep getting "OpenCL error -6". I'm not sure where things has gone wrong but the presence of OpenCL seems sussy.

Logs

rigoligo@RIGO-DESKTOP:~/gitcode/stable-diffusion-webui$ python3 ./launch.py --use-intel-oneapi
Python 3.9.16 (main, Feb 22 2023, 01:57:33)
[GCC 11.2.0]
Commit hash: 2b316c206c84221b94e67456c3811f4df3f699e9
Installing requirements
Launching Web UI with arguments: --use-intel-oneapi
/home/rigoligo/.local/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension:
  warn(f"Failed to load image Python extension: {e}")
OneAPI is available
Device is xpu
No module 'xformers'. Proceeding without it.
...
2023-05-31 05:46:51,081 - httpx - INFO - HTTP Request: POST http://127.0.0.1:7860/api/predict "HTTP/1.1 200 OK"
2023-05-31 05:46:51,083 - httpx - INFO - HTTP Request: POST http://127.0.0.1:7860/reset "HTTP/1.1 200 OK"
Error completing request
Arguments: ('task(cai83ytctr6u3cn)', '', '', [], 20, 0, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, False, 'positive', 'comma', 0, False, False, '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0) {}
Traceback (most recent call last):
  File "/home/rigoligo/gitcode/stable-diffusion-webui/modules/call_queue.py", line 57, in f
    res = list(func(*args, **kwargs))
  File "/home/rigoligo/gitcode/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/home/rigoligo/gitcode/stable-diffusion-webui/modules/txt2img.py", line 56, in txt2img
    processed = process_images(p)
  File "/home/rigoligo/gitcode/stable-diffusion-webui/modules/processing.py", line 517, in process_images
    res = process_images_inner(p)
  File "/home/rigoligo/gitcode/stable-diffusion-webui/modules/processing.py", line 660, in process_images_inner
    uc = get_conds_with_caching(prompt_parser.get_learned_conditioning, negative_prompts, p.steps * step_multiplier, cached_uc)
  File "/home/rigoligo/gitcode/stable-diffusion-webui/modules/processing.py", line 599, in get_conds_with_caching
    cache[1] = function(shared.sd_model, required_prompts, steps)
  File "/home/rigoligo/gitcode/stable-diffusion-webui/modules/prompt_parser.py", line 140, in get_learned_conditioning
    conds = model.get_learned_conditioning(texts)
  File "/home/rigoligo/gitcode/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 669, in get_learned_conditioning
    c = self.cond_stage_model(c)
  File "/home/rigoligo/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/rigoligo/gitcode/stable-diffusion-webui/modules/sd_hijack_clip.py", line 229, in forward
    z = self.process_tokens(tokens, multipliers)
  File "/home/rigoligo/gitcode/stable-diffusion-webui/modules/sd_hijack_clip.py", line 254, in process_tokens
    z = self.encode_with_transformers(tokens)
  File "/home/rigoligo/gitcode/stable-diffusion-webui/modules/sd_hijack_clip.py", line 302, in encode_with_transformers
    outputs = self.wrapped.transformer(input_ids=tokens, output_hidden_states=-opts.CLIP_stop_at_last_layers)
  File "/home/rigoligo/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/rigoligo/.local/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py", line 811, in forward
    return self.text_model(
  File "/home/rigoligo/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/rigoligo/.local/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py", line 721, in forward
    encoder_outputs = self.encoder(
  File "/home/rigoligo/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/rigoligo/.local/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py", line 650, in forward
    layer_outputs = encoder_layer(
  File "/home/rigoligo/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/rigoligo/.local/lib/python3.9/site-packages/transformers/models/clip/modeling_clip.py", line 378, in forward
    hidden_states = self.layer_norm1(hidden_states)
  File "/home/rigoligo/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/rigoligo/.local/lib/python3.9/site-packages/torch/nn/modules/normalization.py", line 190, in forward
    return F.layer_norm(
  File "/home/rigoligo/.local/lib/python3.9/site-packages/torch/nn/functional.py", line 2515, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: An OpenCL error occurred: -6

eddiesmithgit · 2023-06-28T07:44:09Z

I know this might be a dumb question, but just to save me some trouble (if someone knows this might not work for sure), will this work on Mac OS 10.15, I am using :
MacBook Pro (Retina, 15-inch, Mid 2015)
Processor : 2.2 GHz Quad-Core Intel Core i7 (it's 4th gen by the way)
Graphics : Intel Iris Pro 1536 MB
Memory : 16 GB 1600 MHz DDR3

RigoLigoRLC · 2023-06-28T20:54:00Z

You should look for a Metal solution. But if you're not on latest OS, necessary Metal APIs may be missing. eddiesmithgit ***@***.***> 于 2023年6月28日周三下午3:44写道：

…

I know this might be a dumb question, but just to save me some trouble (if someone knows this might not work for sure), will this work on Mac OS 10.15, I am using : MacBook Pro (Retina, 15-inch, Mid 2015) Processor : 2.2 GHz Quad-Core Intel Core i7 (it's 4th gen by the way) Graphics : Intel Iris Pro 1536 MB Memory : 16 GB 1600 MHz DDR3 — Reply to this email directly, view it on GitHub <#6417 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ALYTVUTDHQRRS2F7VCIICGLXNPOFNANCNFSM6AAAAAATS2Z5SY> . You are receiving this because you commented.Message ID: ***@***.***>

eddiesmithgit · 2023-06-29T07:06:56Z

You should look for a Metal solution. But if you're not on latest OS,

necessary Metal APIs may be missing.

eddiesmithgit @.***> 于 2023年6月28日周三下午3:44写道：

I know this might be a dumb question, but just to save me some trouble (if

someone knows this might not work for sure), will this work on Mac OS

10.15, I am using :

MacBook Pro (Retina, 15-inch, Mid 2015)

Processor : 2.2 GHz Quad-Core Intel Core i7 (it's 4th gen by the way)

Graphics : Intel Iris Pro 1536 MB

Memory : 16 GB 1600 MHz DDR3

—

Reply to this email directly, view it on GitHub

#6417 (comment),

or unsubscribe

https://github.com/notifications/unsubscribe-auth/ALYTVUTDHQRRS2F7VCIICGLXNPOFNANCNFSM6AAAAAATS2Z5SY

.

You are receiving this because you commented.Message ID:

@.***>

Thanks for pointing me in the right direction, so I did some digging in web and found this repo , hopefully this should work I suppose

https://github.com/soten355/MetalDiffusion

mindplay-dk · 2023-07-31T20:38:03Z

Is Intel support coming?

(I gave the Vlad fork a go - they claim they've got Intel support, but I couldn't make it work...)

Nuullll · 2023-08-01T06:10:06Z

Is Intel support coming?

(I gave the Vlad fork a go - they claim they've got Intel support, but I couldn't make it work...)

If it was a problem of environment setup, you could give my docker image a try: https://github.com/Nuullll/ipex-sd-docker-for-arc-gpu

mindplay-dk · 2023-08-18T05:28:42Z

Going to try this today, I guess 🙂

https://github.com/openvinotoolkit/stable-diffusion-webui/wiki/Installation-on-Intel-Silicon

Nuullll · 2024-01-06T11:22:43Z

FYI. IPEX is supported since 1.7.0: #14171

uxdesignerhector · 2024-01-20T18:12:05Z

FYI. IPEX is supported since 1.7.0: #14171

I can confirm it is working in Windows! it is really fast! to make use of it you must append --use-ipex to COMMANDLINE_ARGS in webui-user.bat or if you are ussing Stability Matrix add it to extra launch arguments.

Also make sure to follow these steps #14171 (comment) if you are using an old installation (you don't need to be on dev branch as it was merged already in 1.7.0)

I needed to disable my iGPU (UHD, Iris) in Device Manager and delete my old venv folder and after that launch Stable Diffusion WebUI with the extra launch argument --use-ipex

@Vidyut

mezotaken added the enhancement New feature or request label Jan 12, 2023

kwaa mentioned this issue Apr 4, 2023

[Feature Request] Support Intel Extension for PyTorch (IPEX) comfyanonymous/ComfyUI#387

Closed

akx added the platform:mac Issues that apply to Apple OS X, M1, M2, etc label Jun 13, 2023

mindplay-dk mentioned this issue Aug 4, 2023

didn't work Nuullll/ipex-sd-docker-for-arc-gpu#13

Open

Nuullll mentioned this issue Dec 2, 2023

Initial IPEX support for Intel Arc GPU #14171

Merged

4 tasks

[Feature Request]: Support for Intel Oneapi/Vulkan versions of pytorch as well #6417

[Feature Request]: Support for Intel Oneapi/Vulkan versions of pytorch as well #6417

Comments

Vidyut commented Jan 6, 2023 • edited Loading

Is there an existing issue for this?

What would your feature do ?

Proposed workflow

Additional information

uxdesignerhector commented Jan 6, 2023

Vidyut commented Jan 6, 2023 • edited Loading

Vidyut commented Jan 9, 2023

rahulunair commented Jan 12, 2023 • edited Loading

uxdesignerhector commented Jan 13, 2023

jbaboval commented Jan 17, 2023

jbaboval commented Jan 17, 2023

jbaboval commented Jan 18, 2023

jbaboval commented Jan 21, 2023

jbaboval commented Jan 22, 2023

Vidyut commented Jan 22, 2023

jbaboval commented Jan 22, 2023

jbaboval commented Jan 22, 2023 • edited Loading

Vidyut commented Jan 23, 2023 • edited Loading

Vidyut commented Jan 23, 2023 • edited Loading

jbaboval commented Jan 23, 2023

Vidyut commented Jan 23, 2023

jbaboval commented Jan 23, 2023

Nathan-dm commented Jan 24, 2023

Vidyut commented Jan 24, 2023 • edited Loading

Vidyut commented Jan 24, 2023

jbaboval commented Jan 24, 2023

jbaboval commented Jan 24, 2023

Vidyut commented Jan 24, 2023 • edited Loading

Vidyut commented Jan 24, 2023

jbaboval commented Jan 25, 2023

Vidyut commented Jan 25, 2023

jbaboval commented Jan 25, 2023

Nathan-dm commented Jan 26, 2023

jbaboval commented Feb 26, 2023

genewitch commented Mar 1, 2023 • edited Loading

genewitch commented Mar 1, 2023

jbaboval commented Mar 4, 2023

genewitch commented Mar 4, 2023

jbaboval commented Mar 4, 2023

neggles commented Mar 5, 2023

Glubb commented Mar 8, 2023 • edited Loading

neggles commented Mar 9, 2023

genewitch commented Mar 9, 2023 • edited Loading

genewitch commented Mar 13, 2023

neggles commented Mar 19, 2023

jbaboval commented Mar 23, 2023

whchan05 commented Apr 4, 2023

jbaboval commented Apr 29, 2023

jbaboval commented May 7, 2023

Nathan-dm commented May 10, 2023

RigoLigoRLC commented May 30, 2023 • edited Loading

eddiesmithgit commented Jun 28, 2023

RigoLigoRLC commented Jun 28, 2023 via email

eddiesmithgit commented Jun 29, 2023

mindplay-dk commented Jul 31, 2023

Nuullll commented Aug 1, 2023

mindplay-dk commented Aug 18, 2023

Nuullll commented Jan 6, 2024 • edited Loading

uxdesignerhector commented Jan 20, 2024 • edited Loading

Vidyut commented Jan 6, 2023 •

edited

Loading

Vidyut commented Jan 6, 2023 •

edited

Loading

rahulunair commented Jan 12, 2023 •

edited

Loading

jbaboval commented Jan 22, 2023 •

edited

Loading

Vidyut commented Jan 23, 2023 •

edited

Loading

Vidyut commented Jan 23, 2023 •

edited

Loading

Vidyut commented Jan 24, 2023 •

edited

Loading

Vidyut commented Jan 24, 2023 •

edited

Loading

genewitch commented Mar 1, 2023 •

edited

Loading

Glubb commented Mar 8, 2023 •

edited

Loading

genewitch commented Mar 9, 2023 •

edited

Loading

RigoLigoRLC commented May 30, 2023 •

edited

Loading

Nuullll commented Jan 6, 2024 •

edited

Loading

uxdesignerhector commented Jan 20, 2024 •

edited

Loading