[Feature Request]: Support for Intel Oneapi/Vulkan versions of pytorch as well #6417
Yes, it would be nice to squeeze those 16GB from the Intel Arc A770. It seems the problem resides in PyTorch itself: pytorch/pytorch#30029. PyTorch will need to support oneAPI. It does seem possible to run PyTorch on Intel GPUs through an extension, though, as described at https://github.com/intel/intel-extension-for-pytorch/tree/xpu-master. This Reddit thread has useful information about the ways Intel could approach the Stable Diffusion/PyTorch problem: https://www.reddit.com/r/intel/comments/xvbmif/will_intel_arc_support_stable_diffusion/
PyTorch HAS the Intel extension, though unlike ROCm it requires code changes as it stands. It is just a couple of lines - this project already does something similar to integrate MPS, which is why I suggested it here. The extension can also accelerate CPUs, and the unreleased version runs on older GPUs and whatnot, which is great! I wouldn't be surprised if such an integration made this version of Stable Diffusion the staple implementation. Stable Diffusion also runs on TensorFlow, I think, which supports oneAPI - so this is less an Intel issue and more one for those who love this project, with its well-designed implementation, but would rather not wait ages while their hardware twiddles its thumbs. Almost nothing (that wouldn't crash at the task) would be left out, since it would also automatically support OpenCL, I think. Not to mention I am fed up with these elitist projects refusing to recognise anything non-CUDA as a GPU (this includes Intel's openvino-gpu runtime, which is basically for CUDA/ROCm!). This repository, with its inclusion of everything it can lay its hands on, is literally the only reason I bother with PyTorch. (That said, I'm not a coder, so it's not like I'm using all sorts of other technologies.) V
The Intel extension for GPU now supports PyTorch 1.13.10: https://github.com/intel/intel-extension-for-pytorch/releases/tag/v1.13.10%2Bxpu
For anyone looking for working code for Stable Diffusion on Intel dGPUs (Arc Alchemist) and iGPUs with PyTorch and TensorFlow, please check this out: https://github.com/rahulunair/stable_diffusion_arc or my blog: https://blog.rahul.onl/posts/2022-09-06-arc-dgpu-stable-diffusion.html For context, oneAPI is already part of PyTorch and TensorFlow as oneDNN; oneDNN is a oneAPI library and the default CPU accelerator that both frameworks use. The Intel extension for PyTorch (ipex) provides kernels that support further optimizations and an Intel GPU backend. Eventually most of the code from ipex should be merged into mainline PyTorch.
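The "couple of lines" of code changes mentioned above can be sketched roughly like this (a hedged sketch, not this project's or ipex's actual code; `to_intel_gpu` is a hypothetical helper name, and it falls back to a no-op when the extension isn't installed):

```python
def to_intel_gpu(model, optimizer=None):
    """Sketch of the ipex integration pattern described above.

    Hypothetical helper, not webui code. Returns the inputs unchanged when
    the extension is not installed, so the rest of the pipeline keeps
    working on CPU/CUDA.
    """
    try:
        # Importing ipex registers the "xpu" device type with torch.
        import intel_extension_for_pytorch as ipex
    except ImportError:
        return model, optimizer
    model = model.to("xpu")
    if optimizer is None:
        return ipex.optimize(model), None
    # With an optimizer, ipex.optimize returns an (model, optimizer) pair.
    return ipex.optimize(model, optimizer=optimizer)
```

The point of the fallback is that the same call site works on machines with and without an Intel GPU, which is how the MPS integration in this project behaves as well.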
Thank you for your clarification.
I'm going to take a stab at putting together a PR for this...
Unfortunately it's more than a few lines of code. And getting the Intel libraries and drivers set up isn't well integrated with distributions. This is a work in progress, but it shows signs of life:
I'm still having some issues. One is seeding: I can't get reproducible output. I thought it might be the seeding in pytorch_lightning, but at this point I have implemented full support in pytorch_lightning and instrumented the seeding code there - it never gets called. All the seeding happens in sd-webui. I've also instrumented sd-webui to validate repeatability of the noise and subnoise, and it's fully repeatable. Not sure what gives yet. The other issue is that batches always have junk for the second image. On the plus side, it's really fast, especially compared to my old GTX 1660.
I'm not a coder. I can't even begin to figure this out, but I'd be happy to test if you've uploaded what you have to GitHub.
It's linked above. I made some notes in ArcNotes.txt that might help get you set up.
If you're going to try the branch above:
Saw your comment just now and tried it. I had everything installed and the preparation went fine, as per your test, but --use-intel-oneapi wasn't recognised, so I probably did something wrong. The command to make the Intel version of Python the system default is problematic, and I almost broke other Python things with it. Better to set up the environment variables in a launcher used only for this, or add them to .bashrc (and comment them out when not needed...). Something like a small script:

That said:
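A minimal launcher along those lines might look like this (a hypothetical sketch - the original script didn't survive the page formatting; the setvars.sh path is the default oneAPI install location and may differ on your system):

```shell
#!/bin/bash
# Hypothetical launcher sketch: pull in the oneAPI environment for this
# shell only, rather than making Intel's python the system default or
# permanently polluting .bashrc.
ONEAPI_VARS="${ONEAPI_VARS:-/opt/intel/oneapi/setvars.sh}"
if [ -f "$ONEAPI_VARS" ]; then
  . "$ONEAPI_VARS"
else
  echo "warning: $ONEAPI_VARS not found; continuing without oneAPI env" >&2
fi
# Hand off to the webui with whatever arguments were passed in.
exec python launch.py "$@"
```

Because the environment variables are set inside the script's own shell, they vanish when the webui exits and never touch the rest of the system.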
For reference (the conspicuous lack of "xpu" etc. in the output suggests I missed a trick somewhere):

:: oneAPI environment initialized ::
Python 3.9.15 (main, Nov 11 2022, 13:58:57)
To create a public link, set
Arrrgh. Never mind. The torch version was wrong (I accidentally installed it in the regular Python, so the script installed regular torch in Intel's Python...). Now sorted. And now I have problems with Intel's torch and torchvision playing nice with each other... trying Intel's torch with regular torchvision. Sigh.

Update: Fails with Intel's torchvision, but works with Intel's torch and regular torchvision. But it still takes too long, probably because I can't convince it to use the parameters you said to pass. The xpu test returns true, so the requirements are installed, but I don't think it is actually using the xpu. This is currently slower than untampered CPU.
Can you check what branch of the fork you're on? It should be the oneapi branch. If it's not recognizing the command line option, it's definitely not running the right code.
Okay, you were right, it was the wrong branch. facepalm. I downloaded the zip and I still think it is master. Not sure how to get the oneapi branch (I'm a champion copy-paster, but don't actually know a lot). Figuring it out.
I'm not sure how you get the branch with the zip download. I just grabbed the zip and it doesn't include the
Try
Will it work using an Intel iGPU?
I'm fairly certain I have the right branch now. It has the ArcNotes.txt - so what am I doing wrong?

Launching Web UI with arguments: --medvram --precision full --no-half --ckpt /home/vidyut/AI/TEST/stable-diffusion-webui/models/Stable-diffusion/sd-v1-4.ckpt --config configs/v1-inference-xpu.yaml
2023-01-24 16:59:58,437 - torch.distributed.nn.jit.instantiator - INFO - Created a temporary directory at /tmp/tmpsg3pgt5w
Reinstalled everything. Different error.

Launching Web UI with arguments: --medvram --precision full --no-half --ckpt /home/vidyut/AI/TEST/stable-diffusion-webui/models/Stable-diffusion/sd-v1-4.ckpt --config configs/v1-inference-xpu.yaml
2023-01-24 17:38:30,868 - torch.distributed.nn.jit.instantiator - INFO - Created a temporary directory at /tmp/tmpnapm8vcl

At this point I'm not sure this is within my ability.
It should be telling you right at the beginning that it's using OneAPI:
However it shouldn't crash out with an exception if it's not working. I'll have to fix that. In the meantime you'll have to figure out how to get your OneAPI environment working before I can help with the webui. There's a section in the notes about how to validate
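As a rough illustration of that kind of environment validation (a hedged sketch; the actual checks in ArcNotes.txt may differ), something like this confirms whether torch can see an XPU device at all:

```python
def xpu_ready():
    """Return True only if ipex is importable and torch reports an XPU.

    Sketch only; the validation steps in ArcNotes.txt may differ.
    """
    try:
        import torch
        import intel_extension_for_pytorch  # noqa: F401 (registers "xpu")
    except ImportError:
        # torch or the extension is missing: environment isn't set up.
        return False
    xpu = getattr(torch, "xpu", None)
    return xpu is not None and xpu.is_available()

print("xpu ready:", xpu_ready())
```

If this prints False even after sourcing the oneAPI environment, the problem is in the driver/library setup rather than in the webui.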
There's a new branch:
Contents of test.sh:

#!/bin/bash

Result:

vidyut@saaki:~/AI/TEST/stable-diffusion-webui$ sh test.sh
:: initializing oneAPI environment ...
[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device 1.2 [2022.15.12.0.01_081451]

Are you passing the original suggested arguments to launch.py? Because I am, but they aren't showing in your example. Maybe that's the issue?

Update: Not working with just the two arguments you suggested either. The "OneAPI is available" message doesn't show. I'm able to set up the environment as far as I can tell, but I can't get the code to run. Maybe there's a missing dependency...
I've got other work I'm doing now. Will test more when I get time.
Can you try running
I think that the problem now might be a difference between your system python environment and the venv environment.
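A quick way to compare the system and venv environments (a hedged diagnostic sketch, not necessarily the command suggested above, which didn't survive the page formatting) is to print which interpreter is running and which torch build it resolves, once from the system shell and once from inside the webui's venv:

```python
import sys

# Print which interpreter is running and, if importable, which torch build
# it resolves to. Run this both system-wide and inside the venv and compare.
print("python:", sys.executable)
try:
    import torch
    print("torch:", torch.__version__, "from", torch.__file__)
except ImportError:
    print("torch is not importable from this interpreter")
```

If the two runs show different interpreters or different torch paths, the venv is picking up the wrong installation.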
nope :(
Sorry I couldn't get you up and running. I'm going to try to tidy this stuff up and submit it back, so hopefully you'll have better luck when it's properly integrated.
Could you provide an installation tutorial for Windows? I would like to try it on my laptop, because I'm sick of waiting for my CPU to generate images. My laptop specs: i5-1135G7, 16GB DDR4 RAM, Intel Xe graphics (80 EU), and Intel Xe Max graphics (DG1).
There's still a lot of stuff broken, but at this point it's hard to tell the difference between bugs here, driver issues, and pytorch extension issues. I'm also unsure that my SD2.1 fix is the right fix, though it works. I wish I had a CUDA system alongside my A770 to compare. Is batch matrix multiplication on CUDA just more automatically adaptable? Because otherwise I don't understand what is OneAPI-specific about the change I made, or why it works elsewhere without it. I agree about the messy setup, but it's also pretty typical of Intel's experimental GPU work: it's a mess in the early days, and once everything matures they get it all upstream and things are easier. By then, though, somebody else has done all the fun hacking.
@jbaboval I have half a mind to wipe all drives and start from scratch. If you want access to a decent platform, message me. I'm this username most places, including Gmail.
edit: I have a system with only CUDA - a 3060 and a 3070. I also have a system with a 1050 Ti and an Arc A770 or whatever, the $350 one.
Is there a way to roll back packages in Ubuntu? I actually had a decently working system and decided to update the graphics drivers. (I use Gentoo, and I know how to do this there.) I'll google it if I don't have a reply by tomorrow.
Yeah, but make sure you're on the right branch. If you want to run it on your CUDA system and tell me what I broke, I'll fix that up too. I'll need to make it work everywhere if it's ever going to get merged.
Unfortunately I returned the A770. The performance was fine, 7.12 it/s after the latest Intel drivers, but it just started returning garbled, kaleidoscope, paint-spray looking images, and nothing but - not even a hint that it was doing stable diffusion things. I told the retailer (and Intel, as this was an Intel-branded card) that I suspected a memory issue. I'm searching for a replacement as we speak, and I still have your oneapi branch checked out on the drive, so when something comes in I'll test again. Hopefully my "how to install" guide above comes in handy, and someone else has a better experience than I did.
Probably is a memory issue, but I think driver, not hardware. I'm hoping this means progress soon: intel/intel-extension-for-pytorch#302 (comment)
FWIW, I have a number of NVIDIA systems available, a ROCm system, and now an A770 16GB (though the system for that is currently lashed together on a bench, it does work, so whatever). Will be attempting another go at it sometime in the next day-ish.
Update: it works on Gentoo using kernel 6.2.2, so those on other operating systems don't have to use Ubuntu with Intel's kernel.
It looks like something is just straight broken in the Intel pytorch extension - I can't get the damn thing to build, even following their instructions, using their Python distribution and build from conda, or using the script in the repo that it looks like they used to build the release artifacts. And based on the comments on some of the issues in the repo (see the one @jbaboval linked above), it seems like they know there are problems and they're working on it. I think I've bashed my head against this wall enough for a little while... going to wait for another release from Intel. But yes, it does work on the latest mainline kernel - I'm on 6.3.0-rc1 on Fedora 38 prerelease at the moment, but in an Ubuntu 22.04 docker container - just... for a given value of "work". With
I think it's actually the latest Intel driver, because I had SD working "fine" before I tried to do a driver update. By "fine" I mean it would only garble 1 in 12 or 1 in 15 images, usually because "a tensor has produced all NaNs, try no-half/no-half-vae" - but setting those flags made no difference. So I updated, and now 10/10 images were garbled, blank, or kaleidoscopic. I also couldn't get Intel pytorch to build, but I did get Intel tensorflow to build. I think their automated build system is missing some configuration file, because I was getting errors about "missing prerequisites" - that's the whole point of a build system, isn't it? >.<
For the record: paid for, received, and installed a 12GB 3060, went into my SSD mountpoint, did a git clone of automatic1111 or whatever, synced the embeddings and models folders, and everything just works. Intel Arc is just broken.
Update: have managed to build Intel's patched torch and the IPEX extension from source, with much pain. It still can't actually run a generation - something goes screwy in the scheduler and it hard-locks the GPU - but I suspect half the problem is that I've been building with GCC 13, which changed a whole bunch of stuff and throws errors all over the place because of missing headers. Will make another attempt with some older GCC versions (probably just admit defeat and use Ubuntu 22.04) at some point soonish, or possibly just wait for Intel to ship some slightly newer versions.
New IPEX release today, but it's CPU-only again. I wonder if they can't find the bug?
Came across an SD2.0 OpenVINO implementation by Intel. Any chance of it being integrated into the Windows version of the Web UI?
New IPEX release for xpu today! No wheels though... have to wait for it to build from source.
Intel finally published wheels. It looks like the new version fixes the major issues. It also introduces some new ones. I've worked around enough of them to get SD1.5 working and pushed it (with updated instructions) to my fork. I'll try to rebase closer to AUTOMATIC1111's tip soon, but since this project has moved on to torch 2.0 and the IPEX repo is still on 1.13.x, there will be yet more waiting for releases...
Good news, but IPEX doesn't support my GPU yet (Intel Xe graphics). I'm still waiting for Intel to support their iGPU lineup/pre-Arc GPUs.
@jbaboval Your branch has no issues section, so I had to try my luck here. I installed the oneAPI kit as in ArcNotes.txt and verified xpu is available, but I keep getting "OpenCL error -6". I'm not sure where things have gone wrong, but the presence of OpenCL seems suspicious.

Logs
I know this might be a dumb question, but just to save me some trouble (if someone knows for sure this won't work), will this work on Mac OS 10.15? I am using:
MacBook Pro (Retina, 15-inch, Mid 2015)
Processor: 2.2 GHz Quad-Core Intel Core i7 (it's 4th gen, by the way)
Graphics: Intel Iris Pro 1536 MB
Memory: 16 GB 1600 MHz DDR3

You should look for a Metal solution. But if you're not on the latest OS, the necessary Metal APIs may be missing.
Thanks for pointing me in the right direction. I did some digging on the web and found this repo; hopefully this should work, I suppose.
Is Intel support coming? (I gave the Vlad fork a go - they claim they've got Intel support, but I couldn't make it work...)
If it was an environment setup problem, you could give my docker image a try: https://github.com/Nuullll/ipex-sd-docker-for-arc-gpu
Going to try this today, I guess 🙂 https://github.com/openvinotoolkit/stable-diffusion-webui/wiki/Installation-on-Intel-Silicon
FYI: IPEX is supported since 1.7.0: #14171
I can confirm it is working on Windows! It is really fast! To make use of it you must append
Also make sure to follow these steps: #14171 (comment), if you are using an old installation (you don't need to be on the dev branch, as it was merged already).
I needed to disable my iGPU (UHD, Iris) in Device Manager and delete my old
Is there an existing issue for this?
What would your feature do?
This is a brilliant project and I like that it supports most versions of PyTorch.
A large group of users on unsupported machines (Intel, Windows, etc.) gets excluded from the performance options (which are basically CUDA and wannabe-CUDA). Many of these machines have fairly decent hardware; it just doesn't run CUDA/ROCm. PyTorch backends like oneAPI or Vulkan would really take the reach of this project out to those with lesser machines, so to say. https://pytorch.org/tutorials/recipes/recipes/intel_extension_for_pytorch.html
I'm not a coder, but Intel has a PyTorch extension in the works, similar to CUDA/ROCm, and it seems to support a lot of Intel CPUs and GPUs, including discrete GPUs and older ones abandoned by ROCm: https://github.com/intel/intel-extension-for-pytorch/tree/xpu-master
and adapting the code doesn't seem to be excessively complicated:
https://intel.github.io/intel-extension-for-pytorch/xpu/1.10.200+gpu/tutorials/examples.html
https://intel.github.io/intel-extension-for-pytorch/xpu/1.10.200+gpu/tutorials/api_doc.html
It would make the project accessible to those with simpler laptops/desktops.
https://towardsdatascience.com/pytorch-stable-diffusion-using-hugging-face-and-intel-arc-77010e9eead6
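The ipex tutorials linked above boil down to an inference pattern along these lines (a hedged sketch based on those docs; `run_inference` is a hypothetical wrapper, and it bails out cleanly when torch or the extension isn't present):

```python
def run_inference(model, example_input):
    """Inference pattern from the ipex examples linked above (sketch only).

    Hypothetical wrapper: returns None when torch/ipex are unavailable
    instead of raising.
    """
    try:
        import torch
        import intel_extension_for_pytorch as ipex  # registers "xpu"
    except ImportError:
        return None
    model = model.eval().to("xpu")   # move weights to the Intel GPU
    model = ipex.optimize(model)     # apply ipex kernel optimizations
    with torch.no_grad():            # inference only, no autograd graph
        return model(example_input.to("xpu"))
```

This is essentially the same handful of lines the extension's examples page shows, which is why the code changes shouldn't be excessively complicated.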
Proposed workflow
Additional information
No response