AnimateDiff for Stable Diffusion WebUI

This extension aims to integrate AnimateDiff into AUTOMATIC1111 Stable Diffusion WebUI. Once the extension is enabled, you can generate GIFs in exactly the same way as you generate images.

This extension implements AnimateDiff in a different way: it does not require you to clone the whole SD1.5 repository, and it applies (probably) the fewest modifications to ldm, so you do not need to reload your model weights if you don't want to.

Internally, the WebUI batch size is replaced by the GIF frame number: one full GIF is generated per batch. If you want to generate multiple GIFs at once, change the batch number instead.

Batch number is NOT the same as batch size. In the A1111 WebUI, batch number sits above batch size: batch number is the number of sequential runs, while batch size is the number of images generated in parallel. You do not have to worry much about increasing the batch number, but you do need to watch your VRAM when increasing the batch size (which in this extension becomes the video frame number). You do not need to change the batch size at all when using this extension.

You might also be interested in another extension I created: Segment Anything for Stable Diffusion WebUI.

How to Use

  1. Install this extension via link.
  2. Download motion modules and put the model weights under stable-diffusion-webui/extensions/sd-webui-animatediff/model/ (see the sketch after this list for a quick sanity check). If you want to save the model weights in another directory, go to Settings/AnimateDiff. See model zoo for a list of available motion modules.
  3. Enable Pad prompt/negative prompt to be same length and Batch cond/uncond and click Apply settings in Settings. You must do this to prevent generating two separate unrelated GIFs.
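As a quick sanity check after step 2, a minimal sketch that lists whatever motion modules are present in the default directory. The path below is the extension's default; adjust it if you changed the directory in Settings/AnimateDiff, and note this snippet is only illustrative, not part of the extension itself.

    from pathlib import Path

    # Default motion-module directory used by this extension
    model_dir = Path("stable-diffusion-webui/extensions/sd-webui-animatediff/model")
    # Motion modules are typically .ckpt files; .safetensors variants also exist in the wild
    modules = sorted(p.name for p in model_dir.glob("*.ckpt")) + sorted(p.name for p in model_dir.glob("*.safetensors"))
    print(modules if modules else "No motion modules found - download one from the model zoo first.")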

WebUI

  1. Go to txt2img if you want to try txt2gif, or to img2img if you want to try img2gif.
  2. Choose an SD1.5 checkpoint, write your prompts, and set configurations such as image width/height. If you want to generate multiple GIFs at once, change the batch number instead of the batch size.
  3. Enable the AnimateDiff extension, set up each parameter, and click Generate.
    1. Number of frames — The model is trained with 16 frames, so it’ll give the best results when the number of frames is set to 16.
    2. Frames per second — How many frames (images) are shown every second. If 16 frames are generated at 8 frames per second, your GIF’s duration is 2 seconds (see the sketch after this list for how these parameters fit together).
    3. Loop number — How many times the GIF is played. A value of 0 means the GIF never stops playing.
  4. You should see the output GIF in the output gallery. GIF output is saved to stable-diffusion-webui/outputs/{txt2img or img2img}-images/AnimateDiff, and the individual image frames are saved to stable-diffusion-webui/outputs/{txt2img or img2img}-images/{date}.
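For reference, here is a minimal sketch of how number of frames, frames per second and loop number relate when frames are assembled into a GIF with Pillow. The frame filenames are hypothetical and this is not the extension's actual code; it only illustrates the parameters described above.

    from PIL import Image

    fps, loop = 8, 0  # 8 frames per second; loop=0 means the GIF repeats forever
    # Hypothetical frame files; the extension saves its own frames under the outputs directory
    frames = [Image.open(f"frame_{i:02d}.png") for i in range(16)]
    frames[0].save(
        "animation.gif",
        save_all=True,
        append_images=frames[1:],
        duration=int(1000 / fps),  # per-frame duration in ms: 16 frames at 8 fps -> 2 seconds total
        loop=loop,
    )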

API

#42
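API usage is discussed in the issue linked above. As a hedged sketch, calling this extension through the standard WebUI /sdapi/v1/txt2img endpoint would look roughly like the code below; the script key "AnimateDiff" and the positional argument order [enable, number of frames, fps, loop number] are assumptions, so check the linked issue for the exact format your version expects.

    import requests

    payload = {
        "prompt": "masterpiece, best quality, 1girl walking on the beach",
        "negative_prompt": "low quality",
        "steps": 20,
        "width": 512,
        "height": 512,
        # Hypothetical positional args: [enable, number of frames, fps, loop number]
        "alwayson_scripts": {"AnimateDiff": {"args": [True, 16, 8, 0]}},
    }
    response = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
    response.raise_for_status()
    images = response.json()["images"]  # base64-encoded outputs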

Motion Module Model Zoo

Update

  • 2023/07/20 v1.1.0: fix gif duration, add loop number, remove auto-download, remove xformers, remove instructions on gradio UI, refactor README, add sponsor QR code.
  • 2023/07/24 v1.2.0: fix incorrect insertion of motion modules, add option to change path to save motion modules in Settings/AnimateDiff, fix loading different motion modules.
  • 2023/09/04 v1.3.0: support any community models with the same architecture; fix grey problem via #63 (credit to @TDS4874 and @opparco)
  • 2023/09/11 v1.4.0: support official v2 motion module (different architecture: GroupNorm not hacked, UNet middle layer has motion module).
    • If you are using V1 motion modules: starting from this version, you can disable the GroupNorm hack in Settings/AnimateDiff. If you disable the hack, you will be able to use this extension in img2img in all settings, but the generated GIFs will flicker. In WebUI >= v1.6.0, even if GroupNorm is hacked, you can still use this extension in img2img with --no-half-vae enabled.
    • If you are using V2 motion modules: you will always be able to use this extension in img2img, regardless of changing that setting or not.
  • 2023/09/14 v1.4.1: always change betas, alphas_cumprod and alphas_cumprod_prev to resolve the grey problem in other samplers.

FAQ

  1. Q: How much VRAM do I need?

    A: Currently, you can run WebUI with this extension on an NVIDIA 3090/4090; I cannot guarantee other GPUs. Actual VRAM usage depends on your image size and video frame number, so you can reduce either to lower VRAM usage. The default setting (shown in the Samples/txt2img section) consumes 12GB VRAM. More VRAM info will be added later.

  2. Q: Can I use SDXL to generate GIFs?

    A: You will have to wait for someone to train SDXL-specific motion modules, which will have a different model architecture. This extension essentially injects multiple motion modules into the SD1.5 UNet; it does not work for other variants of SD, such as SD2.1 or SDXL.

  3. Q: Can I generate a video instead of a GIF?

    A: Not at this time, but support will come via a very large output-format pull request in the near future, which I will merge together with some other major updates.

  4. Q: Can I use this extension to do GIF2GIF? Can I apply ControlNet to this extension? Can I override the limitation of 24/32 frames per generation?

    A: Not at this time, but these will be supported by incorporating AnimateDiff CLI Prompt Travel in the near future. This is a huge amount of work and life is busy, so expect to wait a long time for that update.

  5. Q: Can I use xformers, sdp or some other attention optimizations?

    A: Attention optimizations are currently not applied to motion modules, but they will be applied after a pull request in the near future.

  6. Q: Can I use this extension to do img2GIF? The current generation result seems pretty static.

    A: The current img2GIF results are indeed quite static. I will look into a forked AnimateDiff repository and see how I can resolve this problem in the near future.

  7. Q: How can I reproduce the result in Samples/txt2img section?

    A: You must use this logic to initialize random tensors:

        import torch
        from einops import rearrange
        from modules import shared  # A1111 WebUI module providing shared.device
        torch.manual_seed(<seed>)  # replace <seed> with the seed you want to reproduce
        # latent noise: (channels, frames, height, width) -> (frames, channels, height, width)
        x = rearrange(torch.randn((4, 16, 64, 64), device=shared.device), 'c f h w -> f c h w')

Samples

txt2img

[Sample GIFs: original AnimateDiff | Extension v1.2.0 | Extension v1.3.0]

Note that I did not modify random tensor generation when producing v1.3.0 samples.

Sponsor

You can sponsor me via WeChat, AliPay or Paypal.
