
AutoencoderTiny doesn't work for LCM img2img when passing an image to encode #5619

Closed
aifartist opened this issue Nov 2, 2023 · 9 comments
Labels
bug (Something isn't working), stale (Issues that haven't received updates)

Comments

@aifartist

Describe the bug

AttributeError: 'AutoencoderTinyOutput' object has no attribute 'latent_dist'

A normal VAE (AutoencoderKL) output has 'latent_dist'; the Tiny VAE output has 'latents' instead.

The custom pipeline latent_consistency_img2img.py does:

self.vae.encode(image[i : i + 1]).latent_dist.sample(generator[i]) for i in range(batch_size)

and

init_latents = self.vae.encode(image).latent_dist.sample(generator)

resulting in the error above. @vladmandic says that when he uses TAESD directly he doesn't have a problem with the img2img encode. Either he is passing a latent instead of an image to pipe(), or the real TAESD has changes not present in diffusers' Tiny VAE.

The Tiny VAE has allowed me to hit 22 LCM 512x512 4-step txt2img images per second and 15 LCM 512x512 4-step img2img images per second. I got img2img to work by using "latents" instead of "latent_dist.sample(generator)".
I don't know if this is the correct fix, but I get good (for LCM) images, and fast ones thanks to TAESD.
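
A minimal sketch of the kind of shim I mean, assuming the pipeline can simply branch on whichever attribute the encoder output exposes (the retrieve_latents name is mine, just for illustration):

def retrieve_latents(encoder_output, generator=None):
    # AutoencoderKL.encode() returns an output carrying a .latent_dist distribution,
    # while AutoencoderTiny.encode() returns .latents directly.
    if hasattr(encoder_output, "latent_dist"):
        return encoder_output.latent_dist.sample(generator)
    if hasattr(encoder_output, "latents"):
        return encoder_output.latents
    raise AttributeError("Could not access latents of provided encoder_output")

# The two call sites in latent_consistency_img2img.py would then become, for example:
# init_latents = retrieve_latents(self.vae.encode(image), generator)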

Reproduction

You need torch, PIL (Pillow), and diffusers installed.

import torch
from PIL import Image
from diffusers import AutoencoderTiny, DiffusionPipeline

# Load the LCM community pipeline in img2img mode
pipe = DiffusionPipeline.from_pretrained(
    "SimianLuo/LCM_Dreamshaper_v7",
    custom_pipeline="latent_consistency_img2img",
    safety_checker=None,
    custom_revision="main",
)
pipe.to(torch_device="cuda", torch_dtype=torch.float16)

# Swap in the Tiny VAE (TAESD); this is what triggers the encode error
pipe.vae = AutoencoderTiny.from_pretrained("madebyollin/taesd", torch_dtype=torch.float16)
pipe.vae = pipe.vae.cuda()

images = pipe(
    prompt="Women wearing fancy dress, intricate jewelry",
    image=Image.new("RGB", (512, 512)),  # any 512x512 RGB image reproduces it
    width=512, height=512,
    strength=0.5, guidance_scale=8,
    num_inference_steps=4, num_images_per_prompt=1,
    lcm_origin_steps=50,
    output_type="pil",
).images

Logs

No response

System Info

  • diffusers version: 0.21.4
  • Platform: Linux-6.2.0-34-generic-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • PyTorch version (GPU?): 2.2.0.dev20231030+cu121 (True)
  • Huggingface_hub version: 0.17.3
  • Transformers version: 4.34.1
  • Accelerate version: 0.24.1
  • xFormers version: not installed
  • Using GPU in script?: 4090 as device='cuda'
  • Using distributed or parallel set-up in script?: I have no idea what this means; it is my home PC and I use a venv

Who can help?

@sayakpaul @patrickvonplaten

@aifartist added the bug (Something isn't working) label on Nov 2, 2023
@sayakpaul
Member

This should solve it for you: https://colab.research.google.com/gist/sayakpaul/fa95a41beb5fea6d830324cbf6a8e8f4/scratchpad.ipynb

We have included Latent Consistency Models officially as a part of diffusers. See here: https://huggingface.co/docs/diffusers/main/en/api/pipelines/latent_consistency_models.

For now, you need to specify a revision while loading the pipeline (as you'll notice in the Colab above), but after we release a new version of diffusers that won't be needed.
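
For reference, once the new release is out it should be as simple as something like this (a rough sketch following the docs linked above; no revision argument needed at that point):

import torch
from diffusers import DiffusionPipeline

# Resolves to the official LatentConsistencyModelPipeline for this checkpoint
pipe = DiffusionPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7", torch_dtype=torch.float16)
pipe.to("cuda")

image = pipe(
    "Self-portrait oil painting, a beautiful cyborg with golden hair, 8k",
    num_inference_steps=4,
    guidance_scale=8.0,
).images[0]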

@radames
Contributor

radames commented Nov 2, 2023

Hi @sayakpaul, I think the issue is with the image-to-image pipeline. Do you know if we can already use AutoPipelineForImage2Image for LCM?

@aifartist
Author

This should solve it for you: https://colab.research.google.com/gist/sayakpaul/fa95a41beb5fea6d830324cbf6a8e8f4/scratchpad.ipynb

We have included Latent Consistency Models officially as a part of diffusers. See here: https://huggingface.co/docs/diffusers/main/en/api/pipelines/latent_consistency_models.

For now, you need to specify a revision while loading the pipeline (as you'll notice in the Colab above), but after we release a new version of diffusers that won't be needed.

That example isn't img2img with an image input (as opposed to an input latent, which doesn't need encoding).

@sayakpaul
Member

Now I understand. We had a similar issue in #4720, and @slep0v had already proposed a nice solution in #4891. Let me try to reopen it and see if I can get it done.

@sayakpaul
Member

We recently added support for an LCM Img2Img pipeline. #5636 enables inference for the major image-to-image pipelines with the tiny Autoencoder. Could you give it a look?
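
Roughly, the usage it enables should look something like this (a sketch, not the exact code from #5636):

import torch
from PIL import Image
from diffusers import AutoencoderTiny, AutoPipelineForImage2Image

pipe = AutoPipelineForImage2Image.from_pretrained("SimianLuo/LCM_Dreamshaper_v7", torch_dtype=torch.float16)
# Swap in the Tiny VAE, as in the reproduction above
pipe.vae = AutoencoderTiny.from_pretrained("madebyollin/taesd", torch_dtype=torch.float16)
pipe.to("cuda")

init_image = Image.new("RGB", (512, 512))  # stand-in for a real source image
image = pipe(
    prompt="Women wearing fancy dress, intricate jewelry",
    image=init_image,
    num_inference_steps=4,
    guidance_scale=8.0,
    strength=0.5,
).images[0]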


github-actions bot commented Dec 2, 2023

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

The github-actions bot added the stale (Issues that haven't received updates) label on Dec 2, 2023
@aifartist
Author

I've been successfully using TinyVAE now that LCM support is in diffusers.
It is the only way I can hit 167 images per second with sd-turbo, the stable-fast compiler, and my optimization set. Just hit this number this morning with batching on my 4090.
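
The un-compiled core of that setup is roughly the sketch below; stable-fast compilation and the rest of my optimization set are left out, so this alone won't reproduce the numbers above:

import torch
from diffusers import AutoencoderTiny, AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained("stabilityai/sd-turbo", torch_dtype=torch.float16)
# sd-turbo is SD 2.1-based, so the standard TAESD decoder applies
pipe.vae = AutoencoderTiny.from_pretrained("madebyollin/taesd", torch_dtype=torch.float16)
pipe.to("cuda")

# Batch prompts per call; sd-turbo is distilled for 1-step, guidance-free sampling
images = pipe(
    prompt=["a photo of a cat"] * 8,
    num_inference_steps=1,
    guidance_scale=0.0,
).images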

@sayakpaul
Member

That is amazing. Feel free to share your code and results. If you have shared it on social media, feel free to let us know and we can try to amplify :-)
