
Add Latent Consistency Models Pipeline #5448

Merged

Conversation


@dg845 dg845 commented Oct 19, 2023

What does this PR do?

This PR adds a pipeline and scheduler for Latent Consistency Models (LCM; paper, project page) based on the LCM community pipeline added in #5438, originally authored by @luosiallen.

Latent consistency models are an extension of consistency models that operate in the latent space of a VAE. Analogous to how a diffusion model can be distilled into a consistency model for fast one-step or few-step generation, a latent diffusion model such as Stable Diffusion can be distilled into a latent consistency model for fast one-step or few-step generation.
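For readers who want to try the pipeline, a minimal usage sketch (the checkpoint is the SimianLuo/LCM_Dreamshaper_v7 one discussed below; the 4-step and guidance-scale-8.0 settings are illustrative assumptions, not the only valid values):

```python
import torch
from diffusers import DiffusionPipeline

# Assumed to resolve to the new LatentConsistencyModelPipeline + LCMScheduler
# once this PR is merged.
pipe = DiffusionPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7")
pipe = pipe.to("cuda")

# LCMs generate in very few steps; 4 is a typical choice.
image = pipe(
    prompt="a photo of an astronaut riding a horse on mars",
    num_inference_steps=4,
    guidance_scale=8.0,  # Imagen-style scale; see the CFG discussion below
).images[0]
image.save("lcm_sample.png")
```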

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@patrickvonplaten
@williamberman
@luosiallen

```python
def __call__(
    self,
    prompt: Union[str, List[str]] = None,
    height: Optional[int] = 768,
```
Contributor:

Is the default 512 or 768?

Contributor Author:

In the SimianLuo/LCM_Dreamshaper_v7 checkpoint the VAE sample_size is 768 and the community pipeline also uses 768:

```python
height: Optional[int] = 768,
width: Optional[int] = 768,
```

@luosiallen to confirm what the best default resolution is.

@luosiallen:

I think we can set the default to 768 since it has the best generation quality; most SD models like SD-V2.1 or Dreamshaper are trained at 768 resolution.

Contributor Author:

Actually, I think it might be more convenient to set the default height and width to None and let __call__ infer them.

Reply:

good idea.
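For reference, a sketch of the fallback dg845 describes, following the convention other diffusers pipelines use (attribute names assumed to match StableDiffusionPipeline):

```python
# Inside __call__, before preparing latents: fall back to the UNet's trained
# resolution when the caller leaves height/width as None.
# unet.config.sample_size is in latent units; vae_scale_factor (typically 8)
# converts it to pixel units.
height = height or self.unet.config.sample_size * self.vae_scale_factor
width = width or self.unet.config.sample_size * self.vae_scale_factor
```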

@patrickvonplaten (Contributor):

Very clean! Think we can merge this soon :-)

```python
    prompt,
    device,
    num_images_per_prompt,
    False,  # Don't need to get negative prompts due to LCM guided distillation
```

Reply:

To clarify: LCM currently supports only the unconditional prompt "" and does not support any other types of negative prompts.

Contributor Author:

I see. So if I understand correctly, the pipeline __call__ method should not take in negative_prompt or negative_prompt_embeds arguments, but we should still allow encode_prompt to prepare the default unconditional prompt "" if guidance_scale > 0.0, and perform CFG as normal in the rest of the pipeline?

[My understanding is that the diffusers convention is to use the Imagen CFG formulation $\tilde{\epsilon}_\theta(z_t, c) = \omega'\epsilon_\theta(z_t, c) + (1 - \omega')\epsilon_\theta(z_t, \varnothing)$ where $\omega' = \omega + 1$ and $\omega$ is the guidance scale used in the LCM paper, so I will probably change the guidance scale to follow the above definition.]
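To make the scale conversion concrete, a sketch of that formulation (hypothetical helper; it is $\tilde{\epsilon}_\theta = \omega'\epsilon_c + (1 - \omega')\epsilon_u$ rearranged into the usual diffusers form):

```python
import torch

def guided_noise(noise_cond: torch.Tensor, noise_uncond: torch.Tensor,
                 guidance_scale: float) -> torch.Tensor:
    # Imagen-style CFG with the diffusers convention w' = w + 1, where w is
    # the guidance scale from the LCM paper. With guidance_scale == 1.0 this
    # reduces to the purely conditional prediction.
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)
```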

Reply:

Well, I think currently setting the negative prompt to False is correct. All we need to do is change the comment for better clarification: # LCM guided distillation currently only supports the unconditional prompt "" and does not support other types of negative prompts.

@luosiallen:

The unconditional prompt is not actually used in LCM, because it has already been distilled into the model, so we only take prompt_embeds as input. LCM also does not support negative prompts currently; maybe we can try to fix that in the next version of LCM?
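For context on why no unconditional forward pass is needed: the distilled LCM UNet is conditioned on the guidance scale $\omega$ directly via a sinusoidal embedding, roughly as in the community pipeline. A minimal sketch (the function name and embedding size here are assumptions):

```python
import torch

def get_guidance_scale_embedding(w: torch.Tensor, embedding_dim: int = 256) -> torch.Tensor:
    # Sinusoidal embedding of a per-sample guidance scale w (shape: (batch,)),
    # fed to the UNet as extra conditioning so guidance is baked into a single
    # forward pass instead of a second unconditional pass.
    w = w.float() * 1000.0
    half_dim = embedding_dim // 2
    emb = torch.log(torch.tensor(10000.0)) / (half_dim - 1)
    emb = torch.exp(torch.arange(half_dim, dtype=torch.float32) * -emb)
    emb = w[:, None] * emb[None, :]
    emb = torch.cat([torch.sin(emb), torch.cos(emb)], dim=1)
    if embedding_dim % 2 == 1:
        emb = torch.nn.functional.pad(emb, (0, 1))  # pad odd dims
    return emb  # shape: (batch, embedding_dim)
```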

Contributor Author:

I have updated the comment to provide a (hopefully) better description of negative prompt support.

Reply:

Cool!

Member:

> [My understanding is that the diffusers convention is to use the Imagen CFG formulation $\tilde{\epsilon}_\theta(z_t, c) = \omega'\epsilon_\theta(z_t, c) + (1 - \omega')\epsilon_\theta(z_t, \varnothing)$ where $\omega' = \omega + 1$ and $\omega$ is the guidance scale used in the LCM paper, so I will probably change the guidance scale to follow the above definition.]

Indeed.

@luosiallen luosiallen left a comment:

Nice clarification!


@luosiallen luosiallen left a comment:

Good!

@a-r-r-o-w a-r-r-o-w (Member) left a comment:

Looks great overall! It was fun reading through the paper, although I don't fully comprehend it, and it's quite awesome to see fast inference in so few steps yielding great results.

Apologies for the aggressive type hinting suggestions 😅 I'm trying to help make everything typed for better static type analysis and docs support for those digging deeper into the internals of diffusers.


HuggingFaceDocBuilderDev commented Oct 24, 2023

The documentation is not available anymore as the PR was closed or merged.

@dg845 dg845 changed the title from "[WIP] Add Latent Consistency Models Pipeline" to "Add Latent Consistency Models Pipeline" Oct 24, 2023
patrickvonplaten and others added 3 commits October 24, 2023 20:34
Co-authored-by: Aryan V S <avs050602@gmail.com>
Co-authored-by: Aryan V S <avs050602@gmail.com>
Co-authored-by: Aryan V S <avs050602@gmail.com>
@patrickvonplaten patrickvonplaten merged commit 958e17d into huggingface:main Oct 24, 2023
11 checks passed
@dg845 dg845 deleted the latent-consistency-models-pipeline branch October 25, 2023 13:15
kashif pushed a commit to kashif/diffusers that referenced this pull request Nov 11, 2023
* initial commit for LatentConsistencyModelPipeline and LCMScheduler based on the community pipeline

* Add callback and freeu support.

* apply suggestions from review

* Clean up LCMScheduler

* Remove timeindex argument to LCMScheduler.step.

* Add support for clipping or thresholding the predicted original sample.

* Remove unused methods and arguments in LCMScheduler.

* Improve comment about (lack of) negative prompt support.

* Change input guidance_scale to match the StableDiffusionPipeline (Imagen) CFG formulation.

* Move lcm_origin_steps from pipeline __call__ to LCMScheduler.__init__/config (as origin_steps).

* Fix typo when clipping/thresholding in LCMScheduler.

* Add some initial LCMScheduler tests.

* add type annotations from review

* Fix type annotation bug.

* Override test_add_noise_device in LCMSchedulerTest since hardcoded timesteps doesn't work under default settings.

* Add generator argument pipeline prepare_latents call.

* Cast LCMScheduler.timesteps to long in set_timesteps.

* Add onestep and multistep full loop scheduler tests.

* Set default height/width to None and don't hardcode guidance scale embedding dim.

* Add initial LatentConsistencyPipeline fast and slow tests.

* Add initial documentation for LatentConsistencyModelPipeline and LCMScheduler.

* Make remaining failing fast tests pass.

* make style

* Make original_inference_steps configurable from pipeline __call__ again.

* make style

* Remove guidance_rescale arg from pipeline __call__ since LCM currently doesn't support CFG.

* Make LCMScheduler defaults match config of LCM_Dreamshaper_v7 checkpoint.

* Fix LatentConsistencyPipeline slow tests and add dummy expected slices.

* Add checks for original_steps in LCMScheduler.set_timesteps.

* make fix-copies

* Improve LatentConsistencyModelPipeline docs.

* Apply suggestions from code review

Co-authored-by: Aryan V S <avs050602@gmail.com>

* Apply suggestions from code review

Co-authored-by: Aryan V S <avs050602@gmail.com>

* Apply suggestions from code review

Co-authored-by: Aryan V S <avs050602@gmail.com>

* Update src/diffusers/schedulers/scheduling_lcm.py

* Apply suggestions from code review

Co-authored-by: Aryan V S <avs050602@gmail.com>

* finish

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Aryan V S <avs050602@gmail.com>
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024