Add Latent Consistency Models Pipeline #5448
Conversation
initial commit for LatentConsistencyModelPipeline and LCMScheduler based on the community pipeline
```python
def __call__(
    self,
    prompt: Union[str, List[str]] = None,
    height: Optional[int] = 768,
```
Is the default 512 or 768?
In the SimianLuo/LCM_Dreamshaper_v7 checkpoint the VAE `sample_size` is 768, and the community pipeline also uses 768 (diffusers/examples/community/latent_consistency_txt2img.py, lines 202 to 203 in e516858):

```python
height: Optional[int] = 768,
width: Optional[int] = 768,
```
@luosiallen to confirm what the best default resolution is.
I think we can set the default to 768 since it has the best generation quality; most SD models, like SD-V2.1 or Dreamshaper, are trained at 768 resolution.
Actually, I think it might be more convenient to set the default `height` and `width` to `None` and let `__call__` infer them.
good idea.
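For reference, a minimal sketch of the proposed fallback, following the pattern other diffusers pipelines use (the exact attribute names below are assumptions, not taken from this PR):

```python
# Hypothetical fallback inside __call__: when height/width are None, derive
# them from the UNet's trained sample size and the VAE downsampling factor.
height = height or self.unet.config.sample_size * self.vae_scale_factor
width = width or self.unet.config.sample_size * self.vae_scale_factor
```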
Very clean! Think we can merge this soon :-)
```python
    prompt,
    device,
    num_images_per_prompt,
    False,  # Don't need to get negative prompts due to LCM guided distillation
```
To clarify: LCM currently supports only the unconditional prompt "" and does not support any other types of negative prompts.
I see. So if I understand correctly, the pipeline `__call__` method should not take in `negative_prompt` or `negative_prompt_embeds` arguments, but we should still allow `encode_prompt` to prepare the default unconditional prompt `""` if `guidance_scale > 0.0`, and perform CFG as normal in the rest of the pipeline?

[My understanding is that the `diffusers` convention is to use the Imagen CFG formulation $\tilde{\epsilon}_\theta(z_t, c) = w\,\epsilon_\theta(z_t, c) + (1 - w)\,\epsilon_\theta(z_t)$, where $w$ is the guidance scale and $\omega = w - 1$ is the guidance scale used in the LCM paper, so I will probably change the guidance scale to follow the above definition.]
Well, I think currently setting the negative prompt to `False` is correct. All we need to do is change the comment for better clarification: `# LCM guided distillation currently only supports the unconditional prompt "" and doesn't support other types of negative prompt.`
Actually, the unconditional prompt is not used in LCM, because it has already been distilled into the model, so we only take `prompt_embeds` as input. Also, LCM does not currently support negative prompts; maybe we can fix that in the next version of LCM?
I have updated the comment to provide a (hopefully) better description of negative prompt support.
Cool!
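To make "already distilled" concrete: instead of a second unconditional UNet pass, the distilled model consumes the guidance scale directly as a conditioning embedding. Below is a hedged sketch, modeled on the `get_guidance_scale_embedding` helper mentioned in the commit history (the exact signature here is an assumption):

```python
import torch

def get_guidance_scale_embedding(w: torch.Tensor, embedding_dim: int = 512) -> torch.Tensor:
    """Sinusoidal embedding of the guidance scale; passed to the UNet as an
    extra conditioning input so no unconditional (negative-prompt) pass is needed."""
    w = w * 1000.0
    half_dim = embedding_dim // 2
    emb = torch.log(torch.tensor(10000.0)) / (half_dim - 1)
    emb = torch.exp(torch.arange(half_dim) * -emb)
    emb = w.float()[:, None] * emb[None, :]
    return torch.cat([torch.sin(emb), torch.cos(emb)], dim=1)  # (batch, embedding_dim)

# e.g. a batch of two samples generated at guidance scale 8.0:
w_embedding = get_guidance_scale_embedding(torch.tensor([8.0, 8.0]), embedding_dim=256)
```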
> [My understanding is that the `diffusers` convention is to use the Imagen CFG formulation $\tilde{\epsilon}_\theta(z_t, c) = w\,\epsilon_\theta(z_t, c) + (1 - w)\,\epsilon_\theta(z_t)$, where $w$ is the guidance scale and $\omega = w - 1$ is the guidance scale used in the LCM paper, so I will probably change the guidance scale to follow the above definition.]
Indeed.
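For concreteness, a small sketch of the two conventions being discussed (the function names are purely illustrative, not from the PR): the Imagen/diffusers form and the LCM-paper form agree when `guidance_scale = omega + 1`.

```python
import torch

def cfg_imagen(eps_cond: torch.Tensor, eps_uncond: torch.Tensor, w: float) -> torch.Tensor:
    # diffusers / Imagen convention: w = 1.0 disables guidance.
    return w * eps_cond + (1.0 - w) * eps_uncond

def cfg_lcm_paper(eps_cond: torch.Tensor, eps_uncond: torch.Tensor, omega: float) -> torch.Tensor:
    # LCM-paper convention: omega = 0.0 disables guidance.
    return (1.0 + omega) * eps_cond - omega * eps_uncond

# The two agree when w = omega + 1:
e_c, e_u = torch.randn(4), torch.randn(4)
assert torch.allclose(cfg_imagen(e_c, e_u, 8.0), cfg_lcm_paper(e_c, e_u, 7.0))
```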
Change input guidance_scale to match the StableDiffusionPipeline (Imagen) CFG formulation.
Nice clarification!
Move lcm_origin_steps from pipeline __call__ to LCMScheduler.__init__/config (as origin_steps).
Looks great overall! It was fun reading through the paper, although I don't fully comprehend it, and quite awesome to see fast inference in so few steps yielding great results.

Apologies for the aggressive type hinting suggestions 😅 I'm trying to help make everything typed, for better static type analysis and docs support for those digging deeper into the internals of diffusers.
Override test_add_noise_device in LCMSchedulerTest since hardcoded timesteps doesn't work under default settings.
Remove guidance_rescale arg from pipeline __call__ since LCM currently doesn't support CFG.
The documentation is not available anymore as the PR was closed or merged.
* initial commit for LatentConsistencyModelPipeline and LCMScheduler based on the community pipeline
* Add callback and freeu support.
* apply suggestions from review
* Clean up LCMScheduler
* Remove timeindex argument to LCMScheduler.step.
* Add support for clipping or thresholding the predicted original sample.
* Remove unused methods and arguments in LCMScheduler.
* Improve comment about (lack of) negative prompt support.
* Change input guidance_scale to match the StableDiffusionPipeline (Imagen) CFG formulation.
* Move lcm_origin_steps from pipeline __call__ to LCMScheduler.__init__/config (as origin_steps).
* Fix typo when clipping/thresholding in LCMScheduler.
* Add some initial LCMScheduler tests.
* add type annotations from review
* Fix type annotation bug.
* Override test_add_noise_device in LCMSchedulerTest since hardcoded timesteps doesn't work under default settings.
* Add generator argument pipeline prepare_latents call.
* Cast LCMScheduler.timesteps to long in set_timesteps.
* Add onestep and multistep full loop scheduler tests.
* Set default height/width to None and don't hardcode guidance scale embedding dim.
* Add initial LatentConsistencyPipeline fast and slow tests.
* Add initial documentation for LatentConsistencyModelPipeline and LCMScheduler.
* Make remaining failing fast tests pass.
* make style
* Make original_inference_steps configurable from pipeline __call__ again.
* make style
* Remove guidance_rescale arg from pipeline __call__ since LCM currently doesn't support CFG.
* Make LCMScheduler defaults match config of LCM_Dreamshaper_v7 checkpoint.
* Fix LatentConsistencyPipeline slow tests and add dummy expected slices.
* Add checks for original_steps in LCMScheduler.set_timesteps.
* make fix-copies
* Improve LatentConsistencyModelPipeline docs.
* Apply suggestions from code review (Co-authored-by: Aryan V S <avs050602@gmail.com>)
* Apply suggestions from code review (Co-authored-by: Aryan V S <avs050602@gmail.com>)
* Apply suggestions from code review (Co-authored-by: Aryan V S <avs050602@gmail.com>)
* Update src/diffusers/schedulers/scheduling_lcm.py
* Apply suggestions from code review (Co-authored-by: Aryan V S <avs050602@gmail.com>)
* finish

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Aryan V S <avs050602@gmail.com>
What does this PR do?
This PR adds a pipeline and scheduler for Latent Consistency Models (LCM; paper, project page) based on the LCM community pipeline added in #5438, originally authored by @luosiallen.
Latent consistency models are an extension of consistency models which operate in the latent space of a VAE. Analogous to how diffusion models can be distilled into a consistency model for fast one-step or few-step generation, latent diffusion models such as Stable Diffusion can be distilled into a latent consistency model for fast one-step or few-step generation.
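As a quick illustration of the resulting API, here is a hedged usage sketch (not taken from the PR itself; the argument values are assumptions based on the discussion above):

```python
import torch
from diffusers import DiffusionPipeline

# Load the distilled checkpoint discussed in this thread; it is assumed to
# bundle an LCMScheduler in its scheduler config.
pipe = DiffusionPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7", torch_dtype=torch.float16)
pipe.to("cuda")

# LCMs produce usable images in roughly 1-8 steps instead of the usual 25-50.
image = pipe(
    prompt="a photo of an astronaut riding a horse on mars",
    num_inference_steps=4,
    guidance_scale=8.0,
).images[0]
image.save("lcm_sample.png")
```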
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@patrickvonplaten
@williamberman
@luosiallen