[Docs] Extensive / Improved guides about "Text2Image", "Image2Image", "Inpainting" #4758

patrickvonplaten · 2023-08-24T11:41:54Z

We very often run into issues like #4392, where people don't know how to use diffusers correctly or are unaware of all the existing features.

I propose writing three very extensive guides, or possibly even whole subsections, about the main tasks:

Text2Image
Image2Image
Inpainting

and how to chain those together.

That will replace the following guides:

Each guide should be introduced with an easy example (how to use it) and then go deeper into more advanced use cases. This means:

Text-to-image
- We explain how a very simple example works and show how different models generate different results. We could showcase the following models here:
  - SD 1.5
  - SDXL
  - Kandinsky 2.2
  - ControlNet
- Then we go a bit deeper into "height" & "width" to show the user how the output size can be changed.
- Talk about guidance_scale and generator.
- Talk about how extra conditionings can be added via ControlNet.
- Also link to the prompt weighting and optimization docs.
- We then make a transition to "Modifying existing images", linking to the next "Img2Img" and "Inpaint" sections.


=> All examples here can use the AutoPipelineForText2Image.
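
As a rough sketch of what the simple text-to-image example could look like (not the final guide content): the snippet below loads a checkpoint through AutoPipelineForText2Image and sets height, width, guidance_scale, and a seeded generator. The checkpoint ID, the default values, and the helper names `validate_size` and `text_to_image` are assumptions made for illustration. The size check mirrors the divisible-by-8 requirement of the Stable Diffusion pipelines.

```python
def validate_size(height: int, width: int, multiple: int = 8) -> None:
    """Raise ValueError unless both sides are divisible by `multiple`.

    Stable Diffusion pipelines reject sizes that are not divisible by 8.
    """
    if height % multiple != 0 or width % multiple != 0:
        raise ValueError(
            f"height and width must be divisible by {multiple}, got {height}x{width}"
        )


def text_to_image(
    prompt: str,
    height: int = 512,
    width: int = 512,
    guidance_scale: float = 7.5,
    seed: int = 0,
    checkpoint: str = "runwayml/stable-diffusion-v1-5",  # assumed checkpoint ID
):
    """Generate a single image (hypothetical helper, downloads the checkpoint)."""
    validate_size(height, width)

    # Heavy imports stay local so validate_size can be used on its own.
    import torch
    from diffusers import AutoPipelineForText2Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    pipe = AutoPipelineForText2Image.from_pretrained(checkpoint, torch_dtype=dtype).to(device)

    # A seeded generator makes the result reproducible on a given setup.
    generator = torch.Generator(device=device).manual_seed(seed)

    return pipe(
        prompt,
        height=height,
        width=width,
        guidance_scale=guidance_scale,  # higher values follow the prompt more closely
        generator=generator,
    ).images[0]


# Example call (commented out because it downloads model weights):
# text_to_image("an astronaut riding a horse", height=512, width=768).save("out.png")
```

Swapping the `checkpoint` argument for an SDXL or Kandinsky 2.2 checkpoint should be enough to compare models, since the auto pipeline picks the matching pipeline class; default sizes and sensible guidance values differ per model.
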
Image-to-image
- We show a simple example of how it works, showcasing:
  - SD 1.5
  - SDXL
  - Kandinsky 2.2
- We explain what the input image can/should look like.
- We then go a bit deeper into the "strength" parameter (super important parameter!!!). We explain how the width & height are determined by the image itself.
- We then explain how img2img can be chained right after text-to-image, keeping everything in latent space.
- We explain how img2img can be used to make upscaled images sharper.
- We then explain how multiple img2img models can be chained together for just a few steps (e.g. it's totally reasonable to use multiple differently fine-tuned SD checkpoints for image translation).
- We show how Kandinsky & Stable Diffusion can be mixed.
- We explain how to use ControlNet for img2img.

=> All examples here can use the AutoPipelineForImage2Image.
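
To make the "strength" and chaining points concrete, here is a minimal sketch under a few assumptions. `effective_steps` reproduces, as far as I understand it, how the Stable Diffusion img2img pipeline shortens the schedule: only roughly `strength * num_inference_steps` denoising steps are run. `text_to_image_then_refine` hands latents from text-to-image directly to img2img. The helper names, the checkpoint ID, and the default values are hypothetical.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps actually run for a given strength.

    strength=1.0 runs the full schedule (the input image is mostly ignored),
    strength=0.0 runs no steps (the input image is returned nearly unchanged).
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength must be in [0, 1], got {strength}")
    return min(int(num_inference_steps * strength), num_inference_steps)


def text_to_image_then_refine(
    prompt: str,
    strength: float = 0.5,
    num_inference_steps: int = 50,
    seed: int = 0,
    checkpoint: str = "runwayml/stable-diffusion-v1-5",  # assumed checkpoint ID
):
    """Chain text-to-image into img2img without leaving latent space."""
    import torch
    from diffusers import AutoPipelineForImage2Image, AutoPipelineForText2Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    text2image = AutoPipelineForText2Image.from_pretrained(checkpoint, torch_dtype=dtype).to(device)
    # from_pipe reuses the loaded components instead of loading the checkpoint again.
    image2image = AutoPipelineForImage2Image.from_pipe(text2image)

    generator = torch.Generator(device=device).manual_seed(seed)

    # output_type="latent" skips VAE decoding, so the hand-off stays in latent space.
    latents = text2image(prompt, generator=generator, output_type="latent").images

    return image2image(
        prompt,
        image=latents,  # output size follows the input; no height/width arguments
        strength=strength,
        num_inference_steps=num_inference_steps,
        generator=generator,
    ).images[0]
```

For example, `effective_steps(50, 0.8)` gives 40, so a low strength combined with few inference steps can leave only a handful of denoising steps.
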
Inpainting
- We show a simple example of how it works, showcasing:
  - SD 1.5 inpainting
  - Kandinsky 2.2 inpainting
- We explain what the input & mask images can/should look like.
- We then go a bit deeper into the "strength" parameter again. We explain how the width & height are determined by the image itself.
- We then explain how inpainting can be chained right after text-to-image or image-to-image, keeping everything in latent space and without reloading the whole model.
- We explain how img2img and inpainting can be super similar (cc @yiyixuxu - we chatted about this yesterday).
- We then explain how multiple inpainting models can be chained together for just a few steps (e.g. it's totally reasonable to use multiple differently fine-tuned SD checkpoints for image translation).
- We show how Kandinsky & Stable Diffusion can be mixed.
- We explain how to use ControlNet for inpainting.

=> All examples here can use the AutoPipelineForInpainting.
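
A possible sketch for the mask and "without reloading" points, assuming the usual diffusers convention that white mask pixels are repainted and black pixels are kept. `rectangle_mask_bytes` and `inpaint_after_text2image` are hypothetical helpers, and the checkpoint ID is an assumption. This version hands a decoded image to the inpainting pipeline, so it only illustrates component reuse via `from_pipe`, not the latent-space hand-off.

```python
def rectangle_mask_bytes(width: int, height: int, box: tuple) -> bytes:
    """Row-major 8-bit mask: 255 (repaint) inside `box`, 0 (keep) elsewhere.

    `box` is (left, top, right, bottom) with exclusive right/bottom edges.
    """
    left, top, right, bottom = box
    return bytes(
        255 if left <= x < right and top <= y < bottom else 0
        for y in range(height)
        for x in range(width)
    )


def inpaint_after_text2image(
    prompt: str,
    inpaint_prompt: str,
    box: tuple,
    seed: int = 0,
    checkpoint: str = "runwayml/stable-diffusion-v1-5",  # assumed checkpoint ID
):
    """Generate an image, then repaint the region inside `box`."""
    import torch
    from diffusers import AutoPipelineForInpainting, AutoPipelineForText2Image
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    text2image = AutoPipelineForText2Image.from_pretrained(checkpoint, torch_dtype=dtype).to(device)
    generator = torch.Generator(device=device).manual_seed(seed)
    image = text2image(prompt, generator=generator).images[0]

    # Reuse the components that are already in memory for the inpainting pipeline.
    inpaint = AutoPipelineForInpainting.from_pipe(text2image)

    # The mask has the same size as the image it applies to.
    width, height = image.size
    mask = Image.frombytes("L", (width, height), rectangle_mask_bytes(width, height, box))

    return inpaint(
        prompt=inpaint_prompt,
        image=image,
        mask_image=mask,
        generator=generator,
    ).images[0]
```

A dedicated inpainting checkpoint will usually blend the masked region better than a regular text-to-image checkpoint reused this way.
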

I think it's worth making this a really detailed, easy-to-understand guide, making sure it works in Colab, and also thinking about creating some video content about it.

Thoughts? @pcuenca @williamberman @sayakpaul @yiyixuxu @DN6 @stevhliu @patil-suraj

If you like the idea, maybe @stevhliu and I could look more into this


patrickvonplaten · 2023-08-24T11:46:56Z

The main motivation is constant issues such as: #4392 (comment)

patrickvonplaten · 2023-08-24T11:52:52Z

Chaining example: https://colab.research.google.com/drive/1IxYOiGHdiqBJfb-N7WXaSxmPkuNfdG1g?usp=sharing

williamberman · 2023-08-24T18:43:57Z

Yes, this is a very good idea. I do feel like we suffer from a bit of document sprawl where there are lots of good docs but not an obvious order to read them through.

In the limit, a chapter dependency dag is always nice 😁

yiyixuxu · 2023-08-24T20:53:14Z

love the idea!

JohanHuynh0130 · 2023-08-24T22:07:43Z

Love the Idea as well! I have tons of difficulty trying to understand the documentation of Diffusers

sayakpaul · 2023-08-25T03:45:16Z

> Love the Idea as well! I have tons of difficulty trying to understand the documentation of Diffusers

Could you pinpoint some of it in bulleted points?

DN6 · 2023-08-25T07:06:59Z

Better guides are always a good idea 🙂

stevhliu · 2023-08-25T16:10:26Z

Awesome idea! 💯

VigneshHexo · 2023-08-26T13:50:06Z

Following!
Would be happy to write one! Please let me know @patrickvonplaten

vionwinnie · 2023-08-29T20:36:21Z

Happy to help!

github-actions · 2023-10-18T15:10:22Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

stevhliu · 2023-10-18T16:09:18Z

Closing since we've updated the guides already :)

patrickvonplaten changed the title ~~Extensive / Improved guides about "Text2Image", "Image2Image", "Inpainting"~~ [Docs] Extensive / Improved guides about "Text2Image", "Image2Image", "Inpainting" Aug 24, 2023

patrickvonplaten mentioned this issue Aug 24, 2023

SDXL 1.0 Inpainting - Lower result quality with certain masks #4392

Closed

This was referenced Sep 7, 2023

[docs] Improved text-to-image guide #4938

Merged

[docs] Improved image-to-image guide #5020

Merged

stevhliu mentioned this issue Sep 27, 2023

[docs] Improved inpaint docs #5210

Merged

github-actions bot added the stale Issues that haven't received updates label Oct 18, 2023

stevhliu closed this as completed Oct 18, 2023
