[Docs] Extensive / Improved guides about "Text2Image", "Image2Image", "Inpainting" #4758
Comments
The main motivation is the constant stream of issues such as: #4392 (comment)
Love the idea!
Love the idea as well! I have tons of difficulty trying to understand the documentation of Diffusers.
Could you pinpoint some of it in bulleted points?
Better guides are always a good idea 🙂
Awesome idea! 💯
Following!
Happy to help!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Closing since we've updated the guides already :)
We very often run into issues like #4392 where people don't know how to correctly use `diffusers` or are unaware of all the existing features.
I propose to write three very extensive guides, or possibly even whole subsections, about the main tasks (text-to-image, image-to-image, and inpainting) and how to chain those together.
That will replace the following guides:
Each guide should be introduced with an easy example (how to use it) and then go deeper into more advanced use cases. This means:
Text-to-image.
Then we go a bit deeper into "height" & "width" to show the user how the output sizes can be changed
Talk about `guidance_scale` and `generator`.
Talk about how extra conditionings can be added via controlnet
Also link to the prompt weighting and optimization docs
We then make a transition to "Modifying existing images", linking to the next "Img2Img" and "Inpaint" sections
=> All examples here can use the `AutoPipelineForText2Image`.
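For illustration, here is a rough sketch of what the opening text-to-image example could look like, touching `height`, `width`, `guidance_scale` and `generator`. The checkpoint id, the CUDA device, the default values, and the helper `snap_to_multiple_of_8` are assumptions made for this sketch and are not part of the proposal. The helper reflects the fact that Stable Diffusion pipelines expect side lengths divisible by 8.

```python
def snap_to_multiple_of_8(size: int) -> int:
    """Round a requested side length down to the nearest multiple of 8.

    Hypothetical helper: Stable Diffusion pipelines reject heights and
    widths that are not divisible by 8.
    """
    if size < 8:
        raise ValueError("size must be at least 8 pixels")
    return (size // 8) * 8


def generate(prompt: str, height: int = 512, width: int = 768, seed: int = 0):
    """Generate one image with an explicit size, guidance scale and seed."""
    # Heavy imports stay inside the function so the helper above can be
    # used without torch or diffusers installed.
    import torch
    from diffusers import AutoPipelineForText2Image

    # Assumption: checkpoint id and a CUDA device; swap in your own.
    pipe = AutoPipelineForText2Image.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # A seeded generator makes the result reproducible.
    generator = torch.Generator(device="cuda").manual_seed(seed)

    result = pipe(
        prompt,
        height=snap_to_multiple_of_8(height),
        width=snap_to_multiple_of_8(width),
        guidance_scale=7.5,  # higher values follow the prompt more closely
        generator=generator,
    )
    return result.images[0]
```

Calling `generate("an astronaut riding a horse")` would then return a single PIL image, given a GPU and the model weights.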
Image-to-Image
We show a simple example on how it works, showcasing:
We explain what the input image can/should look like.
We then go a bit deeper into the "strength" parameter (super important parameter!!!). We explain how the width & height are determined by the image itself.
We then explain how img2img can be chained right after text-to-image - keeping everything in latent space.
We explain how img2img can be used to make upscaled images sharper.
We then explain how multiple img2img models can be chained together for just a few steps (e.g. it's totally reasonable to use multiple differently fine-tuned SD checkpoints for image translation).
We show how Kandinsky & Stable Diffusion can be mixed.
We explain how to use controlnet for img2img.
=> All examples here can use the `AutoPipelineForImage2Image`.
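As a rough sketch of two of the points above: the first function shows why "strength" matters so much, and the second chains text-to-image into img2img while staying in latent space. To my understanding, the Stable Diffusion img2img pipeline only runs about `num_inference_steps * strength` denoising steps. The helper name `effective_steps`, the checkpoint id, and the default values are assumptions for illustration only.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Approximate number of denoising steps img2img actually runs.

    Hypothetical helper mirroring how Stable Diffusion img2img trims the
    schedule: a low strength keeps more of the input image.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


def text2img_then_img2img(prompt: str, strength: float = 0.5, steps: int = 30):
    """Chain text-to-image into img2img without decoding in between."""
    # Heavy imports stay inside the function so the helper above can be
    # used without torch or diffusers installed.
    import torch
    from diffusers import AutoPipelineForImage2Image, AutoPipelineForText2Image

    # Assumption: checkpoint id and a CUDA device; swap in your own.
    text2img = AutoPipelineForText2Image.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Reuse the already loaded components instead of reloading the model.
    img2img = AutoPipelineForImage2Image.from_pipe(text2img)

    # output_type="latent" skips the VAE decode, so the hand-off between
    # the two pipelines stays in latent space.
    latents = text2img(
        prompt, num_inference_steps=steps, output_type="latent"
    ).images

    refined = img2img(
        prompt, image=latents, strength=strength, num_inference_steps=steps
    )
    return refined.images[0]
```

For example, `effective_steps(50, 0.8)` gives 40, while a strength of 1.0 runs the full schedule and largely ignores the input image.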
Inpainting
We explain what the input & mask image can/should look like.
We then go a bit deeper into the "strength" parameter again. We explain how the width & height are determined by the image itself.
We then explain how inpainting can be chained right after text-to-image or image-to-image - keeping everything in latent space and without reloading the whole model.
We explain how img2img and inpainting can be super similar (cc @yiyixuxu - we chatted about this yesterday)
We then explain how multiple inpainting models can be chained together for just a few steps (e.g. it's totally reasonable to use multiple differently fine-tuned SD checkpoints for image translation).
We show how Kandinsky & Stable Diffusion can be mixed.
We explain how to use controlnet for inpainting.
=> All examples here can use the `AutoPipelineForInpainting`.
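A minimal sketch of the mask convention for inpainting could look like the following: white pixels mark the region to repaint and black pixels are kept, and the mask has the same size as the input image. The helper `rect_mask`, the checkpoint id, and the CUDA device are assumptions for this sketch, not something the proposal specifies.

```python
def rect_mask(width: int, height: int, box: tuple) -> list:
    """Build a rectangular mask as rows of 0/255 values.

    Hypothetical helper: 255 (white) marks pixels to repaint and
    0 (black) marks pixels to keep. `box` is (left, top, right, bottom)
    with right and bottom exclusive.
    """
    left, top, right, bottom = box
    return [
        [255 if left <= x < right and top <= y < bottom else 0
         for x in range(width)]
        for y in range(height)
    ]


def inpaint(prompt: str, init_image, box: tuple, strength: float = 1.0):
    """Repaint the `box` region of a PIL image according to `prompt`."""
    # Heavy imports stay inside the function so the helper above can be
    # used without torch, Pillow or diffusers installed.
    import torch
    from diffusers import AutoPipelineForInpainting
    from PIL import Image

    # The mask takes its width and height from the input image.
    width, height = init_image.size
    rows = rect_mask(width, height, box)
    mask_image = Image.frombytes(
        "L", (width, height), bytes(v for row in rows for v in row)
    )

    # Assumption: checkpoint id and a CUDA device; swap in your own.
    pipe = AutoPipelineForInpainting.from_pretrained(
        "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
    ).to("cuda")

    result = pipe(
        prompt=prompt,
        image=init_image,
        mask_image=mask_image,
        strength=strength,
    )
    return result.images[0]
```

Building the mask in plain Python keeps the convention visible; in practice the mask would more likely be drawn or loaded as an image.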
I think it's worth making this a really detailed, easy-to-understand guide, making sure it works in Colab, and also thinking about creating some video content about it.
Thoughts? @pcuenca @williamberman @sayakpaul @yiyixuxu @DN6 @stevhliu @patil-suraj
If you like the idea, maybe @stevhliu and I could look more into this