From 51d0dd25859d3eb68725e23e27572fa1099bfd32 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?M=2E=20Tolga=20Cang=C3=B6z?= <46008593+standardAI@users.noreply.github.com>
Date: Thu, 2 Nov 2023 21:05:43 +0300
Subject: [PATCH] [Docs] Fix typos, improve, update at Using Diffusers' Loading & Hub page (#5584)

* Fix typos, improve, update
* Change to trending and apply some Grammarly fixes
* Grammarly fixes
* Update loading_adapters.md
* Update loading_adapters.md
* Update other-formats.md
* Update push_to_hub.md
* Update loading_adapters.md
* Update loading.md
* Update docs/source/en/using-diffusers/push_to_hub.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update schedulers.md
* Update docs/source/en/using-diffusers/loading.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/loading_adapters.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update A1111 LoRA files part
* Update other-formats.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---
 .../custom_pipeline_overview.md               |  2 +-
 docs/source/en/using-diffusers/loading.md     | 72 +++++++++++--------
 .../en/using-diffusers/loading_adapters.md    | 33 +++++----
 .../en/using-diffusers/loading_overview.md    |  2 +-
 .../en/using-diffusers/other-formats.md       | 40 ++++------
 docs/source/en/using-diffusers/push_to_hub.md | 28 +++++---
 docs/source/en/using-diffusers/schedulers.md  | 62 ++++++++++------
 .../en/using-diffusers/using_safetensors.md   | 20 ++++--
 8 files changed, 156 insertions(+), 103 deletions(-)

diff --git a/docs/source/en/using-diffusers/custom_pipeline_overview.md b/docs/source/en/using-diffusers/custom_pipeline_overview.md
index 11c06899af258..f602e73eb2c64 100644
--- a/docs/source/en/using-diffusers/custom_pipeline_overview.md
+++ b/docs/source/en/using-diffusers/custom_pipeline_overview.md
@@ -163,4 +163,4 @@ video_frames = pipeline(
).frames
```

-Here, notice the use of the `trust_remote_code` argument while initializing the pipeline. It is responsible for handling all the "magic" behind the scenes.
\ No newline at end of file
+Here, notice the use of the `trust_remote_code` argument while initializing the pipeline. It is responsible for handling all the "magic" behind the scenes.

diff --git a/docs/source/en/using-diffusers/loading.md b/docs/source/en/using-diffusers/loading.md
index 3fb11ac92c1f5..57348e849e6b9 100644
--- a/docs/source/en/using-diffusers/loading.md
+++ b/docs/source/en/using-diffusers/loading.md
@@ -29,11 +29,11 @@ This guide will show you how to load:

-💡 Skip to the [DiffusionPipeline explained](#diffusionpipeline-explained) section if you interested in learning in more detail about how the [`DiffusionPipeline`] class works.
+💡 Skip to the [DiffusionPipeline explained](#diffusionpipeline-explained) section if you are interested in learning in more detail about how the [`DiffusionPipeline`] class works.

-The [`DiffusionPipeline`] class is the simplest and most generic way to load any diffusion model from the [Hub](https://huggingface.co/models?library=diffusers). The [`DiffusionPipeline.from_pretrained`] method automatically detects the correct pipeline class from the checkpoint, downloads and caches all the required configuration and weight files, and returns a pipeline instance ready for inference.
+The [`DiffusionPipeline`] class is the simplest and most generic way to load the latest trending diffusion model from the [Hub](https://huggingface.co/models?library=diffusers&sort=trending). The [`DiffusionPipeline.from_pretrained`] method automatically detects the correct pipeline class from the checkpoint, downloads and caches all the required configuration and weight files, and returns a pipeline instance ready for inference.

```python
from diffusers import DiffusionPipeline

@@ -42,7 +42,7 @@
repo_id = "runwayml/stable-diffusion-v1-5"
pipe = DiffusionPipeline.from_pretrained(repo_id, use_safetensors=True)
```

-You can also load a checkpoint with it's specific pipeline class. The example above loaded a Stable Diffusion model; to get the same result, use the [`StableDiffusionPipeline`] class:
+You can also load a checkpoint with its specific pipeline class. The example above loaded a Stable Diffusion model; to get the same result, use the [`StableDiffusionPipeline`] class:

```python
from diffusers import StableDiffusionPipeline

@@ -51,7 +51,7 @@
repo_id = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionPipeline.from_pretrained(repo_id, use_safetensors=True)
```

-A checkpoint (such as [`CompVis/stable-diffusion-v1-4`](https://huggingface.co/CompVis/stable-diffusion-v1-4) or [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5)) may also be used for more than one task, like text-to-image or image-to-image. To differentiate what task you want to use the checkpoint for, you have to load it directly with it's corresponding task-specific pipeline class:
+A checkpoint (such as [`CompVis/stable-diffusion-v1-4`](https://huggingface.co/CompVis/stable-diffusion-v1-4) or [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5)) may also be used for more than one task, like text-to-image or image-to-image. To differentiate what task you want to use the checkpoint for, you have to load it directly with its corresponding task-specific pipeline class:

```python
from diffusers import StableDiffusionImg2ImgPipeline

@@ -103,12 +103,10 @@ Let's use the [`SchedulerMixin.from_pretrained`] method to replace the default [
Then you can pass the new [`EulerDiscreteScheduler`] instance to the `scheduler` argument in [`DiffusionPipeline`]:

```python
-from diffusers import DiffusionPipeline, EulerDiscreteScheduler, DPMSolverMultistepScheduler
+from diffusers import DiffusionPipeline, EulerDiscreteScheduler

repo_id = "runwayml/stable-diffusion-v1-5"
-
scheduler = EulerDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
-
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, scheduler=scheduler, use_safetensors=True)
```

@@ -121,6 +119,9 @@ from diffusers import DiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, safety_checker=None, use_safetensors=True)
+"""
+You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide by the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend keeping the safety filter enabled in all public-facing circumstances, disabling it only for use cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
+"""
```

### Reuse components across pipelines

@@ -163,10 +164,10 @@ stable_diffusion_img2img = StableDiffusionImg2ImgPipeline(
## Checkpoint variants

-A checkpoint variant is usually a checkpoint where it's weights are:
+A checkpoint variant is usually a checkpoint whose weights are:

- Stored in a different floating point type for lower precision and lower storage, such as [`torch.float16`](https://pytorch.org/docs/stable/tensors.html#data-types), because it only requires half the bandwidth and storage to download. You can't use this variant if you're continuing training or using a CPU.
-- Non-exponential mean averaged (EMA) weights which shouldn't be used for inference. You should use these to continue finetuning a model.
+- Non-EMA (exponential moving average) weights, which shouldn't be used for inference. You should use these to continue fine-tuning a model.

@@ -174,7 +175,7 @@ A checkpoint variant is usually a checkpoint where it's weights are:

-Otherwise, a variant is **identical** to the original checkpoint. They have exactly the same serialization format (like [Safetensors](./using_safetensors)), model structure, and weights have identical tensor shapes.
+Otherwise, a variant is **identical** to the original checkpoint. They have exactly the same serialization format (like [Safetensors](./using_safetensors)), model structure, and weights that have identical tensor shapes.

| **checkpoint type** | **weight name**                     | **argument for loading weights** |
|---------------------|-------------------------------------|----------------------------------|

@@ -202,7 +203,7 @@ stable_diffusion = DiffusionPipeline.from_pretrained(
)
```

-To save a checkpoint stored in a different floating point type or as a non-EMA variant, use the [`DiffusionPipeline.save_pretrained`] method and specify the `variant` argument. You should try and save a variant to the same folder as the original checkpoint, so you can load both from the same folder:
+To save a checkpoint stored in a different floating-point type or as a non-EMA variant, use the [`DiffusionPipeline.save_pretrained`] method and specify the `variant` argument. You should try to save a variant to the same folder as the original checkpoint, so you can load both from the same folder:

```python
from diffusers import DiffusionPipeline

@@ -247,7 +248,7 @@ The above example is therefore deprecated and won't be supported anymore for `di
If you load diffusers pipelines or models with `revision="fp16"` or `revision="non_ema"`,
-please make sure to update to code and use `variant="fp16"` or `variation="non_ema"` respectively
+please make sure to update the code and use `variant="fp16"` or `variant="non_ema"` respectively
instead.

@@ -255,7 +256,7 @@ instead.
## Models

-Models are loaded from the [`ModelMixin.from_pretrained`] method, which downloads and caches the latest version of the model weights and configurations. If the latest files are available in the local cache, [`~ModelMixin.from_pretrained`] reuses files in the cache instead of redownloading them.
+Models are loaded from the [`ModelMixin.from_pretrained`] method, which downloads and caches the latest version of the model weights and configurations. If the latest files are available in the local cache, [`~ModelMixin.from_pretrained`] reuses files in the cache instead of re-downloading them.

Models can be loaded from a subfolder with the `subfolder` argument.
For example, the model weights for `runwayml/stable-diffusion-v1-5` are stored in the [`unet`](https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/main/unet) subfolder:

@@ -281,9 +282,9 @@ You can also load and save model variants by specifying the `variant` argument i
from diffusers import UNet2DConditionModel

model = UNet2DConditionModel.from_pretrained(
-    "runwayml/stable-diffusion-v1-5", subfolder="unet", variant="non-ema", use_safetensors=True
+    "runwayml/stable-diffusion-v1-5", subfolder="unet", variant="non_ema", use_safetensors=True
)
-model.save_pretrained("./local-unet", variant="non-ema")
+model.save_pretrained("./local-unet", variant="non_ema")
```

## Schedulers

@@ -291,7 +292,7 @@ model.save_pretrained("./local-unet", variant="non_ema")
Schedulers are loaded from the [`SchedulerMixin.from_pretrained`] method, and unlike models, schedulers are **not parameterized** or **trained**; they are defined by a configuration file.
Loading schedulers does not consume any significant amount of memory and the same configuration file can be used for a variety of different schedulers.

-For example, the following schedulers are compatible with [`StableDiffusionPipeline`] which means you can load the same scheduler configuration file in any of these classes:
+For example, the following schedulers are compatible with [`StableDiffusionPipeline`], which means you can load the same scheduler configuration file in any of these classes:

```python
from diffusers import StableDiffusionPipeline
@@ -300,8 +301,8 @@ from diffusers import (
    DDIMScheduler,
    PNDMScheduler,
    LMSDiscreteScheduler,
-    EulerDiscreteScheduler,
    EulerAncestralDiscreteScheduler,
+    EulerDiscreteScheduler,
    DPMSolverMultistepScheduler,
)

@@ -324,9 +325,9 @@ pipeline = StableDiffusionPipeline.from_pretrained(repo_id, scheduler=dpm, use_s
As a class method, [`DiffusionPipeline.from_pretrained`] is responsible for two things:

- Download the latest version of the folder structure required for inference and cache it. If the latest folder structure is available in the local cache, [`DiffusionPipeline.from_pretrained`] reuses the cache and won't redownload the files.
-- Load the cached weights into the correct pipeline [class](./api/pipelines/overview#diffusers-summary) - retrieved from the `model_index.json` file - and return an instance of it.
+- Load the cached weights into the correct pipeline [class](../api/pipelines/overview#diffusers-summary) - retrieved from the `model_index.json` file - and return an instance of it.

-The pipelines underlying folder structure corresponds directly with their class instances. For example, the [`StableDiffusionPipeline`] corresponds to the folder structure in [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5).
+The pipelines' underlying folder structure corresponds directly with their class instances. For example, the [`StableDiffusionPipeline`] corresponds to the folder structure in [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5).

```python
from diffusers import DiffusionPipeline

repo_id = "runwayml/stable-diffusion-v1-5"
pipeline = DiffusionPipeline.from_pretrained(repo_id, use_safetensors=True)
print(pipeline)
```

You'll see pipeline is an instance of [`StableDiffusionPipeline`], which consists of seven components:

-- `"feature_extractor"`: a [`~transformers.CLIPFeatureExtractor`] from 🤗 Transformers.
+- `"feature_extractor"`: a [`~transformers.CLIPImageProcessor`] from 🤗 Transformers.
- `"safety_checker"`: a [component](https://github.com/huggingface/diffusers/blob/e55687e1e15407f60f32242027b7bb8170e58266/src/diffusers/pipelines/stable_diffusion/safety_checker.py#L32) for screening against harmful content.
- `"scheduler"`: an instance of [`PNDMScheduler`].
- `"text_encoder"`: a [`~transformers.CLIPTextModel`] from 🤗 Transformers.
- `"tokenizer"`: a [`~transformers.CLIPTokenizer`] from 🤗 Transformers.
- `"unet"`: an instance of [`UNet2DConditionModel`].
-- `"vae"` an instance of [`AutoencoderKL`].
+- `"vae"`: an instance of [`AutoencoderKL`].

```json
StableDiffusionPipeline {
  "feature_extractor": [
    "transformers",
    "CLIPImageProcessor"
  ],
  "safety_checker": [
    "stable_diffusion",
    "StableDiffusionSafetyChecker"
  ],
  "scheduler": [
    "diffusers",
    "PNDMScheduler"
  ],
  "text_encoder": [
    "transformers",
    "CLIPTextModel"
  ],
  "tokenizer": [
    "transformers",
    "CLIPTokenizer"
  ],
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}
```

-Compare the components of the pipeline instance to the [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) folder structure, and you'll see there is a separate folder for each of the components in the repository:
+Compare the components of the pipeline instance to the [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/main) folder structure, and you'll see there is a separate folder for each of the components in the repository:

```
.
├── feature_extractor
│   └── preprocessor_config.json
├── model_index.json
├── safety_checker
│   ├── config.json
-│   └── pytorch_model.bin
+│   ├── model.fp16.safetensors
+│   ├── model.safetensors
+│   ├── pytorch_model.bin
+│   └── pytorch_model.fp16.bin
├── scheduler
│   └── scheduler_config.json
├── text_encoder
│   ├── config.json
-│   └── pytorch_model.bin
+│   ├── model.fp16.safetensors
+│   ├── model.safetensors
+│   ├── pytorch_model.bin
+│   └── pytorch_model.fp16.bin
├── tokenizer
│   ├── merges.txt
│   ├── special_tokens_map.json
@@ -402,9 +409,17 @@ Compare the components of the pipeline instance to the [`runwayml/stable-diffusi
├── unet
│   ├── config.json
│   ├── diffusion_pytorch_model.bin
-└── vae
-    ├── config.json
-    ├── diffusion_pytorch_model.bin
+│   ├── diffusion_pytorch_model.fp16.bin
+│   ├── diffusion_pytorch_model.fp16.safetensors
+│   ├── diffusion_pytorch_model.non_ema.bin
+│   ├── diffusion_pytorch_model.non_ema.safetensors
+│   └── diffusion_pytorch_model.safetensors
+└── vae
+    ├── config.json
+    ├── diffusion_pytorch_model.bin
+    ├── diffusion_pytorch_model.fp16.bin
+    ├── diffusion_pytorch_model.fp16.safetensors
+    └── diffusion_pytorch_model.safetensors
```

You can access each of the components of the pipeline as an attribute to view its configuration:

@@ -424,10 +439,11 @@ CLIPTokenizer(
        "unk_token": AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
        "pad_token": "<|endoftext|>",
    },
+    clean_up_tokenization_spaces=True
)
```

-Every pipeline expects a `model_index.json` file that tells the [`DiffusionPipeline`]:
+Every pipeline expects a [`model_index.json`](https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/model_index.json) file that tells the [`DiffusionPipeline`]:

- which pipeline class to load from `_class_name`
- which version of 🧨 Diffusers was used to create the model in `_diffusers_version`

diff --git a/docs/source/en/using-diffusers/loading_adapters.md b/docs/source/en/using-diffusers/loading_adapters.md
index 0514688721d1d..8f6bf85da318f 100644
--- a/docs/source/en/using-diffusers/loading_adapters.md
+++ b/docs/source/en/using-diffusers/loading_adapters.md
@@ -14,13 +14,13 @@ specific language governing permissions and limitations under the License.

[[open-in-colab]]

-There are several [training](../training/overview) techniques for personalizing diffusion models to generate images of a specific subject or images in certain styles. Each of these training methods produce a different type of adapter. Some of the adapters generate an entirely new model, while other adapters only modify a smaller set of embeddings or weights. This means the loading process for each adapter is also different.
+There are several [training](../training/overview) techniques for personalizing diffusion models to generate images of a specific subject or images in certain styles. Each of these training methods produces a different type of adapter. Some of the adapters generate an entirely new model, while other adapters only modify a smaller set of embeddings or weights. This means the loading process for each adapter is also different.

This guide will show you how to load DreamBooth, textual inversion, and LoRA weights.

-Feel free to browse the [Stable Diffusion Conceptualizer](https://huggingface.co/spaces/sd-concepts-library/stable-diffusion-conceptualizer), [LoRA the Explorer](multimodalart/LoraTheExplorer), and the [Diffusers Models Gallery](https://huggingface.co/spaces/huggingface-projects/diffusers-gallery) for checkpoints and embeddings to use.
+Feel free to browse the [Stable Diffusion Conceptualizer](https://huggingface.co/spaces/sd-concepts-library/stable-diffusion-conceptualizer), [LoRA the Explorer](https://huggingface.co/spaces/multimodalart/LoraTheExplorer), and the [Diffusers Models Gallery](https://huggingface.co/spaces/huggingface-projects/diffusers-gallery) for checkpoints and embeddings to use.

@@ -37,6 +37,7 @@ import torch

pipeline = AutoPipelineForText2Image.from_pretrained("sd-dreambooth-library/herge-style", torch_dtype=torch.float16).to("cuda")
prompt = "A cute herge_style brown bear eating a slice of pizza, stunning color scheme, masterpiece, illustration"
image = pipeline(prompt).images[0]
+image
```
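Because a DreamBooth checkpoint is a complete pipeline, you can also cache a copy locally and reload it from disk instead of the Hub. This is a minimal sketch, not part of the original guide, and the local folder name is just an illustration:

```py
# save the personalized pipeline to a local folder (hypothetical path)
pipeline.save_pretrained("./herge-style-pipeline")

# later, load it back from disk instead of downloading it again
from diffusers import AutoPipelineForText2Image
import torch

pipeline = AutoPipelineForText2Image.from_pretrained("./herge-style-pipeline", torch_dtype=torch.float16).to("cuda")
```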
@@ -45,7 +46,7 @@ image = pipeline(prompt).images[0]

## Textual inversion

-[Textual inversion](https://textual-inversion.github.io/) is very similar to DreamBooth and it can also personalize a diffusion model to generate certain concepts (styles, objects) from just a few images. This method works by training and finding new embeddings that represent the images you provide with a special word in the prompt. As a result, the diffusion model weights stays the same and the training process produces a relatively tiny (a few KBs) file.
+[Textual inversion](https://textual-inversion.github.io/) is very similar to DreamBooth and it can also personalize a diffusion model to generate certain concepts (styles, objects) from just a few images. This method works by training and finding new embeddings that represent the images you provide with a special word in the prompt. As a result, the diffusion model weights stay the same and the training process produces a relatively tiny (a few KBs) file.

Because textual inversion creates embeddings, it cannot be used on its own like DreamBooth and requires another model.

@@ -62,13 +63,14 @@ Now you can load the textual inversion embeddings with the [`~loaders.TextualInv
pipeline.load_textual_inversion("sd-concepts-library/gta5-artwork")
prompt = "A cute brown bear eating a slice of pizza, stunning color scheme, masterpiece, illustration, <gta5-artwork> style"
image = pipeline(prompt).images[0]
+image
```
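By default the trigger word is the one stored with the embedding (here `<gta5-artwork>`). The `token` argument of `load_textual_inversion` lets you register the embedding under a trigger word of your own choosing — a short sketch, where the custom token name is a made-up example:

```py
# register the same embedding under a custom trigger word (hypothetical token)
pipeline.load_textual_inversion("sd-concepts-library/gta5-artwork", token="<my-gta-style>")
prompt = "A cute brown bear eating a slice of pizza, stunning color scheme, masterpiece, illustration, <my-gta-style> style"
image = pipeline(prompt).images[0]
```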
-Textual inversion can also be trained on undesirable things to create *negative embeddings* to discourage a model from generating images with those undesirable things like blurry images or extra fingers on a hand. This can be a easy way to quickly improve your prompt. You'll also load the embeddings with [`~loaders.TextualInversionLoaderMixin.load_textual_inversion`], but this time, you'll need two more parameters:
+Textual inversion can also be trained on undesirable things to create *negative embeddings* to discourage a model from generating images with those undesirable things like blurry images or extra fingers on a hand. This can be an easy way to quickly improve your prompt. You'll also load the embeddings with [`~loaders.TextualInversionLoaderMixin.load_textual_inversion`], but this time, you'll need two more parameters:

- `weight_name`: specifies the weight file to load if the file was saved in the 🤗 Diffusers format with a specific name or if the file is stored in the A1111 format
- `token`: specifies the special word to use in the prompt to trigger the embeddings

@@ -88,6 +90,7 @@ prompt = "A cute brown bear eating a slice of pizza, stunning color scheme, mast
negative_prompt = "EasyNegative"

image = pipeline(prompt, negative_prompt=negative_prompt, num_inference_steps=50).images[0]
+image
```
@@ -119,6 +122,7 @@ Then use the [`~loaders.LoraLoaderMixin.load_lora_weights`] method to load the [
pipeline.load_lora_weights("ostris/super-cereal-sdxl-lora", weight_name="cereal_box_sdxl_v1.safetensors")
prompt = "bears, pizza bites"
image = pipeline(prompt).images[0]
+image
```
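LoRA weights loaded this way can also be removed again without rebuilding the whole pipeline. A minimal sketch using the `unload_lora_weights` method (this call isn't part of the original example):

```py
# remove the currently loaded LoRA weights to get back the base model,
# e.g. before trying out a different LoRA checkpoint
pipeline.unload_lora_weights()
```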
@@ -142,6 +146,7 @@ pipeline.unet.load_attn_procs("jbilcke-hf/sdxl-cinematic-1", weight_name="pytorc
# use cnmt in the prompt to trigger the LoRA
prompt = "A cute cnmt eating a slice of pizza, stunning color scheme, masterpiece, illustration"
image = pipeline(prompt).images[0]
+image
```
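Attention processors loaded into the UNet can be written back to disk with the UNet's `save_attn_procs` method — a short sketch with a hypothetical output folder:

```py
# persist the UNet's loaded LoRA attention processors locally (hypothetical path)
pipeline.unet.save_attn_procs("./sdxl-cinematic-1-lora")
```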
@@ -184,7 +189,7 @@ pipeline = StableDiffusionXLPipeline.from_pretrained(
).to("cuda")
```

-Then load the LoRA checkpoint and fuse it with the original weights. The `lora_scale` parameter controls how much to scale the output by with the LoRA weights. It is important to make the `lora_scale` adjustments in the [`~loaders.LoraLoaderMixin.fuse_lora`] method because it won't work if you try to pass `scale` to the `cross_attention_kwargs` in the pipeline.
+Next, load the LoRA checkpoint and fuse it with the original weights. The `lora_scale` parameter controls how much to scale the output by with the LoRA weights. It is important to make the `lora_scale` adjustments in the [`~loaders.LoraLoaderMixin.fuse_lora`] method because it won't work if you try to pass `scale` to the `cross_attention_kwargs` in the pipeline.

If you need to reset the original model weights for any reason (use a different `lora_scale`), you should use the [`~loaders.LoraLoaderMixin.unfuse_lora`] method.

@@ -205,7 +210,7 @@ pipeline.fuse_lora(lora_scale=0.7)

-You can't unfuse multiple LoRA checkpoints so if you need to reset the model to its original weights, you'll need to reload it.
+You can't unfuse multiple LoRA checkpoints, so if you need to reset the model to its original weights, you'll need to reload it.

@@ -214,13 +219,14 @@ Now you can generate an image that uses the weights from both LoRAs:

```py
prompt = "A cute brown bear eating a slice of pizza, stunning color scheme, masterpiece, illustration"
image = pipeline(prompt).images[0]
+image
```

### 🤗 PEFT

-Read the [Inference with 🤗 PEFT](../tutorials/using_peft_for_inference) tutorial to learn more its integration with 🤗 Diffusers and how you can easily work with and juggle multiple adapters.
+Read the [Inference with 🤗 PEFT](../tutorials/using_peft_for_inference) tutorial to learn more about its integration with 🤗 Diffusers and how you can easily work with and juggle multiple adapters.

You'll need to install 🤗 Diffusers and PEFT from source to run the example in this section.
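The `set_adapters` call below refers to two adapters named `"ikea"` and `"cereal"`. For context, here is a hedged sketch of how such named adapters are loaded with the PEFT integration — the repository and weight file names are assumptions for illustration:

```py
# load two LoRAs and give each one a name so they can be activated together
pipeline.load_lora_weights(
    "ostris/ikea-instructions-lora-sdxl", weight_name="ikea_instructions_xl_v1_5.safetensors", adapter_name="ikea"
)
pipeline.load_lora_weights(
    "ostris/super-cereal-sdxl-lora", weight_name="cereal_box_sdxl_v1.safetensors", adapter_name="cereal"
)
```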
@@ -241,11 +247,12 @@ Now use the [`~loaders.UNet2DConditionLoadersMixin.set_adapters`] to activate bo
pipeline.set_adapters(["ikea", "cereal"], adapter_weights=[0.7, 0.5])
```

-Then generate an image:
+Then, generate an image:

```py
prompt = "A cute brown bear eating a slice of pizza, stunning color scheme, masterpiece, illustration"
image = pipeline(prompt, num_inference_steps=30, cross_attention_kwargs={"scale": 1.0}).images[0]
+image
```

### Kohya and TheLastBen

@@ -254,7 +261,7 @@ Other popular LoRA trainers from the community include those by [Kohya](https://
Let's download the [Blueprintify SD XL 1.0](https://civitai.com/models/150986/blueprintify-sd-xl-10) checkpoint from [Civitai](https://civitai.com/):

-```py
+```sh
!wget https://civitai.com/api/download/models/168776 -O blueprintify-sd-xl-10.safetensors
```

@@ -264,7 +271,7 @@ Load the LoRA checkpoint with the [`~loaders.LoraLoaderMixin.load_lora_weights`]
from diffusers import AutoPipelineForText2Image
import torch

-pipeline = AutoPipelineForText2Image.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0").to("cuda")
+pipeline = AutoPipelineForText2Image.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16).to("cuda")
pipeline.load_lora_weights("path/to/weights", weight_name="blueprintify-sd-xl-10.safetensors")
```

@@ -274,13 +281,14 @@ Generate an image:
# use bl3uprint in the prompt to trigger the LoRA
prompt = "bl3uprint, a highly detailed blueprint of the eiffel tower, explaining how to build all parts, many txt, blueprint grid backdrop"
image = pipeline(prompt).images[0]
+image
```

Some limitations of using Kohya LoRAs with 🤗 Diffusers include:

-- Images may not look like those generated by UIs - like ComfyUI - for multiple reasons which are explained [here](https://github.com/huggingface/diffusers/pull/4287/#issuecomment-1655110736).
+- Images may not look like those generated by UIs - like ComfyUI - for multiple reasons, which are explained [here](https://github.com/huggingface/diffusers/pull/4287/#issuecomment-1655110736).
- [LyCORIS checkpoints](https://github.com/KohakuBlueleaf/LyCORIS) aren't fully supported. The [`~loaders.LoraLoaderMixin.load_lora_weights`] method loads LyCORIS checkpoints with LoRA and LoCon modules, but Hada and LoKR are not supported.

@@ -297,4 +305,5 @@ pipeline.load_lora_weights("TheLastBen/William_Eggleston_Style_SDXL", weight_nam
# use by william eggleston in the prompt to trigger the LoRA
prompt = "a house by william eggleston, sunrays, beautiful, sunlight, sunrays, beautiful"
image = pipeline(prompt=prompt).images[0]
-```
\ No newline at end of file
+image
+```

diff --git a/docs/source/en/using-diffusers/loading_overview.md b/docs/source/en/using-diffusers/loading_overview.md
index df870505219bb..b36fdb77e6dde 100644
--- a/docs/source/en/using-diffusers/loading_overview.md
+++ b/docs/source/en/using-diffusers/loading_overview.md
@@ -14,4 +14,4 @@ specific language governing permissions and limitations under the License.

🧨 Diffusers offers many pipelines, models, and schedulers for generative tasks. To make loading these components as simple as possible, we provide a single and unified method - `from_pretrained()` - that loads any of these components from either the Hugging Face [Hub](https://huggingface.co/models?library=diffusers&sort=downloads) or your local machine. Whenever you load a pipeline or model, the latest files are automatically downloaded and cached so you can quickly reuse them next time without redownloading the files.
-This section will show you everything you need to know about loading pipelines, how to load different components in a pipeline, how to load checkpoint variants, and how to load community pipelines. You'll also learn how to load schedulers and compare the speed and quality trade-offs of using different schedulers. Finally, you'll see how to convert and load KerasCV checkpoints so you can use them in PyTorch with 🧨 Diffusers.
\ No newline at end of file
+This section will show you everything you need to know about loading pipelines, how to load different components in a pipeline, how to load checkpoint variants, and how to load community pipelines. You'll also learn how to load schedulers and compare the speed and quality trade-offs of using different schedulers. Finally, you'll see how to convert and load KerasCV checkpoints so you can use them in PyTorch with 🧨 Diffusers.

diff --git a/docs/source/en/using-diffusers/other-formats.md b/docs/source/en/using-diffusers/other-formats.md
index c2f10ff796375..84945a6da87a7 100644
--- a/docs/source/en/using-diffusers/other-formats.md
+++ b/docs/source/en/using-diffusers/other-formats.md
@@ -14,7 +14,7 @@ specific language governing permissions and limitations under the License.

[[open-in-colab]]

-Stable Diffusion models are available in different formats depending on the framework they're trained and saved with, and where you download them from. Converting these formats for use in 🤗 Diffusers allows you to use all the features supported by the library, such as [using different schedulers](schedulers) for inference, [building your custom pipeline](write_own_pipeline), and a variety of techniques and methods for [optimizing inference speed](./optimization/opt_overview).
+Stable Diffusion models are available in different formats depending on the framework they're trained and saved with, and where you download them from. Converting these formats for use in 🤗 Diffusers allows you to use all the features supported by the library, such as [using different schedulers](schedulers) for inference, [building your custom pipeline](write_own_pipeline), and a variety of techniques and methods for [optimizing inference speed](../optimization/opt_overview).

@@ -28,7 +28,7 @@ This guide will show you how to convert other Stable Diffusion formats to be com

The checkpoint - or `.ckpt` - format is commonly used to store and save models. The `.ckpt` file contains the entire model and is typically several GBs in size. While you can load and use a `.ckpt` file directly with the [`~StableDiffusionPipeline.from_single_file`] method, it is generally better to convert the `.ckpt` file to 🤗 Diffusers so both formats are available.

-There are two options for converting a `.ckpt` file; use a Space to convert the checkpoint or convert the `.ckpt` file with a script.
+There are two options for converting a `.ckpt` file: use a Space to convert the checkpoint or convert the `.ckpt` file with a script.

### Convert with a Space

@@ -116,7 +116,7 @@ pipeline = DiffusionPipeline.from_pretrained(
)
```

-Then you can generate an image like:
+Then, you can generate an image like:

```py
from diffusers import DiffusionPipeline

@@ -136,53 +136,41 @@ image = pipeline(prompt, num_inference_steps=50).images[0]
```

[Automatic1111](https://github.com/AUTOMATIC1111/stable-diffusion-webui) (A1111) is a popular web UI for Stable Diffusion that supports model sharing platforms like [Civitai](https://civitai.com/).
Models trained with the Low-Rank Adaptation (LoRA) technique are especially popular because they're fast to train and have a much smaller file size than a fully finetuned model. 🤗 Diffusers supports loading A1111 LoRA checkpoints with [`~loaders.LoraLoaderMixin.load_lora_weights`]:

```py
-from diffusers import DiffusionPipeline, UniPCMultistepScheduler
+from diffusers import StableDiffusionXLPipeline
import torch

-pipeline = DiffusionPipeline.from_pretrained(
-    "andite/anything-v4.0", torch_dtype=torch.float16, safety_checker=None
+pipeline = StableDiffusionXLPipeline.from_pretrained(
+    "Lykon/dreamshaper-xl-1-0", torch_dtype=torch.float16, variant="fp16"
).to("cuda")
-pipeline.scheduler = UniPCMultistepScheduler.from_config(pipeline.scheduler.config)
```

-Download a LoRA checkpoint from Civitai; this example uses the [Howls Moving Castle,Interior/Scenery LoRA (Ghibli Stlye)](https://civitai.com/models/14605?modelVersionId=19998) checkpoint, but feel free to try out any LoRA checkpoint!
+Download a LoRA checkpoint from Civitai; this example uses the [Blueprintify SD XL 1.0](https://civitai.com/models/150986/blueprintify-sd-xl-10) checkpoint, but feel free to try out any LoRA checkpoint!

```py
# uncomment to download the safetensor weights
-#!wget https://civitai.com/api/download/models/19998 -O howls_moving_castle.safetensors
+#!wget https://civitai.com/api/download/models/168776 -O blueprintify.safetensors
```

Load the LoRA checkpoint into the pipeline with the [`~loaders.LoraLoaderMixin.load_lora_weights`] method:

```py
-pipeline.load_lora_weights(".", weight_name="howls_moving_castle.safetensors")
+pipeline.load_lora_weights(".", weight_name="blueprintify.safetensors")
```

Now you can use the pipeline to generate images:

```py
-prompt = "masterpiece, illustration, ultra-detailed, cityscape, san francisco, golden gate bridge, california, bay area, in the snow, beautiful detailed starry sky"
+prompt = "bl3uprint, a highly detailed blueprint of the empire state building, explaining how to build all parts, many txt, blueprint grid backdrop"
negative_prompt = "lowres, cropped, worst quality, low quality, normal quality, artifacts, signature, watermark, username, blurry, more than one bridge, bad architecture"

-images = pipeline(
+image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
-    width=512,
-    height=512,
-    num_inference_steps=25,
-    num_images_per_prompt=4,
    generator=torch.manual_seed(0),
-).images
-```
-
-Display the images:
-
-```py
-from diffusers.utils import make_image_grid
-
-make_image_grid(images, 2, 2)
+).images[0]
+image
```
diff --git a/docs/source/en/using-diffusers/push_to_hub.md b/docs/source/en/using-diffusers/push_to_hub.md
index 4683860317680..58598c3bc443c 100644
--- a/docs/source/en/using-diffusers/push_to_hub.md
+++ b/docs/source/en/using-diffusers/push_to_hub.md
@@ -1,3 +1,15 @@
+<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
# Push files to the Hub

[[open-in-colab]]

@@ -20,7 +32,7 @@ notebook_login()

## Models

-To push a model to the Hub, call [`~diffusers.utils.PushToHubMixin.push_to_hub`] and specfiy the repository id of the model to be stored on the Hub:
+To push a model to the Hub, call [`~diffusers.utils.PushToHubMixin.push_to_hub`] and specify the repository id of the model to be stored on the Hub:

```py
from diffusers import ControlNetModel

@@ -36,7 +48,7 @@ controlnet = ControlNetModel(
)
controlnet.push_to_hub("my-controlnet-model")
```

-For model's, you can also specify the [*variant*](loading#checkpoint-variants) of the weights to push to the Hub. For example, to push `fp16` weights:
+For models, you can also specify the [*variant*](loading#checkpoint-variants) of the weights to push to the Hub. For example, to push `fp16` weights:

```py
controlnet.push_to_hub("my-controlnet-model", variant="fp16")
```

@@ -52,7 +64,7 @@ model = ControlNetModel.from_pretrained("your-namespace/my-controlnet-model")

## Scheduler

-To push a scheduler to the Hub, call [`~diffusers.utils.PushToHubMixin.push_to_hub`] and specfiy the repository id of the scheduler to be stored on the Hub:
+To push a scheduler to the Hub, call [`~diffusers.utils.PushToHubMixin.push_to_hub`] and specify the repository id of the scheduler to be stored on the Hub:

```py
from diffusers import DDIMScheduler

@@ -159,13 +171,13 @@ pipeline = StableDiffusionPipeline.from_pretrained("your-namespace/my-pipeline")
Set `private=True` in the [`~diffusers.utils.PushToHubMixin.push_to_hub`] function to keep your model, scheduler, or pipeline files private:

```py
-controlnet.push_to_hub("my-controlnet-model", private=True)
+controlnet.push_to_hub("my-controlnet-model-private", private=True)
```

-Private repositories are only visible to you, and other users won't be able to clone the repository and your repository won't appear in search results. Even if a user has the URL to your private repository, they'll receive a `404 - Repo not found error.`
+Private repositories are only visible to you, and other users won't be able to clone the repository and your repository won't appear in search results. Even if a user has the URL to your private repository, they'll receive a `404 - Sorry, we can't find the page you are looking for.`

-To load a model, scheduler, or pipeline from a private or gated repositories, set `use_auth_token=True`:
+To load a model, scheduler, or pipeline from private or gated repositories, set `use_auth_token=True`:

```py
-model = ControlNet.from_pretrained("your-namespace/my-controlnet-model", use_auth_token=True)
-```
\ No newline at end of file
+model = ControlNetModel.from_pretrained("your-namespace/my-controlnet-model-private", use_auth_token=True)
+```

diff --git a/docs/source/en/using-diffusers/schedulers.md b/docs/source/en/using-diffusers/schedulers.md
index c791b47b78327..9a8dd29ec2ea5 100644
--- a/docs/source/en/using-diffusers/schedulers.md
+++ b/docs/source/en/using-diffusers/schedulers.md
@@ -15,13 +15,13 @@ specific language governing permissions and limitations under the License.

[[open-in-colab]]

Diffusion pipelines are inherently a collection of diffusion models and schedulers that are partly independent from each other.
This means that one is able to switch out parts of the pipeline to better customize
-a pipeline to one's use case. The best example of this is the [Schedulers](../api/schedulers/overview.md).
+a pipeline to one's use case. The best example of this is the [Schedulers](../api/schedulers/overview).

Whereas diffusion models usually simply define the forward pass from noise to a less noisy sample,
schedulers define the whole denoising process, *i.e.*:
- How many denoising steps?
- Stochastic or deterministic?
-- What algorithm to use to find the denoised sample
+- What algorithm to use to find the denoised sample?

They can be quite complex and often define a trade-off between **denoising speed** and **denoising quality**. It is extremely difficult to measure quantitatively which scheduler works best for a given diffusion pipeline, so it is often recommended to simply try out which works best.

@@ -63,7 +63,7 @@ pipeline.scheduler
```
PNDMScheduler {
  "_class_name": "PNDMScheduler",
-  "_diffusers_version": "0.8.0.dev0",
+  "_diffusers_version": "0.21.4",
  "beta_end": 0.012,
  "beta_schedule": "scaled_linear",
  "beta_start": 0.00085,
@@ -72,6 +72,7 @@ PNDMScheduler {
  "set_alpha_to_one": false,
  "skip_prk_steps": true,
  "steps_offset": 1,
+  "timestep_spacing": "leading",
  "trained_betas": null
}
```

@@ -101,7 +102,7 @@ image

## Changing the scheduler

-Now we show how easy it is to change the scheduler of a pipeline. Every scheduler has a property [`SchedulerMixin.compatibles`]
+Now we show how easy it is to change the scheduler of a pipeline. Every scheduler has a property [`~SchedulerMixin.compatibles`]
which defines all compatible schedulers. You can take a look at all available, compatible schedulers for the Stable Diffusion pipeline as follows.

```python
pipeline.scheduler.compatibles
```

@@ -110,27 +111,40 @@ pipeline.scheduler.compatibles
**Output**:
```
-[diffusers.schedulers.scheduling_lms_discrete.LMSDiscreteScheduler,
+[diffusers.utils.dummy_torch_and_torchsde_objects.DPMSolverSDEScheduler,
+ diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler,
+ diffusers.schedulers.scheduling_lms_discrete.LMSDiscreteScheduler,
 diffusers.schedulers.scheduling_ddim.DDIMScheduler,
+ diffusers.schedulers.scheduling_ddpm.DDPMScheduler,
+ diffusers.schedulers.scheduling_heun_discrete.HeunDiscreteScheduler,
 diffusers.schedulers.scheduling_dpmsolver_multistep.DPMSolverMultistepScheduler,
- diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler,
+ diffusers.schedulers.scheduling_deis_multistep.DEISMultistepScheduler,
 diffusers.schedulers.scheduling_pndm.PNDMScheduler,
- diffusers.schedulers.scheduling_ddpm.DDPMScheduler,
- diffusers.schedulers.scheduling_euler_ancestral_discrete.EulerAncestralDiscreteScheduler]
+ diffusers.schedulers.scheduling_euler_ancestral_discrete.EulerAncestralDiscreteScheduler,
+ diffusers.schedulers.scheduling_unipc_multistep.UniPCMultistepScheduler,
+ diffusers.schedulers.scheduling_k_dpm_2_discrete.KDPM2DiscreteScheduler,
+ diffusers.schedulers.scheduling_dpmsolver_singlestep.DPMSolverSinglestepScheduler,
+ diffusers.schedulers.scheduling_k_dpm_2_ancestral_discrete.KDPM2AncestralDiscreteScheduler]
```

Cool, lots of schedulers to look at. Feel free to have a look at their respective class definitions:

-- [`LMSDiscreteScheduler`],
-- [`DDIMScheduler`],
-- [`DPMSolverMultistepScheduler`],
-- [`EulerDiscreteScheduler`],
-- [`PNDMScheduler`],
-- [`DDPMScheduler`],
-- [`EulerAncestralDiscreteScheduler`].
+- [`EulerDiscreteScheduler`],
+- [`LMSDiscreteScheduler`],
+- [`DDIMScheduler`],
+- [`DDPMScheduler`],
+- [`HeunDiscreteScheduler`],
+- [`DPMSolverMultistepScheduler`],
+- [`DEISMultistepScheduler`],
+- [`PNDMScheduler`],
+- [`EulerAncestralDiscreteScheduler`],
+- [`UniPCMultistepScheduler`],
+- [`KDPM2DiscreteScheduler`],
+- [`DPMSolverSinglestepScheduler`],
+- [`KDPM2AncestralDiscreteScheduler`].

We will now compare the input prompt with all other schedulers. To change the scheduler of the pipeline you can make use of the
-convenient [`ConfigMixin.config`] property in combination with the [`ConfigMixin.from_config`] function.
+convenient [`~ConfigMixin.config`] property in combination with the [`~ConfigMixin.from_config`] function.

```python
pipeline.scheduler.config
```

returns a dictionary of the configuration of the scheduler:

**Output**:
-```
+```py
FrozenDict([('num_train_timesteps', 1000),
            ('beta_start', 0.00085),
            ('beta_end', 0.012),
@@ -147,9 +161,12 @@ FrozenDict([('num_train_timesteps', 1000),
            ('trained_betas', None),
            ('skip_prk_steps', True),
            ('set_alpha_to_one', False),
+            ('prediction_type', 'epsilon'),
+            ('timestep_spacing', 'leading'),
            ('steps_offset', 1),
+            ('_use_default_values', ['timestep_spacing', 'prediction_type']),
            ('_class_name', 'PNDMScheduler'),
-            ('_diffusers_version', '0.8.0.dev0'),
+            ('_diffusers_version', '0.21.4'),
            ('clip_sample', False)])
```

@@ -182,7 +199,7 @@ If you are a JAX/Flax user, please check [this section](#changing-the-scheduler-

## Compare schedulers

So far we have tried running the stable diffusion pipeline with two schedulers: [`PNDMScheduler`] and [`DDIMScheduler`].
-A number of better schedulers have been released that can be run with much fewer steps, let's compare them here:
+A number of better schedulers have been released that can be run with much fewer steps; let's compare them here:

[`LMSDiscreteScheduler`] usually leads to better results:

```python
from diffusers import LMSDiscreteScheduler

pipeline.scheduler = LMSDiscreteScheduler.from_config(pipeline.scheduler.config)
generator = torch.Generator(device="cuda").manual_seed(8)
image = pipeline(prompt, generator=generator).images[0]
image
```

@@ -241,8 +258,7 @@ image

-At the time of writing this doc [`DPMSolverMultistepScheduler`] gives arguably the best speed/quality trade-off and can be run with as little
-as 20 steps.
+[`DPMSolverMultistepScheduler`] gives a reasonable speed/quality trade-off and can be run with as little as 20 steps.

```python
from diffusers import DPMSolverMultistepScheduler

pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)
generator = torch.Generator(device="cuda").manual_seed(8)
image = pipeline(prompt, generator=generator).images[0]
image
```
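To check the 20-step claim yourself, pass `num_inference_steps` explicitly. A minimal sketch, reusing the `pipeline`, `prompt`, and seed from the snippets above (the explicit step count is not part of the original example):

```python
# run DPMSolverMultistepScheduler with only 20 denoising steps
generator = torch.Generator(device="cuda").manual_seed(8)
image = pipeline(prompt, generator=generator, num_inference_steps=20).images[0]
image
```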

-As you can see most images look very similar and are arguably of very similar quality. It often really depends on the specific use case which scheduler to choose. A good approach is always to run multiple different
+As you can see, most images look very similar and are arguably of very similar quality. It often really depends on the specific use case which scheduler to choose. A good approach is always to run multiple different
schedulers to compare results.

## Changing the Scheduler in Flax

-If you are a JAX/Flax user, you can also change the default pipeline scheduler. This is a complete example of how to run inference using the Flax Stable Diffusion pipeline and the super-fast [DDPM-Solver++ scheduler](../api/schedulers/multistep_dpm_solver):
+If you are a JAX/Flax user, you can also change the default pipeline scheduler. This is a complete example of how to run inference using the Flax Stable Diffusion pipeline and the super-fast [DPM-Solver++ scheduler](../api/schedulers/multistep_dpm_solver):

```Python
import jax

diff --git a/docs/source/en/using-diffusers/using_safetensors.md b/docs/source/en/using-diffusers/using_safetensors.md
index 2f47eb08cb839..3e89e7eed9a01 100644
--- a/docs/source/en/using-diffusers/using_safetensors.md
+++ b/docs/source/en/using-diffusers/using_safetensors.md
@@ -1,3 +1,15 @@
+<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
# Load safetensors

[[open-in-colab]]

@@ -55,11 +67,11 @@ There are several reasons for using safetensors:

The time it takes to load the entire pipeline:

```py
-    from diffusers import StableDiffusionPipeline
+from diffusers import StableDiffusionPipeline

-    pipeline = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", use_safetensors=True)
-    "Loaded in safetensors 0:00:02.033658"
-    "Loaded in PyTorch 0:00:02.663379"
+pipeline = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", use_safetensors=True)
+"Loaded in safetensors 0:00:02.033658"
+"Loaded in PyTorch 0:00:02.663379"
```

But the actual time it takes to load 500MB of the model weights is only:
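The section is cut off before the measurement itself. As a hedged sketch of how such a timing could be reproduced (the exact weights-file path inside the repository is an assumption), you could load a single safetensors file directly:

```py
# a minimal sketch for timing how long safetensors takes to load one weights file;
# the filename argument below is assumed, not taken from the original doc
import datetime
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

weights_path = hf_hub_download(
    "stabilityai/stable-diffusion-2-1", filename="unet/diffusion_pytorch_model.safetensors"
)

start = datetime.datetime.now()
state_dict = load_file(weights_path)  # loads tensors directly, no pickle overhead
print(f"Loaded safetensors {datetime.datetime.now() - start}")
```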