v0.8.0 - improved quality and crop-training support

@bghira bghira released this 08 Dec 20:11
· 2821 commits to release since this release
dcba15e

Changelog

Breaking Changes

  • SDXL Launch Script Format: The SDXL launch script has a new format, new defaults, and rearranged options. Existing SDXL launch scripts may need to be updated to match.

New Features

  • BucketManager: Now removes images that are too small; the size threshold is set via --minimum_image_size.
  • Captioning Toolkit: An advanced CogVLM captioner and a basic LLaVA captioner are now available.
  • Crop Options: Added --crop_style and --crop_aspect options for improved control over cropping behavior.
  • Validation Negative Prompt: Added --validation_negative_prompt option.
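
As an illustration of the kind of filtering that --minimum_image_size enables, here is a minimal sketch. The function name filter_small_images and the rule "compare the shorter edge against the threshold" are assumptions made for this example; they are not taken from the BucketManager code, which may measure size differently.

```python
from typing import Dict, Tuple


def filter_small_images(
    sizes: Dict[str, Tuple[int, int]],
    minimum_image_size: int,
) -> Dict[str, Tuple[int, int]]:
    """Keep only images whose shorter edge is at least minimum_image_size.

    `sizes` maps a file path to (width, height) in pixels.
    Hypothetical helper: the shorter-edge rule is an assumption,
    not the project's documented behavior.
    """
    return {
        path: (width, height)
        for path, (width, height) in sizes.items()
        if min(width, height) >= minimum_image_size
    }


if __name__ == "__main__":
    dataset = {"a.png": (1024, 768), "b.png": (512, 300)}
    # b.png is dropped because its shorter edge (300) is below 512.
    print(filter_small_images(dataset, 512))
```

Filtering on image metadata before bucketing means undersized images never reach the aspect buckets, so they do not have to be upscaled later.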

Enhancements and Refinements

  • Learning Rate Scheduler: cosine and cosine_with_restarts are now treated as distinct schedulers. The default LR scheduler is now cosine.
  • DeepSpeed for SD 2.x: Integrated DeepSpeed for improved performance in Stable Diffusion 2.x models.
  • Downsampling Method: Downsampling now uses LANCZOS instead of BICUBIC to reduce image artifacts.
  • Diffusers Update: Adapted to the new version of the diffusers library and fixed issues related to the refactored config style.
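
To make the difference between the cosine and cosine_with_restarts schedulers concrete, the sketch below implements the textbook forms of both in plain Python. The function names, the num_cycles parameter, and the absence of warmup are assumptions for illustration; this is not the project's scheduler code.

```python
import math


def cosine_lr(step: int, total_steps: int, base_lr: float) -> float:
    """Single cosine decay from base_lr at step 0 to 0 at total_steps."""
    progress = min(step / total_steps, 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def cosine_with_restarts_lr(
    step: int, total_steps: int, base_lr: float, num_cycles: int = 2
) -> float:
    """Cosine decay that jumps back to base_lr at the start of each cycle."""
    progress = step / total_steps
    if progress >= 1.0:
        return 0.0
    # Position inside the current cycle, in [0, 1).
    cycle_progress = (num_cycles * progress) % 1.0
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * cycle_progress))


if __name__ == "__main__":
    for step in (0, 25, 50, 75, 100):
        plain = cosine_lr(step, 100, 1e-4)
        restarts = cosine_with_restarts_lr(step, 100, 1e-4, num_cycles=2)
        print(f"step={step:3d} cosine={plain:.2e} restarts={restarts:.2e}")
```

With two cycles, the restart variant returns to the full base rate halfway through training, while plain cosine has only decayed to half the base rate at that point.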

Bug Fixes and Improvements

  • Captioning Dropout: Caption dropout now also drops conditioning inputs, so dropout is applied more consistently.
  • Unit Tests: Added unit tests for random cropping within image boundaries and updated VAE Cache to accommodate random crop coordinates.
  • EMA Model Params: Logging no longer prints EMA (exponential moving average) model parameters.
  • Dropout Code Conflict: Removed unnecessary conflicting dropout code.
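
The property that the random-crop unit tests above are described as checking, that a random crop always stays within the image boundaries, can be sketched as follows. The helpers random_crop_box and box_is_inside are hypothetical names written for this example and do not come from the project's test suite.

```python
import random
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def random_crop_box(
    width: int, height: int, crop_width: int, crop_height: int, rng: random.Random
) -> Box:
    """Pick a random crop box of the requested size inside the image."""
    if crop_width <= 0 or crop_height <= 0:
        raise ValueError("crop size must be positive")
    if crop_width > width or crop_height > height:
        raise ValueError("crop is larger than the image")
    # randint is inclusive on both ends, so the crop can touch the far edge.
    left = rng.randint(0, width - crop_width)
    top = rng.randint(0, height - crop_height)
    return left, top, left + crop_width, top + crop_height


def box_is_inside(box: Box, width: int, height: int) -> bool:
    """Return True if the box lies fully within a width x height image."""
    left, top, right, bottom = box
    return 0 <= left < right <= width and 0 <= top < bottom <= height


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the check is repeatable
    for _ in range(1000):
        box = random_crop_box(1024, 768, 512, 512, rng)
        assert box_is_inside(box, 1024, 768)
    print("all sampled crops stayed inside the image")
```

Returning the crop coordinates, rather than only the cropped pixels, is what would let a cache record which region of the source image a stored entry corresponds to.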

Full Changelog: v0.7.4...v0.8.0