Set guidance_scale (decoder) to 1.9 and num_inference_steps to 54 for optimal image quality.
Key finding: Loading the decoder in torch.bfloat16 was 3.24x faster than loading it in torch.float16. Other performance metrics remained virtually unchanged, and surprisingly, there was no perceptible difference in image quality (see Figure 1).
- Charts: I've created two charts visualizing these results (see Figure 2, Figure 3).
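
The settings above could be applied roughly as follows. This is a minimal sketch, assuming the Hugging Face diffusers `StableCascadeDecoderPipeline` and assuming both values are passed to the decoder call. The names `DECODER_SETTINGS`, `load_decoder`, and `decode` are hypothetical and are not taken from this project's code.

```python
# Values reported above; the dict and helper names are illustrative only.
DECODER_SETTINGS = {
    "guidance_scale": 1.9,
    "num_inference_steps": 54,
}


def load_decoder(model_id="stabilityai/stable-cascade"):
    """Load the decoder in bfloat16 (heavy imports are deferred until called)."""
    import torch
    from diffusers import StableCascadeDecoderPipeline

    return StableCascadeDecoderPipeline.from_pretrained(
        model_id, torch_dtype=torch.bfloat16
    )


def decode(decoder, image_embeddings, prompt):
    """Run the decoder stage on prior embeddings with the settings above."""
    return decoder(
        image_embeddings=image_embeddings,
        prompt=prompt,
        output_type="pil",
        **DECODER_SETTINGS,
    ).images
```

If the prior runs in a different dtype, its `image_embeddings` would likely need to be cast to `torch.bfloat16` before being handed to `decode`.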

PR #7381:
- Fixed the bug that prevented generating multiple images at once. Thanks, @DN6! 🎉

PR #31:
- Generated image filenames follow the format: image_seed-[seed]_identifier-[UUID].png.
- Generation metadata (model, prompt, negative prompt, etc.) is embedded within the PNG files.
Example: A generated image might be named image_seed-12345_identifier-a8b7c6d5-4e9f-8g7h-6i5j-6k4lmn3o.png.
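
The naming scheme and embedded metadata described above can be sketched as follows. This is an illustration, not the project's actual implementation: it assumes the identifier is a random UUID from Python's `uuid.uuid4()` and that metadata is stored as PNG text chunks via Pillow. The function names `build_filename` and `save_with_metadata` are hypothetical.

```python
import uuid


def build_filename(seed):
    """Return a name in the image_seed-[seed]_identifier-[UUID].png format."""
    return f"image_seed-{seed}_identifier-{uuid.uuid4()}.png"


def save_with_metadata(image, path, metadata):
    """Save a PIL image as PNG with each metadata entry stored as a text chunk."""
    from PIL.PngImagePlugin import PngInfo  # Pillow, imported only when needed

    info = PngInfo()
    for key, value in metadata.items():
        info.add_text(key, str(value))
    image.save(path, format="PNG", pnginfo=info)
```

With Pillow, text chunks written this way can be read back from the `text` attribute of the image returned by `PIL.Image.open(path)`.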

PR #30:
- Updated torch and torchvision versions to address conflicts on Linux Mint.
- Modified run.py to guarantee compatibility with Linux Mint.
Acknowledgement: Thanks to @thomasmcgannon for identifying these issues in #29!

Enhanced installation process:
- Revised installation to enable git pull for updates regardless of initial installation method (ZIP download or Git clone).
README.md improvements:
- Added a table of contents for enhanced navigation.
Structural streamlining:
- Cleaned up the repo structure to remove visual clutter.

Adapted model integration:
- Modified code to accommodate updates (commit be562f9) in the following model repositories:
  - stabilityai/stable-cascade
  - stabilityai/stable-cascade-prior
