- Set guidance_scale (decoder) to 1.9 and num_inference_steps to 54 for optimal image quality.
- Key finding: Using torch.bfloat16 for the decoder significantly increased model loading speed (3.24x faster) compared to torch.float16. Other performance metrics remained virtually unchanged, and, surprisingly, there was no perceptible difference in image quality (see Figure 1).
- Generated image filenames follow the format: `image_seed-[seed]_identifier-[UUID].png`.
  Example: A generated image might be named `image_seed-12345_identifier-a8b7c6d5-4e9f-8g7h-6i5j-6k4lmn3o.png`.
- Generation metadata (model, prompt, negative prompt, etc.) is embedded within the PNG files.
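The filename format above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the helper names `build_image_filename` and `parse_image_filename` are hypothetical, and the use of a random `uuid4` for the identifier is an assumption, since the notes only say "UUID".

```python
import re
import uuid
from typing import Optional, Tuple

# Mirrors the documented format: image_seed-[seed]_identifier-[UUID].png
FILENAME_PATTERN = re.compile(
    r"^image_seed-(?P<seed>\d+)_identifier-(?P<identifier>.+)\.png$"
)


def build_image_filename(seed: int, identifier: Optional[str] = None) -> str:
    """Build a filename in the documented format (hypothetical helper)."""
    if identifier is None:
        # Assumption: a random version-4 UUID; the notes do not name the version.
        identifier = str(uuid.uuid4())
    return f"image_seed-{seed}_identifier-{identifier}.png"


def parse_image_filename(filename: str) -> Tuple[int, str]:
    """Recover (seed, identifier) from a filename in the documented format."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unexpected filename format: {filename!r}")
    return int(match.group("seed")), match.group("identifier")
```

Keeping the seed in the filename means an image can be matched to its seed without opening the file, while the identifier keeps repeated runs with the same seed from overwriting each other.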
- Updated torch and torchvision versions to address conflicts on Linux Mint.
- Modified `run.py` to guarantee compatibility with Linux Mint.

Acknowledgement: Thanks to @thomasmcgannon for identifying these issues in #29!
- Enhanced installation process:
  - Revised installation to enable `git pull` for updates regardless of initial installation method (ZIP download or Git clone).
- README.md improvements:
  - Added a table of contents for enhanced navigation.
- Structural streamlining:
  - Cleaned up the repo structure to remove visual clutter.
- Adapted model integration:
  - Modified code to accommodate updates (commit be562f9) in the following model repositories:
    - stabilityai/stable-cascade
    - stabilityai/stable-cascade-prior
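
As a rough illustration of how the decoder settings and the two model repositories listed in these notes fit together, here is a minimal sketch using the Stable Cascade pipelines from the `diffusers` library. It is not the project's `run.py`: the function name `generate`, the default seed, and the choice of bfloat16 for the prior are assumptions, and the prior runs with the library's default settings because the notes only specify decoder values.

```python
# Decoder settings and repository names taken from the notes above.
DECODER_GUIDANCE_SCALE = 1.9
DECODER_NUM_INFERENCE_STEPS = 54
PRIOR_REPO = "stabilityai/stable-cascade-prior"
DECODER_REPO = "stabilityai/stable-cascade"


def generate(prompt, negative_prompt="", seed=12345, device="cuda"):
    """Hypothetical helper: run prior and decoder, return one PIL image."""
    # Imported lazily so the constants above can be used without torch installed.
    import torch
    from diffusers import StableCascadeDecoderPipeline, StableCascadePriorPipeline

    generator = torch.Generator(device=device).manual_seed(seed)

    # Assumption: the prior is also loaded in bfloat16; the notes only
    # report bfloat16 results for the decoder.
    prior = StableCascadePriorPipeline.from_pretrained(
        PRIOR_REPO, torch_dtype=torch.bfloat16
    ).to(device)
    decoder = StableCascadeDecoderPipeline.from_pretrained(
        DECODER_REPO, torch_dtype=torch.bfloat16
    ).to(device)

    # Prior stage: library defaults, since the notes give no prior settings.
    prior_output = prior(
        prompt=prompt, negative_prompt=negative_prompt, generator=generator
    )

    # Decoder stage: the guidance scale and step count from the notes.
    images = decoder(
        image_embeddings=prior_output.image_embeddings.to(torch.bfloat16),
        prompt=prompt,
        negative_prompt=negative_prompt,
        guidance_scale=DECODER_GUIDANCE_SCALE,
        num_inference_steps=DECODER_NUM_INFERENCE_STEPS,
        output_type="pil",
        generator=generator,
    ).images
    return images[0]
```

Calling `generate("a lighthouse at dusk")` downloads both repositories on first use and needs a GPU with bfloat16 support; on other hardware the dtype and `device` arguments would have to change.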