This version achieves much better quality at the cost of speed and hardware requirements using these tweaks:
- Stable Diffusion XL support (requires `sd_xl_base_1.0_0.9vae.safetensors` to be the active checkpoint and `sd_xl_refiner_1.0_0.9vae.safetensors` to be available for the refiner)
- 1024x1024 by default.
- Optional negative prompting through a Discord command.
- No default negative prompting. (Original bot contains a large default negative prompt)
- Use DPM++ 2M SDE Karras sampler
- 23 steps instead of 20
- Set CfgScale to 7 instead of 9
- Use sd_xl_refiner_1.0_0.9vae.safetensors as the Refiner.
- DenoiseStrength set to 0.8 instead of 0.7.
Needless to say, some of these settings are highly subjective.
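
As a rough illustration, the settings listed above could be expressed as a request body for the WebUI's `/sdapi/v1/txt2img` endpoint. This is only a sketch: the struct, the `defaultRequest` helper, and the JSON field names are assumptions based on the public WebUI API schema, not code taken from this bot, so check them against the `/docs` page of your own WebUI instance.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// txt2imgRequest is a hypothetical subset of the txt2img request body.
// The JSON field names are assumptions; verify them against your WebUI.
type txt2imgRequest struct {
	Prompt            string  `json:"prompt"`
	NegativePrompt    string  `json:"negative_prompt,omitempty"`
	Width             int     `json:"width"`
	Height            int     `json:"height"`
	SamplerName       string  `json:"sampler_name"`
	Steps             int     `json:"steps"`
	CfgScale          float64 `json:"cfg_scale"`
	DenoisingStrength float64 `json:"denoising_strength"`
	RefinerCheckpoint string  `json:"refiner_checkpoint"`
}

// defaultRequest fills in the values from the list above. The negative
// prompt is left empty unless the caller supplies one.
func defaultRequest(prompt, negativePrompt string) txt2imgRequest {
	return txt2imgRequest{
		Prompt:            prompt,
		NegativePrompt:    negativePrompt,
		Width:             1024,
		Height:            1024,
		SamplerName:       "DPM++ 2M SDE Karras",
		Steps:             23,
		CfgScale:          7,
		DenoisingStrength: 0.8,
		RefinerCheckpoint: "sd_xl_refiner_1.0_0.9vae.safetensors",
	}
}

func main() {
	req := defaultRequest("cute kitten riding a skateboard", "")
	body, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```
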
This is a Discord bot that interfaces with the Automatic1111 API, from this project: https://github.com/AUTOMATIC1111/stable-diffusion-webui
Video showing off the current features: https://www.youtube.com/watch?v=of5MBh3ueMk
- Download the appropriate version for your system from the releases page: https://github.com/AndBobsYourUncle/stable-diffusion-discord-bot/releases
  - Windows users will need to use the windows-amd64 version
  - Intel Macs will need to use the darwin-amd64 version
  - M1 Macs will need to use the darwin-arm64 version
  - Devices like a Raspberry Pi will need to use the linux-arm64 version
  - Most other Linux devices will need to use the linux-amd64 version
- Extract the archive folder to a location of your choice
- Clone this repository
- Install Go
  - This varies with your operating system, but the easiest way is to use the official installer: https://golang.org/dl/
- Build the bot with `go build`
- Create a Discord bot and get the token
- Add the Discord bot to your Discord server. It needs permissions to post messages, use slash commands, mention anyone, and upload files.
- Ensure that the Automatic1111 WebUI is running with `--api` (and also `--listen` if it is running on a different computer than the bot).
- Run the bot with `./stable_diffusion_bot -token <token> -guild <guild ID> -host <webui host, e.g. http://127.0.0.1:7860>`
  - It's important that the `-host` parameter matches the IP address where A1111 is running. If the bot is on the same computer, `127.0.0.1` will work.
  - There must be no trailing slash after the port number (which is `7860` in this example). So, instead of `http://127.0.0.1:7860/`, it should be `http://127.0.0.1:7860`.
- The first run will generate a new SQLite DB file in the current working directory.
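
To see why the trailing slash on `-host` matters, consider how an API URL is typically built by appending a path to the host. The sketch below is not the bot's own code; `normalizeHost` and `endpoint` are hypothetical helpers that show one way to tolerate a stray slash, and `/sdapi/v1/txt2img` is used only as an example path.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeHost removes surrounding whitespace and any trailing slashes,
// so that appending a path never produces a double slash.
func normalizeHost(host string) string {
	return strings.TrimRight(strings.TrimSpace(host), "/")
}

// endpoint joins the normalized host with an API path that starts with "/".
func endpoint(host, path string) string {
	return normalizeHost(host) + path
}

func main() {
	fmt.Println(endpoint("http://127.0.0.1:7860/", "/sdapi/v1/txt2img"))
}
```

Without the normalization step, the same input would yield `http://127.0.0.1:7860//sdapi/v1/txt2img`, which is the kind of URL the instruction above is meant to avoid.
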
The `-imagine <new command name>` flag can be used to have the bot use a different command when running, so that it doesn't collide with a Midjourney bot running on the same Discord server.
Responds with a message that has buttons to allow updating of the default settings for the `/imagine` command.
By default, the size is 512x512. However, if you are running the Stable Diffusion 2.0 768 model, you might want to change this to 768x768.
Choosing an option will cause the bot to update the setting, and edit the message in place, allowing further edits.
Creates an image from a text prompt (e.g. `/imagine cute kitten riding a skateboard`).
Available options:
- Aspect Ratio
  - `--ar <width>:<height>` (e.g. `/imagine cute kitten riding a skateboard --ar 16:9`)
    - Uses the default width or height, and calculates the final value for the other based on the aspect ratio. It then rounds that value up to the nearest multiple of `8`, to match the expectations of the underlying neural model and SD API.
    - Under the hood, it will use the "Hires fix" option in the API, which will generate an image with the bot's default width/height, and then resize it to the desired aspect ratio.
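
The aspect-ratio arithmetic described above can be sketched as follows. This is a minimal illustration, not the bot's actual implementation: the function names are hypothetical, and the choice of which side to keep (the default height for landscape or square ratios, the default width for portrait ratios) is an assumption, since the text only says that one default is kept and the other side is calculated.

```go
package main

import (
	"errors"
	"fmt"
)

// roundUpTo8 rounds n up to the nearest multiple of 8.
func roundUpTo8(n int) int {
	return (n + 7) / 8 * 8
}

// dimensionsForRatio keeps one default side and derives the other from the
// requested ratio, rounding the derived side up to a multiple of 8.
// Assumption: the derived side is always the longer one.
func dimensionsForRatio(defaultW, defaultH, arW, arH int) (int, int, error) {
	if arW <= 0 || arH <= 0 {
		return 0, 0, errors.New("aspect ratio parts must be positive")
	}
	if arW >= arH {
		// Landscape or square: keep the height, derive the width.
		w := (defaultH*arW + arH - 1) / arH // integer ceiling division
		return roundUpTo8(w), defaultH, nil
	}
	// Portrait: keep the width, derive the height.
	h := (defaultW*arH + arW - 1) / arW
	return defaultW, roundUpTo8(h), nil
}

func main() {
	w, h, err := dimensionsForRatio(512, 512, 16, 9)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%dx%d\n", w, h) // prints "912x512"
}
```

With a 512x512 default and `--ar 16:9`, the exact width would be about 910.2, which rounds up to 912.
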
The bot implements a FIFO queue (first in, first out). When a user issues the `/imagine` command (or uses an interaction button), they are added to the end of the queue.
The bot then checks the queue every second. If the queue is not empty, and there is nothing currently being processed, it will send the top interaction to the Automatic1111 WebUI API, and then remove it from the queue.
After Automatic1111 has finished processing the interaction, the bot will then update the reply message with the finished result.
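
The queueing behaviour described above might look roughly like the sketch below. It is a simplified model under stated assumptions, not the bot's real code: the `job` and `queue` types and their methods are hypothetical, the handler stands in for the call to the WebUI API, and the polling interval is shortened from one second so the demo finishes quickly.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// job is a hypothetical stand-in for a queued Discord interaction.
type job struct{ prompt string }

// queue is a mutex-guarded FIFO with a flag for work in progress.
type queue struct {
	mu         sync.Mutex
	pending    []job
	processing bool
}

// add appends a job to the end of the queue.
func (q *queue) add(j job) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.pending = append(q.pending, j)
}

// next removes and returns the oldest job and marks the queue busy.
// It returns false if the queue is empty or a job is already in progress.
func (q *queue) next() (job, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.processing || len(q.pending) == 0 {
		return job{}, false
	}
	j := q.pending[0]
	q.pending = q.pending[1:]
	q.processing = true
	return j, true
}

// done marks the current job as finished.
func (q *queue) done() {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.processing = false
}

// poll checks the queue on every tick until stop is closed. The handler
// runs synchronously, so the busy flag is cleared when it returns.
func (q *queue) poll(interval time.Duration, handle func(job), stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			if j, ok := q.next(); ok {
				handle(j)
				q.done()
			}
		}
	}
}

func main() {
	q := &queue{}
	q.add(job{prompt: "cute kitten riding a skateboard"})
	q.add(job{prompt: "a lighthouse at dusk"})

	stop := make(chan struct{})
	results := make(chan string)
	go q.poll(10*time.Millisecond, func(j job) { results <- j.prompt }, stop)

	fmt.Println(<-results) // jobs come out in the order they were added
	fmt.Println(<-results)
	close(stop)
}
```
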
Buttons are added to the Discord response message for interactions like re-roll, variations, and up-scaling.
All image generations are saved into a local SQLite database, so that the parameters of the image can be retrieved later for variations or up-scaling.
Options like aspect ratio are extracted and sanitized from the text prompt, and then the resulting options are stored in the database record for the image generation (for further variations or upscaling).
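
One way to extract an option such as `--ar` from the prompt text is with a regular expression, as in the sketch below. The `options` struct, the `parsePrompt` function, and the exact pattern are assumptions made for illustration. The bot's own parsing and sanitizing rules may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// arPattern matches "--ar <width>:<height>" plus any whitespace before it.
var arPattern = regexp.MustCompile(`\s*--ar\s+(\d+):(\d+)`)

// options is a hypothetical record of what was parsed from a raw prompt.
type options struct {
	Prompt   string // prompt text with the flag removed
	ARWidth  int    // 0 when no valid ratio was given
	ARHeight int
}

// parsePrompt strips the --ar flag from the prompt and returns its values.
// A ratio with a zero or unparseable part is dropped rather than passed on.
func parsePrompt(raw string) options {
	opts := options{Prompt: strings.TrimSpace(raw)}
	m := arPattern.FindStringSubmatch(raw)
	if m == nil {
		return opts
	}
	opts.Prompt = strings.TrimSpace(arPattern.ReplaceAllString(raw, ""))
	w, errW := strconv.Atoi(m[1])
	h, errH := strconv.Atoi(m[2])
	if errW != nil || errH != nil || w == 0 || h == 0 {
		return opts
	}
	opts.ARWidth, opts.ARHeight = w, h
	return opts
}

func main() {
	opts := parsePrompt("cute kitten riding a skateboard --ar 16:9")
	fmt.Printf("%q %d:%d\n", opts.Prompt, opts.ARWidth, opts.ARHeight)
}
```

Storing the cleaned prompt and the parsed ratio separately means a later variation or upscale request can reuse them without parsing the original text again.
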
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
There are lots more features that could be added to this bot, such as:
- Moving defaults to the database
- Per-user defaults/settings, as well as enforcing limits on a user's usage of the bot
- Ability to easily re-roll an image
- Generating multiple images at once
- Ability to upscale the resulting images
- Ability to generate variations on a grid image
- Ability to tweak more settings when issuing the `/imagine` command (like aspect ratio)
- Image to image processing
I'll probably be adding a few of these over time, but any contributions are also welcome.
I like Go a lot better than Python, and for me it's a lot easier to maintain dependencies with Go modules versus running a bunch of different Anaconda environments.
It's also able to be cross-compiled to a wide range of platforms, which is nice.