feat: add icon and description for Stable Diffusion benchmark #917
Conversation
anhappdev commented Sep 12, 2024 (edited)
- The icon was drawn by me using Figma. We can replace it with one from a designer later.
- The description for the Stable Diffusion benchmark was provided by @Mostelk.
![Screenshot attached to the PR description]
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅
@AhmedTElthakeb please report the number of parameters and FLOPs of the 3 models we use.
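For context, a minimal sketch of one way those parameter counts could be gathered, assuming the three components are loaded from the Hugging Face checkpoint referenced later in this thread via the diffusers library (the benchmark's actual on-device model files may use a different format); FLOPs would additionally require a profiler run with representative input shapes.

```python
# Sketch only: count parameters of the three SD v1.5 components, assuming they
# are loadable as PyTorch modules from the Hugging Face checkpoint below.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("benjamin-paine/stable-diffusion-v1-5")

def param_count(module: torch.nn.Module) -> int:
    # Total number of parameters, trainable or not.
    return sum(p.numel() for p in module.parameters())

for name, module in [
    ("CLIP text encoder", pipe.text_encoder),
    ("UNet", pipe.unet),
    ("VAE decoder", pipe.vae.decoder),
]:
    print(f"{name}: {param_count(module) / 1e6:.1f}M parameters")

# FLOPs would need a profiler (e.g. fvcore's FlopCountAnalysis) fed with
# representative inputs for each component; not shown here.
```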
Force-pushed from bc50b66 to be59d3c
@Mostelk Please provide a description for the Stable Diffusion benchmark.
Please check this description; we reviewed it in the Wednesday meeting.

The Text-to-Image Gen AI benchmark adopts Stable Diffusion v1.5 for generating images from text prompts. It is a latent diffusion model. The benchmarked Stable Diffusion v1.5 refers to a specific configuration of the model architecture that uses a downsampling-factor-8 autoencoder with an 860M UNet, a 123M CLIP ViT-L/14 text encoder for the diffusion model, and a VAE decoder of 49.5M parameters. The model was trained for 595k steps at a resolution of 512x512, which enables it to generate high-quality images. We refer you to https://huggingface.co/benjamin-paine/stable-diffusion-v1-5 for more information. The benchmark runs 20 denoising steps for inference and uses a precalculated time embedding of size 1x1280. Reference models can be found at https://github.com/mlcommons/mobile_open/releases
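As a rough illustration of the configuration described above, the following sketch reproduces the same settings (the SD v1.5 checkpoint linked above, 20 denoising steps, 512x512 output) with the Hugging Face diffusers API. The on-device benchmark runs its own pipeline and backends, so this is only for reference, and the prompt is an arbitrary example.

```python
# Illustrative only: the described settings recreated with diffusers,
# not the benchmark's on-device pipeline.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "benjamin-paine/stable-diffusion-v1-5",  # checkpoint linked in the description
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="a photo of an astronaut riding a horse",  # example prompt, not from the benchmark
    num_inference_steps=20,  # the benchmark runs 20 denoising steps
    height=512,              # model trained at 512x512
    width=512,
).images[0]
image.save("sd_v15_sample.png")
```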
Force-pushed from be59d3c to 063c086
Force-pushed from 063c086 to bc671d0