Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Framework code for Meissonic, a non-autoregressive transformer (NAT)-based text-to-image model.
Meissonic is an efficient text-to-image synthesis foundation model that runs on consumer graphics cards with as little as 8 GB of VRAM. It is built on a non-autoregressive masked generative transformer architecture (a toy decoding sketch follows the feature list below) and offers:
- High-resolution image generation (up to 1024x1024)
- Efficient inference on consumer GPUs
- Versatile applications: text-to-image and image-to-image generation
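To make the non-autoregressive idea concrete, the toy sketch below illustrates how masked generative decoding typically proceeds. This is not Meissonic's actual code; all names and numbers are illustrative assumptions. Every image token starts masked, and at each of a few refinement steps the model predicts all positions in parallel, committing only the most confident guesses until the full token grid is filled.

```python
# Toy illustration of non-autoregressive masked decoding (not Meissonic's
# actual implementation; values and names are placeholders).
import numpy as np

rng = np.random.default_rng(0)

VOCAB_SIZE = 8192     # hypothetical codebook size of the image tokenizer
NUM_TOKENS = 64 * 64  # hypothetical latent token grid for one image
MASK = -1             # sentinel for a still-masked position
NUM_STEPS = 8         # hypothetical number of refinement steps

def fake_predict(tokens):
    """Stand-in for the transformer: returns a guess and a confidence for
    every position. The real model conditions on the text prompt and on
    the tokens that have already been unmasked."""
    guesses = rng.integers(0, VOCAB_SIZE, size=tokens.shape)
    confidences = rng.random(size=tokens.shape)
    return guesses, confidences

tokens = np.full(NUM_TOKENS, MASK, dtype=np.int64)
for step in range(NUM_STEPS):
    guesses, conf = fake_predict(tokens)
    masked = tokens == MASK
    # Cosine-style schedule: the fraction of tokens that should be
    # unmasked grows toward 1.0 as the steps progress.
    target_unmasked = np.cos(np.pi / 2 * (1 - (step + 1) / NUM_STEPS))
    num_to_commit = int(np.ceil(target_unmasked * NUM_TOKENS)) - int((~masked).sum())
    if num_to_commit <= 0:
        continue
    # Among still-masked positions, commit the most confident guesses and
    # leave the rest masked for later refinement steps.
    conf = np.where(masked, conf, -np.inf)
    commit = np.argsort(-conf)[:num_to_commit]
    tokens[commit] = guesses[commit]

print("unmasked tokens:", int((tokens != MASK).sum()), "of", NUM_TOKENS)
```

Unlike autoregressive decoding, the number of network evaluations here is a small constant (eight in this sketch) rather than one per token, which is what makes this style of generation attractive for high-resolution synthesis on consumer hardware.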
Generate an image from a text prompt with the inference script:

python inference.py --input_text "a red apple on a white plate" --output_dir ./output
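To generate several images in a batch, a small wrapper around the command above can be convenient. The sketch below is not part of the repository; it simply shells out to inference.py with the same flags shown above.

```python
# Convenience wrapper (assumption: not provided by this repo) that runs the
# documented inference.py command once per prompt.
import subprocess
from pathlib import Path

prompts = [
    "a red apple on a white plate",
    "a watercolor painting of a lighthouse at dusk",
]

for i, prompt in enumerate(prompts):
    out_dir = Path("./output") / f"prompt_{i:02d}"
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["python", "inference.py",
         "--input_text", prompt,
         "--output_dir", str(out_dir)],
        check=True,  # raise if inference.py exits with an error
    )
```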
If you find this work helpful, please consider citing:
@article{bai2024meissonic,
  title={Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis},
  author={Bai, Jinbin and Ye, Tian and Chow, Wei and Song, Enxin and Chen, Qing-Guo and Li, Xiangtai and Dong, Zhen and Zhu, Lei and Yan, Shuicheng},
  journal={arXiv preprint arXiv:2410.08261},
  year={2024}
}
This project is licensed under the Apache License, Version 2.0 (SPDX-License-Identifier: Apache-2.0), with additional use restrictions. The full text of the license(s) can be found at ./LICENSE.
We applied compliance-checking algorithms during training to ensure, to the best of our ability, that the trained model and its dataset are compliant. Because of the complexity of the data and the diversity of model usage scenarios, we cannot guarantee that the model is completely free of copyright issues or improper content. If you believe anything infringes on your rights or that the model generates improper content, please contact us and we will promptly address the matter.