MIG_Bench

The MIG benchmark of CVPR2024 MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis

[Paper] [Project Page] [Code]

To Do List

MIG Bench File
Evaluation
Sample Code
More baselines

Introduction

In text-to-image generation, prompts that describe multiple instances with rich attributes and layout information place high demands on existing generators and the techniques built on them. To evaluate how well these techniques handle complex instances and attributes, we designed the COCO-MIG benchmark.

The MIG bench is built on COCO images and their layouts, and uses the color attribute of each instance as its starting point. It filters out layouts with small areas and instances related to humans, then assigns a random color to each remaining instance. A global prompt for each image is constructed from fixed templates. Built this way, the bench retains the relatively natural distribution of COCO while introducing complex attributes and counterfactual cases through random color assignment, which makes generation considerably harder.
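The construction steps above can be sketched roughly as follows. This is a minimal illustration, not the actual bench code: the color list, the area threshold, the excluded categories, the prompt template, and the `build_sample` helper are all assumptions made for the example.

```python
import random

# Hypothetical values for illustration; the real color list, area threshold,
# excluded categories, and prompt template used by the bench may differ.
COLORS = ["red", "yellow", "green", "blue", "white", "black", "brown"]
MIN_AREA_RATIO = 0.05   # assumed: drop boxes covering less than 5% of the image
EXCLUDED = {"person"}   # assumed: human-related categories to filter out


def build_sample(annotations, img_w, img_h, rng):
    """Turn COCO-style annotations into colored instances plus a global prompt.

    Each annotation is a dict with "category" (str) and "bbox" = [x, y, w, h]
    in pixels, which is the COCO box convention.
    """
    instances = []
    for ann in annotations:
        x, y, w, h = ann["bbox"]
        if ann["category"] in EXCLUDED:
            continue  # skip human-related instances
        if (w * h) / (img_w * img_h) < MIN_AREA_RATIO:
            continue  # skip layouts with small areas
        color = rng.choice(COLORS)  # random color, possibly counterfactual
        instances.append({
            "category": ann["category"],
            "color": color,
            "phrase": f"a {color} {ann['category']}",
            # normalized [x0, y0, x1, y1]
            "box": [x / img_w, y / img_h, (x + w) / img_w, (y + h) / img_h],
        })
    # Assumed template: join the instance phrases into one global prompt.
    prompt = "a photo of " + " and ".join(i["phrase"] for i in instances)
    return {"prompt": prompt, "instances": instances}


annotations = [
    {"category": "person", "bbox": [10, 10, 200, 400]},   # filtered: human
    {"category": "dog", "bbox": [100, 200, 300, 200]},
    {"category": "cup", "bbox": [5, 5, 10, 10]},          # filtered: too small
    {"category": "bench", "bbox": [300, 100, 300, 300]},
]
sample = build_sample(annotations, img_w=640, img_h=480, rng=random.Random(0))
print(sample["prompt"])
```

Passing a seeded `random.Random` keeps the color assignment reproducible, which is convenient when the same prompts need to be regenerated for several methods.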

During evaluation, we use the GroundedSAM model to detect and segment each instance. We then analyze the color distribution of each object in the HSV color space and compute the proportion of pixels matching the required color to decide whether the object's color meets the requirement. Model performance in position and attribute control is reported as the proportion of instances generated with the correct attribute and position, together with their MIOU.
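As a rough illustration of the per-instance check described above, the sketch below computes the share of an object's pixels that fall in a target color's HSV range and combines it with a box IoU test. It uses only the standard library and takes a plain list of RGB pixels in place of a GroundedSAM mask. The hue ranges, the thresholds, and the function names are assumptions for this example and are not taken from `eval_mig.py`.

```python
import colorsys

# Hypothetical hue ranges (degrees) and thresholds, for illustration only.
HUE_RANGES = {
    "red": [(0, 15), (345, 360)],
    "green": [(75, 165)],
    "blue": [(195, 265)],
}
MIN_SAT, MIN_VAL = 0.3, 0.2     # assumed: ignore washed-out or very dark pixels
COLOR_RATIO_THRESHOLD = 0.2     # assumed: minimum share of matching pixels
IOU_THRESHOLD = 0.5             # assumed: minimum overlap with the target box


def color_ratio(pixels, color):
    """Fraction of RGB pixels (0-255 tuples) whose HSV value matches `color`."""
    if not pixels:
        return 0.0
    hits = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        hue = h * 360
        in_range = any(lo <= hue <= hi for lo, hi in HUE_RANGES[color])
        if in_range and s >= MIN_SAT and v >= MIN_VAL:
            hits += 1
    return hits / len(pixels)


def box_iou(a, b):
    """Intersection over union of two [x0, y0, x1, y1] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def instance_success(pixels, color, detected_box, target_box):
    """An instance counts as correct if both color and position pass."""
    return (color_ratio(pixels, color) >= COLOR_RATIO_THRESHOLD
            and box_iou(detected_box, target_box) >= IOU_THRESHOLD)


# Three red pixels and one blue pixel standing in for a segmented object.
pixels = [(255, 0, 0)] * 3 + [(0, 0, 255)]
print(color_ratio(pixels, "red"))
print(box_iou((0, 0, 2, 2), (1, 0, 3, 2)))
print(instance_success(pixels, "red", (0, 0, 2, 2), (0, 0, 2, 2)))
```

In a real run the pixel list would come from the segmentation mask of the detected instance, and the detected box from the same detection.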

You can find more details in our Paper.

Installation

Conda environment setup

conda create --name eval_mig python=3.8 -y
conda activate eval_mig

conda config --append channels conda-forge
conda install pytorch==1.11.0 torchvision cudatoolkit=11.3.1 -c pytorch

export AM_I_DOCKER=False
export BUILD_WITH_CUDA=True
export CUDA_HOME=/path/to/cuda-11.3/

python -m pip install -e segment_anything
python -m pip install -e GroundingDINO

pip install opencv-python pycocotools matplotlib onnxruntime onnx nltk imageio supervision==0.7.0 protobuf==3.20.2 pytorch_fid

Note that GroundingDINO must be installed with GPU (CUDA) support for the evaluation code to run properly with cuda. If you encounter problems, refer to Issue for more details.
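Before running the evaluation, it can help to confirm that the packages installed above are importable and that `CUDA_HOME` is set. The following is a small standard-library sketch. The `missing_modules` helper is hypothetical, and the import names are the usual ones for these packages (for example `cv2` for opencv-python). It only checks that the modules can be found, not that GroundingDINO was actually built with CUDA.

```python
import importlib.util
import os

# Import names for the packages installed above.
REQUIRED_MODULES = [
    "torch", "torchvision", "cv2", "pycocotools", "matplotlib",
    "onnxruntime", "onnx", "nltk", "imageio", "supervision",
    "pytorch_fid", "segment_anything", "groundingdino",
]


def missing_modules(names):
    """Return the subset of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


missing = missing_modules(REQUIRED_MODULES)
print("Missing modules:", ", ".join(missing) if missing else "none")
print("CUDA_HOME:", os.environ.get("CUDA_HOME", "<not set>"))
```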

Checkpoints

To run the evaluation process, you need to download some model weights.

Download the GroundingDINO checkpoint:

wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth

You should also download the ViT-H SAM model from SAM.

You can also manually download the weights for Bert.

If you want to test CLIP scores, you'll also need to download the CLIP model weights.

Put all these checkpoints under the ../pretrained/ folder:

├── pretrained
│   ├── bert-base-uncased
│   │   ├── config.json
│   │   ├── pytorch_model.bin
│   │   ├── tokenizer_config.json
│   │   ├── tokenizer.json
│   │   └── vocab.txt
│   ├── clip
│   │   ├── config.json
│   │   ├── merges.txt
│   │   ├── preprocessor_config.json
│   │   ├── pytorch_model.bin
│   │   ├── special_tokens_map.json
│   │   ├── tokenizer_config.json
│   │   ├── tokenizer.json
│   │   └── vocab.json
│   ├── groundingdino_swint_ogc.pth
│   └── sam_vit_h_4b8939.pth
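
To verify the layout above before starting a long evaluation run, a short check such as the following can be used. It is only a convenience sketch: the `missing_checkpoints` helper is hypothetical and simply tests for the files listed in the tree.

```python
from pathlib import Path

# Files expected under the pretrained folder, copied from the tree above.
EXPECTED = [
    "bert-base-uncased/config.json",
    "bert-base-uncased/pytorch_model.bin",
    "bert-base-uncased/tokenizer_config.json",
    "bert-base-uncased/tokenizer.json",
    "bert-base-uncased/vocab.txt",
    "clip/config.json",
    "clip/merges.txt",
    "clip/preprocessor_config.json",
    "clip/pytorch_model.bin",
    "clip/special_tokens_map.json",
    "clip/tokenizer_config.json",
    "clip/tokenizer.json",
    "clip/vocab.json",
    "groundingdino_swint_ogc.pth",
    "sam_vit_h_4b8939.pth",
]


def missing_checkpoints(root):
    """Return the expected files that are not present under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]


missing = missing_checkpoints("../pretrained")
if missing:
    print("Missing checkpoint files:")
    for rel in missing:
        print("  " + rel)
else:
    print("All checkpoint files found.")
```

The CLIP files are only needed if you plan to compute CLIP scores, so entries under `clip/` can be dropped from the list otherwise.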

Evaluation Pipeline

Step 1 (Optional) Sampling MIG prompts

You can choose to resample prompts for evaluation; the full procedure is described in Resampling.

Alternatively, you can generate your images from the 800 prompts that have already been sampled from MIG-Bench.

Step 2 Generation

Use the sampled prompts and layouts to generate images.

You can try our MIGC method; we hope you enjoy it.

Step 3 Evaluation

Finally, you can evaluate your model:

python eval_mig.py \
--need_miou_score \
--need_instance_sucess_ratio \
--metric_name 'eval' \
--image_dir /path/of/image/

Evaluation Results

We re-sampled a version of the COCO-MIG benchmark, filtering out examples related to humans. From this new version of the bench, we sampled 800 images and compared our method with InstanceDiffusion, GLIGEN, and others. The results on MIG-Bench are shown below. The generated images for some of the methods, together with the bench layout information, are available in the Example.

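Method	MIOU↑ L2	MIOU↑ L3	MIOU↑ L4	MIOU↑ L5	MIOU↑ L6	MIOU↑ Avg	Instance Success Rate↑ L2	Instance Success Rate↑ L3	Instance Success Rate↑ L4	Instance Success Rate↑ L5	Instance Success Rate↑ L6	Instance Success Rate↑ Avg	Model Type	Publication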
Box-Diffusion	0.37	0.33	0.25	0.23	0.23	0.26	0.28	0.24	0.14	0.12	0.13	0.16	Training-free	ICCV2023
GLIGEN	0.37	0.29	0.253	0.26	0.26	0.27	0.42	0.32	0.27	0.27	0.28	0.30	Adapter	CVPR2023
ReCo	0.55	0.48	0.49	0.47	0.49	0.49	0.63	0.53	0.55	0.52	0.55	0.55	Full model tuning	CVPR2023
InstanceDiffusion	0.52	0.48	0.50	0.42	0.42	0.46	0.58	0.52	0.55	0.47	0.47	0.51	Adapter	CVPR2024
Ours	0.64	0.58	0.57	0.54	0.57	0.56	0.74	0.67	0.67	0.63	0.66	0.66	Adapter	CVPR2024
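
For readers who want to produce per-level numbers of this shape from their own runs, the sketch below shows one way to aggregate per-instance results. It assumes that L2 to L6 denote the number of instances in the prompt and that the Avg column pools all instances; neither the record format nor the pooling rule is taken from `eval_mig.py`, and the `aggregate` helper is hypothetical.

```python
from collections import defaultdict


def aggregate(records):
    """Summarize per-instance results by level and overall.

    Each record is a dict with "level" (int, assumed to be the number of
    instances in the prompt), "iou" (float), and "success" (bool).
    """
    by_level = defaultdict(list)
    for rec in records:
        by_level[rec["level"]].append(rec)

    def summarize(recs):
        n = len(recs)
        return {
            "miou": sum(r["iou"] for r in recs) / n,
            "success_rate": sum(1 for r in recs if r["success"]) / n,
        }

    report = {f"L{level}": summarize(recs)
              for level, recs in sorted(by_level.items())}
    if records:
        # Assumption: the average pools every instance across levels.
        report["Avg"] = summarize(records)
    return report


# Toy per-instance results, invented purely to exercise the function.
records = [
    {"level": 2, "iou": 0.8, "success": True},
    {"level": 2, "iou": 0.4, "success": False},
    {"level": 3, "iou": 0.6, "success": True},
    {"level": 3, "iou": 0.6, "success": True},
    {"level": 3, "iou": 0.0, "success": False},
]
for name, stats in aggregate(records).items():
    print(name, round(stats["miou"], 2), round(stats["success_rate"], 2))
```

Under pooling, levels with more instances weigh more heavily in the average than they would in a plain mean over levels.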

Acknowledgements

MIG-Bench is based on GroundedSAM, SAM, CLIP, Bert and GroundingDINO. We appreciate their outstanding contributions.

Citation

If you find this repository useful, please use the following BibTeX entry for citation.

@misc{zhou2024migc,
      title={MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis}, 
      author={Dewei Zhou and You Li and Fan Ma and Xiaoting Zhang and Yi Yang},
      year={2024},
      eprint={2402.05408},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

