Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare

1City University of Hong Kong, 2Nanyang Technological University, 3Shanghai Jiao Tong University, 4Jiangxi University of Finance and Economics
*Equal contribution. #Corresponding author.

Motivation

Training & Inference

Structure

Quicker Start with Hugging Face AutoModel

No need to install this GitHub repo.

import requests
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("q-future/Compare2Score", trust_remote_code=True, torch_dtype=torch.float16, device_map="auto")

from PIL import Image
image_path_url = "https://raw.githubusercontent.com/Q-Future/Q-Align/main/fig/singapore_flyer.jpg"
print("The quality score of this image is {}".format(model.score(image_path_url)))

Installation

Evaluation:

git clone https://github.com/Q-Future/Compare2Score.git
cd Compare2Score
pip install -e .

Training:

pip install -e ".[train]"
pip install flash_attn --no-build-isolation

Visual Quality Scorer

from q_align import Compare2Scorer
from PIL import Image

scorer = Compare2Scorer()
image_path = "figs/i04_03_4.bmp"
print("The quality score of this image is {}.".format(scorer(image_path)))

Training & Evaluation

Get Datasets

Download all IQA datasets and training JSONs

import os, glob
from huggingface_hub import snapshot_download


snapshot_download("VQA-CityU/IQA_data", repo_type="dataset", local_dir="./playground/data", local_dir_use_symlinks=False)

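# Collect every zip archive that was downloaded into playground/data
zip_files = glob.glob("playground/data/*.zip")

for zip_file in zip_files:
    print(zip_file)
    # zip_file already contains the playground/data/ prefix; extract in place
    os.system("unzip {} -d ./playground/data/".format(zip_file))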
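
The extraction step above shells out to the `unzip` command, which may not be available on every system. As a minimal sketch, the same step could be done with Python's standard `zipfile` module instead. The helper name `extract_archives` is hypothetical (it is not part of this repo), and the sketch assumes the downloaded archives are plain `.zip` files sitting directly under the data directory.

```python
import zipfile
from pathlib import Path


def extract_archives(data_dir):
    """Extract every .zip found directly under data_dir into data_dir.

    Hypothetical helper (not part of this repo). Returns the list of
    archive paths that were extracted.
    """
    data_dir = Path(data_dir)
    extracted = []
    for archive in sorted(data_dir.glob("*.zip")):
        print(archive)
        # Unpack next to the archive, mirroring `unzip <archive> -d <data_dir>`
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(data_dir)
        extracted.append(archive)
    return extracted
```

After `snapshot_download` finishes, calling `extract_archives("./playground/data")` would unpack the archives in place without relying on an external tool.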

Evaluation

After preparing the datasets, you can evaluate the pre-trained Compare2Score model as follows:

python q_align/evaluate/IQA_dataset_eval.py --model-path q-future/Compare2Score --device cuda:0

Training from Scratch

sh scripts/train_bid_csiq_clive_kadid_koniq_live_compare.sh

Citation

@article{zhu2024adaptive,
  title={Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare},
  author={Zhu, Hanwei and Wu, Haoning and Li, Yixuan and Zhang, Zicheng and Chen, Baoliang and Zhu, Lingyu and Fang, Yuming and Zhai, Guangtao and Lin, Weisi and Wang, Shiqi},
  journal={arXiv preprint arXiv:2405.19298},
  year={2024},
}