Yuhao Wang, Lingjuan Miao, Zhiqiang Zhou, Lei Zhang, Yajun Qiao
- We first propose to use natural language to express the overall objective of IVIF, which avoids the complex, explicit mathematical modeling required by current fusion loss functions.
- A language-driven fusion model is derived in the CLIP embedding space, from which we develop a simple yet highly effective language-driven loss for IVIF. In particular, by introducing a novel regularization and a patch-filtering approach, we ensure high robustness of the trained model in practice and resolve the challenge of removing the textual artifacts induced by CLIP.
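The core idea of a language-driven loss can be sketched as a cosine distance between the CLIP embedding of the fused image and the embedding of a text prompt describing the desired fusion result. The snippet below is a minimal illustration only: the embeddings are random stand-ins for CLIP's image/text encoder outputs, and the prompt, regularization, and patch filtering from the paper are omitted.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def language_driven_loss(image_emb, text_emb):
    # Pull the fused image's embedding toward the text prompt's
    # embedding in the shared CLIP space: loss = 1 - cos(img, text).
    return 1.0 - cosine_similarity(image_emb, text_emb)

# Stand-in embeddings; in practice these would come from CLIP's image
# and text encoders applied to the fused image and a prompt such as
# "an image with salient targets and rich textures" (hypothetical prompt).
rng = np.random.default_rng(0)
img_emb = rng.standard_normal(512)
txt_emb = rng.standard_normal(512)

loss = language_driven_loss(img_emb, txt_emb)
```

Minimizing this distance during training steers the fused output toward the semantics of the prompt, instead of hand-designing pixel- or gradient-level fusion terms.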
- Experiments show that the proposed method greatly improves fusion quality, revealing the superiority of language in modeling the fusion output and the potential of pre-trained vision-language models for improving IVIF performance.
- Create a conda environment

```shell
conda create -n LDFusion python=3.9.12
conda activate LDFusion
```

- Install dependencies (CUDA 11.1 and torch 1.8.2 recommended)

```shell
pip install -r requirements.txt
```
Put the test data into the `test_imgs` directory (infrared images in the `ir` subfolder, visible images in the `vi` subfolder), then run:

```shell
python src/test.py
```

(Note: the weight files (`*.pt`) may need to be downloaded separately from the repository.)

The fused results will be saved in the `./results/` folder.
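The expected input layout described above can be prepared as follows (the copy commands are placeholders for your own image paths, and the inference step is shown commented out):

```shell
# Create the input layout that src/test.py expects.
mkdir -p test_imgs/ir test_imgs/vi

# Copy your registered image pairs in, e.g. (hypothetical paths):
#   cp /path/to/infrared/*.png test_imgs/ir/
#   cp /path/to/visible/*.png  test_imgs/vi/

# Then run inference; outputs are written to ./results/
#   python src/test.py
```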
From left to right are the infrared image, visible image, and the fused image generated by LDFusion.