This is the PyTorch implementation of the SIGIR 2024 paper "Universal Adversarial Perturbations for Vision-Language Pre-trained Models".
- pytorch 1.10.2
- transformers 4.8.1
- timm 0.4.9
- bert_score 0.3.11
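The dependencies above can be installed with pip, e.g. (a minimal sketch; note that the PyPI package for bert_score is published as bert-score, and in practice you may want the torch build matching your CUDA version):
pip install torch==1.10.2 transformers==4.8.1 timm==0.4.9 bert-score==0.3.11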
Download the Flickr30k and MSCOCO datasets (the annotations are provided in ./data_annotation/) and put them into ./Dataset. Then set the dataset root path (image_root) in ./configs/Retrieval_flickr.yaml.
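For example, the entry in ./configs/Retrieval_flickr.yaml might look as follows (the image directory name is illustrative and depends on how the dataset is unpacked):
image_root: './Dataset/flickr30k-images/'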
The checkpoints of the fine-tuned VLP models are accessible from CLIP, ALBEF, TCL, and BLIP; download them and put them into ./checkpoint.
Before running the main files, set the required paths in them: source/target model names and checkpoints, dataset names and roots, the test file path, original_rank_index_path, and so on. A rough, hypothetical illustration of the kind of settings involved is given below.
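The variable names and file paths in this sketch are placeholders, not the actual code in this repository; adapt them to the corresponding main file.
# Hypothetical placeholders only -- adapt to the actual variables in the main files.
source_model = 'CLIP'                                   # victim model used to learn the UAP
source_ckpt = './checkpoint/clip_retrieval_flickr.pth'  # placeholder checkpoint path
dataset = 'flickr'                                      # dataset name
image_root = './Dataset/flickr30k-images/'              # placeholder dataset root
test_file = './data_annotation/flickr30k_test.json'     # placeholder test annotation file
original_rank_index_path = './original_rank_index/'     # placeholder rank-index path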
# Learn UAPs by taking CLIP as the victim
python Attack_CLIP.py
# Learn UAPs by taking ALBEF/TCL as the victim
python Attack_ALBEFTCL.py
# Eval CLIP models:
python Eval_Retrieval_CLIP.py
# Eval ALBEF models:
python Eval_Retrieval_ALBEF.py
# Eval TCL models:
python Eval_Retrieval_TCL.py
Download the RefCOCO+ dataset from its original website, and set 'image_root' in configs/Grounding.yaml accordingly.
# Eval:
python Eval_Grounding.py
Download the MSCOCO dataset from its original website, and set 'image_root' in configs/caption_coco.yaml accordingly.
# Eval:
python Eval_ImgCap_BLIP.py
If you find this code useful for your research, please consider citing our paper:
@inproceedings{zhang2024universal,
  title={Universal Adversarial Perturbations for Vision-Language Pre-trained Models},
  author={Zhang, Peng-Fei and Huang, Zi and Bai, Guangdong},
  booktitle={Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  pages={862--871},
  year={2024}
}