This repository contains the official implementation of VIAT and GMVFool for the paper "Towards Viewpoint-Invariant Visual Recognition via Adversarial Training" (ICCV2023)
and "Improving viewpoint robustness for visual recognition via adversarial training" (Extended Version)
By Shouwei Ruan, Yinpeng Dong, Hang Su, Jianteng Peng, Ning Chen, Xingxing Wei
Ensure that you have these environments:
- Pytorch (1.11.0)
- torchvision (0.13.1)
- timm (0.5.4)
- pytorch-lighting(1.6.5)
- For the complete running environment of VIAT, please refer to
./requirements.txt
. - (🔥Extremely important) Please ensure to follow the ngp_pl(Instant-NGP based on pytorch-lighting) to install the relevant environment for Instant-NGP, since VIAT and GMVFool depend on Instant-NGP for viewpoint rendering.
The released data consists of two parts:
IM3D
: A multi-view dataset used for training, containing 100 hemispherical sampled viewpoint renderings of 1000 virtual 3D objects.
ImageNet-V+
, A larger adversarial viewpoint benchmark, including 100K adversarial viewpoint samples captured by GMVFool on IM3D.
Before beginning, we need to use Instant-NGP to construct NeRF representations for the 1000 objects in IM3D, which will take approximately 24 hours. However, if you only want to conduct attacks or run simple demos, you can opt to train NeRF for a subset of the objects. Due to limited upload space, I regret that I cannot directly share the NeRF weights.
If you have correctly downloaded the IM3D
dataset, extract it into a folder within the project. Then, training nerf can be conducted using the following script:
cd ./run_train_nerf
sh run_train_NeRF_1.sh
Note that you should modified --root_dir
in run_train_NeRF_1.sh
to the path of IM3D
datasets, and the checkpoint will be saved in .\run_train_NeRF\ckpts\nerf
Using the following script to run GMVFool:
cd ./viat
sh run_Attack_1.sh
Note that you should modified --ckpt_attack_path
to the path of nerf checkpoint, and --treat_model
is the target model to attack. The weights of target model can be found in downloadlink, these are the weights we fine-tuned on the 100 classes from ImageNet, corresponding to the categories in IM3D.
For VIAT using GMVFool:
cd ./viat
sh run_train_AT_final_res_viatgf_R.sh
For VIAT using ViewFool(Instant-NGP version):
sh run_train_AT_final_res_viatvf_R.sh
... The code is currently being organized, thank you for your patience.
If you find our methods useful or use the IM3D and imagenet-v+ dataset, please consider citing:
@inproceedings{ruan2023towards,
title={Towards Viewpoint-Invariant Visual Recognition via Adversarial Training},
author={Ruan, Shouwei and Dong, Yinpeng and Su, Hang and Peng, Jianteng and Chen, Ning and Wei, Xingxing},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={4709--4719},
year={2023}
}
@article{ruan2023improving,
title={Improving viewpoint robustness for visual recognition via adversarial training},
author={Ruan, Shouwei and Dong, Yinpeng and Su, Hang and Peng, Jianteng and Chen, Ning and Wei, Xingxing},
journal={arXiv preprint arXiv:2307.11528},
year={2023}
}
This project uses Unofficial implementation of Instant-NGP (ngp_pl), Thanks to @kwea123:
@misc{queianchen_nerf,
author={Quei-An, Chen},
title={Nerf_pl: a pytorch-lightning implementation of NeRF},
url={https://github.com/kwea123/nerf_pl/},
year={2020},
}
If you are interested in viewpoint robustness, welcome to check our previous work: ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial Viewpoints (NIPS2022) [project].
If you have any questions or suggestions about the paper or code, look forward to your contact with us:
- Yinpeng Dong: dyp17@mails.tsinghua.edu.cn
- Shouwei Ruan: shouweiruan@buaa.edu.cn / shouweiruan@gmail.com