Our code is based on ruotianluo/pytorch-faster-rcnn and WSCDN.
We sincerely thank the authors for these resources.
A newer version of our code (based on Detectron2) is a work in progress.
We use one RTX 2080Ti GPU (11GB) to train and evaluate our method. A GPU with more memory is better (e.g., TITAN RTX with 24GB memory).
- Python 3.6 or higher
- CUDA 10.0 with corresponding cuDNN
- PyTorch 1.2.0
- numpy 1.18.1
- opencv 3.4.2
- scipy 1.1.0 (note that scipy >= 1.3.0 has removed the imresize function; if you use a newer scipy, you need to rewrite the corresponding code)
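If you do run a newer scipy, the imresize note above can be handled roughly as follows. This is a minimal sketch, not the code used in this repository: the helper names `needs_imresize_replacement` and `resize_hw` are hypothetical, and it assumes the call being replaced passes an explicit (height, width) tuple. It relies on OpenCV, which is already in the requirements list.

```python
import re


def needs_imresize_replacement(scipy_version):
    """Return True if this scipy version no longer ships scipy.misc.imresize."""
    # Keep only the leading major/minor numbers, e.g. "1.3.0rc1" -> (1, 3).
    major, minor = (int(p) for p in re.findall(r"\d+", scipy_version)[:2])
    return (major, minor) >= (1, 3)


def resize_hw(image, size):
    """Resize an image array to size=(height, width) with bilinear interpolation.

    Assumption: the replaced imresize call used an explicit (height, width) tuple.
    Unlike imresize, cv2.resize keeps the input dtype and does not rescale
    values to uint8, so downstream code relying on that may need adjusting.
    """
    import cv2  # imported lazily so the version check works without OpenCV

    height, width = size
    # cv2.resize expects the target size as (width, height).
    return cv2.resize(image, (width, height), interpolation=cv2.INTER_LINEAR)
```

A typical check would be `needs_imresize_replacement(scipy.__version__)` after `import scipy`; with the pinned scipy 1.1.0 it returns False and no change is needed.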
We provide the full list of requirements (lbba.yml) in the workspace.
- selective_search_data: precomputed proposals of VOC 2007/2012
- pretrained_models/imagenet_pretrain: ImageNet-pretrained models for the WSOD backbone and the LBBA backbone
- pretrained_models/pretrained_on_wsddn: pretrained WSOD network of VOC 2007/2012; training from this pretrained model usually converges faster and more stably.
- models/voc07: our pretrained WSOD
- models/lbba: our pretrained LBBA
- codes_zip: code templates for the LBBA training procedure and the LBBA-boosted WSOD training procedure
We use Anaconda to construct our experimental environment.
Install all required packages (or simply follow lbba_requirements.txt).
We have initialized all directories with gitkeep files.
First, cd lbba_boosted_wsod. Then:
- download selective_search_data/* into data/selective_search_data
- download pretrained_models/imagenet_pretrain/* into data/imagenet_weights
- download pretrained_models/pretrained_on_wsddn/* into data/wsddn_weights
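To confirm that the downloads above landed in the right places, a small check like the following may help. It is only a sketch: the function name `missing_data_dirs` is hypothetical, and it assumes a directory counts as populated once it holds anything other than the .gitkeep placeholder. Run it from inside lbba_boosted_wsod.

```python
from pathlib import Path

# Target directories named in the download steps above.
EXPECTED_DIRS = (
    "data/selective_search_data",
    "data/imagenet_weights",
    "data/wsddn_weights",
)


def missing_data_dirs(root="."):
    """Return the expected directories that are absent or hold only .gitkeep."""
    missing = []
    for rel in EXPECTED_DIRS:
        path = Path(root) / rel
        has_files = path.is_dir() and any(
            child.name != ".gitkeep" for child in path.iterdir()
        )
        if not has_files:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    for rel in missing_data_dirs("."):
        print("still empty:", rel)
```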
The dataset setup is the same as in rbgirshick/py-faster-rcnn.
For example, for the PASCAL VOC 2007 dataset:
- Download the training, validation, and test data and the VOCdevkit

      wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
      wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
      wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar

- Extract all of these tars into one directory named VOCdevkit

      tar xvf VOCtrainval_06-Nov-2007.tar
      tar xvf VOCtest_06-Nov-2007.tar
      tar xvf VOCdevkit_08-Jun-2007.tar

- It should have this basic structure

      $VOCdevkit/                           # development kit
      $VOCdevkit/VOCcode/                   # VOC utility code
      $VOCdevkit/VOC2007                    # image sets, annotations, etc.
      # ... and several other directories ...

- Create symlinks for the PASCAL VOC dataset

      cd $FRCN_ROOT/data
      ln -s $VOCdevkit VOCdevkit2007
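The structure check and the symlink step above can also be scripted. The following is a sketch under stated assumptions: `link_vocdevkit` is a hypothetical helper, not part of this repository, and it checks only the two subdirectories named in the listing above (VOCcode and VOC2007) before creating the VOCdevkit2007 link.

```python
import os
from pathlib import Path


def link_vocdevkit(devkit_dir, data_dir, link_name="VOCdevkit2007"):
    """Verify the extracted devkit layout, then symlink it into data_dir."""
    devkit = Path(devkit_dir).resolve()
    # Only the directories listed in the README structure are checked.
    for sub in ("VOCcode", "VOC2007"):
        if not (devkit / sub).is_dir():
            raise FileNotFoundError(
                "{} is missing; extract all three tars first".format(devkit / sub)
            )
    link = Path(data_dir) / link_name
    if link.is_symlink() or link.exists():
        # Reuse a link that already points at the same devkit.
        if link.is_symlink() and link.resolve() == devkit:
            return link
        raise FileExistsError("{} already exists".format(link))
    os.symlink(str(devkit), str(link), target_is_directory=True)
    return link
```

For instance, `link_vocdevkit("/path/to/VOCdevkit", "data")` would be the equivalent of the `ln -s` command, where the devkit path is a placeholder for your own location.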
Download models/voc07/voc07_55.8.pth to lbba_boosted_wsod/
./test_voc07.sh 0 pascal_voc vgg16 voc07_55.8.pth
Note that different environments might result in a slight performance drop. For example, we obtain 55.8 mAP with CUDA 10.1 but obtain 55.5 mAP using the same code with CUDA 11.
Download models/lbba/lbba_final.pth (or lbba_init.pth) to lbba_boosted_wsod/
bash train_wsod.sh 1 pascal_voc vgg16 voc07_wsddn_pre lbba_final.pth
Note that we provide different LBBA checkpoints (initialization stage, final stage, and even the one-class adjuster mentioned in the supplementary material).
@InProceedings{Dong_2021_ICCV,
author = {Dong, Bowen and Huang, Zitong and Guo, Yuelin and Wang, Qilong and Niu, Zhenxing and Zuo, Wangmeng},
title = {Boosting Weakly Supervised Object Detection via Learning Bounding Box Adjusters},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2021},
pages = {2876-2885}
}