Nuri Kim, Donghoon Lee, and Songhwai Oh, "Learning Instance-Aware Object Detection Using Determinantal Point Processes," Computer Vision and Image Understanding, 2020. (Published)
A TensorFlow implementation of IDNet (Learning Instance-Aware Object Detection Using Determinantal Point Processes) by Nuri Kim (nuri.kim@rllab.snu.ac.kr). This repository is based on the TensorFlow implementation of Faster R-CNN available here.
If you find this paper helpful, please consider citing:
```
@article{kim2020learning,
  author    = {Nuri Kim and Donghoon Lee and Songhwai Oh},
  title     = {Learning Instance-Aware Object Detection Using Determinantal Point Processes},
  journal   = {Computer Vision and Image Understanding (CVIU)},
  year      = {2020},
  volume    = {201},
  pages     = {103061},
  publisher = {Elsevier}
}
```
The current code supports the VGG16 model.

With VGG16 (`conv5_3`):
- Train on VOC 2007 trainval and test on VOC 2007 test: 72.2 mAP.
- Train on VOC 2007+2012 trainval and test on VOC 2007 test: 76.8 mAP.
- Train on the COCO 2014 train set and test on the validation set: 27.3 mAP.
- A basic TensorFlow installation. The code follows the r1.2 format. If you are using r1.0, please check out the r1.0 branch to fix the slim ResNet block issue. If you are using an older version (r0.1-r0.12), please check out the r0.12 branch. While it is not required, if you want to experiment with the original RoI pooling (which requires modifying the C++ code in TensorFlow), you can check out my TensorFlow fork and look for `tf.image.roi_pooling`.
- Python packages you might not have: `cython`, `opencv-python`, `easydict` (similar to py-faster-rcnn). For `easydict`, make sure you have the right version; I use 1.6. A one-shot install is sketched below.
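The packages above can be installed in one command; the `easydict==1.6` pin simply matches the version noted above, so adjust it if your setup differs:

```bash
# Install the extra Python dependencies (easydict pinned to the version the author uses)
pip install cython opencv-python easydict==1.6
```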
- Clone the repository

```bash
git clone https://github.com/bareblackfoot/IDNet.git
```
- Update the `-arch` flag in the setup script to match your GPU

```bash
cd IDNet/lib
# Change the GPU architecture (-arch) if necessary
vim setup.py
```
| GPU model | Architecture |
| --- | --- |
| TitanX (Maxwell/Pascal) | sm_52 |
| GTX 960M | sm_50 |
| GTX 1080 (Ti) | sm_61 |
| Grid K520 (AWS g2.2xlarge) | sm_30 |
| Tesla K80 (AWS p2.xlarge) | sm_37 |
Note: You are welcome to contribute the settings for other GPUs if you have made the code work properly on them. Also, even if you are only using CPU TensorFlow, GPU-based code (for NMS) will be used by default, so please set `USE_GPU_NMS` to `False` to get the correct output. Both edits are sketched below.
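As a sketch, assuming the default architecture in `setup.py` is `sm_52` and that the `USE_GPU_NMS` flag lives in `lib/model/config.py` (both as in the upstream tf-faster-rcnn; verify before running), the two changes could be scripted from `IDNet/lib` like this:

```bash
# Assumed default from the upstream tf-faster-rcnn -- check setup.py first
sed -i 's/sm_52/sm_61/g' setup.py   # e.g., for a GTX 1080 (Ti)

# CPU-only TensorFlow: disable GPU-based NMS (flag location assumed)
sed -i 's/__C.USE_GPU_NMS = True/__C.USE_GPU_NMS = False/' model/config.py
```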
- Build the Cython modules

```bash
make clean
make
cd ..
```
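As a quick sanity check that the extensions compiled, you can try importing one of them; the module path below is an assumption based on the upstream tf-faster-rcnn layout:

```bash
# Hypothetical smoke test -- cpu_nms is one of the Cython modules built above
(cd lib && python -c "from nms.cpu_nms import cpu_nms; print('Cython build OK')")
```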
- Install the Python COCO API. The code requires the API to access the COCO dataset.

```bash
cd data
git clone https://github.com/pdollar/coco.git
cd coco/PythonAPI
make
cd ../../..
```
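To confirm the API built correctly, the import below should succeed from the directory where it was compiled:

```bash
# pycocotools is built in place, so import it from data/coco/PythonAPI
(cd data/coco/PythonAPI && python -c "from pycocotools.coco import COCO; print('COCO API OK')")
```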
Please follow the instructions of py-faster-rcnn here to set up the VOC and COCO datasets (part of the COCO setup is done above). The steps involve downloading data and optionally creating soft links in the `data` folder; an example is sketched below. Since Faster R-CNN does not rely on pre-computed proposals, it is safe to ignore the steps that set up proposals.
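For instance, following the py-faster-rcnn convention, the VOC soft links could look like this; the `/path/to/VOCdevkit` locations are placeholders for wherever you extracted the data:

```bash
# Hypothetical soft links for PASCAL VOC; VOCdevkit should contain VOC2007
# (and VOC2012 if you train on the 07+12 split)
cd data
ln -s /path/to/VOCdevkit VOCdevkit2007
ln -s /path/to/VOCdevkit VOCdevkit2012
cd ..
```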
- Download pre-trained model
- Create a folder and a soft link to use the pre-trained model

```bash
NET=vgg16
TRAIN_IMDB=voc_2007_trainval+voc_2012_trainval
mkdir -p output/${NET}/${TRAIN_IMDB}
cd output/${NET}/${TRAIN_IMDB}
ln -s ../../../data/voc_2007_trainval+voc_2012_trainval ./default
cd ../../..
```
- Test with the pre-trained vgg16 model

```bash
GPU_ID=0
./experiments/scripts/test_idnet.sh ${GPU_ID} pascal_voc vgg16
```
- Download pre-trained models and weights. The current code supports VGG16 and ResNet V1 models. Pre-trained models are provided by slim; you can get them here and put them in the `data/imagenet_weights` folder. For example, for the VGG16 model, you can set it up like:

```bash
mkdir -p data/imagenet_weights
cd data/imagenet_weights
wget -v http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz
tar -xzvf vgg_16_2016_08_28.tar.gz
mv vgg_16.ckpt vgg16.ckpt
cd ../..
```
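For a ResNet-50 backbone (`res50` in the scripts below), the analogous steps would use the corresponding slim checkpoint; the URL follows the same naming scheme as the VGG16 one above, but verify it before relying on it:

```bash
# Assumed slim checkpoint for ResNet V1 50, mirroring the VGG16 steps above
mkdir -p data/imagenet_weights
cd data/imagenet_weights
wget -v http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz
tar -xzvf resnet_v1_50_2016_08_28.tar.gz
mv resnet_v1_50.ckpt res50.ckpt
cd ../..
```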
- Train (and test, evaluate)

```bash
./experiments/scripts/train_idnet.sh [GPU_ID] [DATASET] [NET]
# GPU_ID is the GPU you want to train on
# NET in {vgg16, res50} is the network architecture to use
# DATASET in {pascal_voc, pascal_voc_0712, coco} is defined in train_idnet.sh
# Examples:
./experiments/scripts/train_idnet.sh 0 pascal_voc vgg16
./experiments/scripts/train_idnet.sh 1 coco vgg16
```
- Visualization with TensorBoard

```bash
tensorboard --logdir=tensorboard/vgg16/voc_2007_trainval/ --port=7001 &
tensorboard --logdir=tensorboard/vgg16/coco_2014_train/ --port=7002 &
```
- Test and evaluate

```bash
./experiments/scripts/test_idnet.sh [GPU_ID] [DATASET] [NET]
# GPU_ID is the GPU you want to test on
# NET in {vgg16, res50} is the network architecture to use
# DATASET in {pascal_voc, pascal_voc_0712, coco} is defined in test_idnet.sh
# Examples:
./experiments/scripts/test_idnet.sh 0 pascal_voc vgg16
./experiments/scripts/test_idnet.sh 1 coco res50
```
- You can use `tools/reval.sh` for re-evaluation
By default, trained networks are saved under:

```
output/[NET]/[DATASET]/default/
```

Test outputs are saved under:

```
output/[NET]/[DATASET]/default/[SNAPSHOT]/
```

TensorBoard information for train and validation is saved under:

```
tensorboard/[NET]/[DATASET]/default/
tensorboard/[NET]/[DATASET]/default_val/
```
The default number of training iterations is kept the same as in the original Faster R-CNN for PASCAL VOC and COCO.