This project is an RGBD Faster R-CNN implementation based on Chen Yun's Faster R-CNN. It aims to:
- Easily transform the original RGB Faster R-CNN into an RGBD version that can run on the NYU Depth V2 dataset
- Extract features from the RGB and Depth images separately and concatenate them at the Conv5 layer, followed by a NIN layer (a rough sketch follows this list)
- Transfer straightforwardly to other multi-modality datasets (such as RGB-infrared)
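As a rough illustration of the fusion idea, here is a minimal sketch in PyTorch. It is not this repo's actual module; it assumes the NIN layer is a 1x1 convolution that maps the concatenated Conv5 features back to 512 channels, and the class and attribute names are made up for illustration.

```python
import torch
from torch import nn
from torchvision.models import vgg16


class RGBDExtractor(nn.Module):
    """Hypothetical two-stream extractor: fuse RGB and depth features at conv5."""

    def __init__(self):
        super(RGBDExtractor, self).__init__()
        # one VGG16 backbone per modality, truncated after conv5_3/relu (no final pool)
        self.rgb_stream = nn.Sequential(*list(vgg16(pretrained=True).features)[:30])
        self.depth_stream = nn.Sequential(*list(vgg16(pretrained=True).features)[:30])
        # "NIN" layer: a 1x1 conv mapping the concatenated 1024 channels back to 512
        self.nin = nn.Conv2d(512 * 2, 512, kernel_size=1)

    def forward(self, rgb, depth):
        f_rgb = self.rgb_stream(rgb)                 # (N, 512, H/16, W/16)
        f_depth = self.depth_stream(depth)           # (N, 512, H/16, W/16)
        fused = torch.cat([f_rgb, f_depth], dim=1)   # concatenate at conv5
        return self.nin(fused)                       # fed to the RPN / RoI head
```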
Evaluation on the NYU Depth V2 dataset: VGG16, trained on the trainval split and tested on the test split.
Note: You can get the RGB-only or Depth-only results (the first two rows of the following table) by running the code in the `$faster-rcnn-RGB/` folder and simply changing one line (#127) in the `$data/voc_dataset.py` file. Training shows considerable randomness, so you may need a bit of luck and more training epochs to reach the highest mAP; however, it should be easy to surpass the lower bound.
| Implementation | mAP |
| --- | --- |
| RGB (~4G) | 0.337 |
| Depth (~4G) | 0.348 |
| RGBD (concat at Conv5) (~5G) | 0.394 |
| RGBD (concat at fc7) (~6G) | 0.386 |
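For comparison, the fc7 variant in the last row fuses the two modalities later, inside the RoI head, by concatenating the 4096-d fc7 features of the two streams. The following is a hypothetical sketch only; the layer sizes follow the standard VGG16 Faster R-CNN head, and the class and argument names are assumptions, not this repo's code.

```python
import torch
from torch import nn


class RGBDFc7Head(nn.Module):
    """Hypothetical RoI head that concatenates RGB and depth fc7 features."""

    def __init__(self, rgb_fc, depth_fc, n_class):
        super(RGBDFc7Head, self).__init__()
        # rgb_fc / depth_fc: VGG16 fc6-fc7 stacks, each producing 4096-d RoI features
        self.rgb_fc = rgb_fc
        self.depth_fc = depth_fc
        self.cls_loc = nn.Linear(4096 * 2, n_class * 4)  # per-class box regression
        self.score = nn.Linear(4096 * 2, n_class)        # classification scores

    def forward(self, rgb_pool, depth_pool):
        # rgb_pool / depth_pool: RoI-pooled feature maps of shape (n_roi, 512, 7, 7)
        f_rgb = self.rgb_fc(rgb_pool.view(rgb_pool.size(0), -1))
        f_depth = self.depth_fc(depth_pool.view(depth_pool.size(0), -1))
        fused = torch.cat([f_rgb, f_depth], dim=1)       # concatenate at fc7
        return self.cls_loc(fused), self.score(fused)
```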
Requires Python 3 and PyTorch 0.3. Install the dependencies below (a quick environment sanity check follows the list):

- install PyTorch >= 0.3 with GPU support (the code is GPU-only); refer to the official website
- install cupy; you can install it via pip, but it's better to read the docs and make sure the environment is set up correctly
- install other dependencies: `pip install -r requirements.txt`
- optional, but strongly recommended: build the Cython code `nms_gpu_post`: `cd model/utils/nms/ && python3 build.py build_ext --inplace`
- start visdom for visualization: `python3 -m visdom.server`
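Before preparing the data, a quick sanity check (illustrative only, not part of this repo) can confirm that GPU-enabled PyTorch, cupy, and the visdom server are all reachable:

```python
import torch
import cupy
import visdom

# the code is GPU-only, so CUDA must be available to PyTorch
assert torch.cuda.is_available(), "CUDA is not available to PyTorch"
print("torch", torch.__version__, "| cupy", cupy.__version__)

# cupy should be able to allocate and compute on the GPU
print("cupy sum:", int(cupy.arange(4).sum()))

# the visdom server started with `python3 -m visdom.server` should be reachable
vis = visdom.Visdom()
print("visdom connected:", vis.check_connection())
```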
- Download the training, validation, and test data from Gupta's dataset or the official website
- Transform the original dataset into the format of "Pascal VOC2007"; you can refer to Gupta's code
- The `nyuv2` folder should have this basic structure (a sketch of reading paired RGB/depth images from this layout follows the list):

      $Annotations/        # annotations
      $ImageSets/          # image list
      $JPEGImages/         # RGB image sets
      $JPEGImages_depth/   # Depth image sets
      # ... and several other directories ...
- Modify the `voc_data_dir` config item in `utils/config.py`, or pass it to the program with an argument like `--voc-data-dir=/path/to/nyuv2/`.
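For reference, a dataset laid out this way can be read by pairing each RGB image with its depth counterpart through the shared image ID. This is only a hypothetical sketch: the file extension, the exact split-file location under `ImageSets/`, and the depth encoding all depend on how you converted the data.

```python
import os
from PIL import Image

VOC_DATA_DIR = '/path/to/nyuv2'  # same value as the voc_data_dir config item


def load_rgbd_pair(image_id):
    """Read an RGB image and its depth counterpart by the shared image ID."""
    rgb_path = os.path.join(VOC_DATA_DIR, 'JPEGImages', image_id + '.jpg')
    depth_path = os.path.join(VOC_DATA_DIR, 'JPEGImages_depth', image_id + '.jpg')
    rgb = Image.open(rgb_path).convert('RGB')
    # depth is assumed to be stored as an ordinary image (e.g. a 3-channel encoding)
    depth = Image.open(depth_path).convert('RGB')
    return rgb, depth


# image IDs come from the split lists under ImageSets/ (exact file name may differ)
with open(os.path.join(VOC_DATA_DIR, 'ImageSets', 'Main', 'test.txt')) as f:
    test_ids = [line.strip() for line in f if line.strip()]
```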
    mkdir checkpoints/   # folder for snapshots
    sh train.sh

You may refer to `utils/config.py` for more arguments.
Some key arguments:

- `--plot-every=n`: visualize predictions, loss, etc. every `n` batches
- `--env`: visdom env for visualization
- `--voc_data_dir`: where the VOC-format data is stored
- `--use-drop`: use dropout in the RoI head, default False
- `--use-Adam`: use Adam instead of SGD, default SGD (you need to set a very low `lr` for Adam)
- `--load-path`: path to a pretrained model, default `None`; if specified, the model will be loaded
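For illustration, a run that overrides a few of these flags might look like `python3 train.py train --env='rgbd-conv5' --plot-every=100 --voc_data_dir=/path/to/nyuv2/`; the entry point and flag spellings here are assumptions on my part, so check `train.sh` and `utils/config.py` for the authoritative form.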
You may open a browser, visit http://<ip>:8097, and see the visualization of the training procedure as below:

If you're in China and encounter problems with visdom (e.g. timeout, blank screen), you may refer to the visdom issue and the troubleshooting notes below for a solution.
- visdom: Some js files in visdom are blocked in China; see the simple solution here. Also, `update=append` doesn't work due to a bug introduced in the latest version; see the issue and fix. You don't need to build from source, modifying the related files would be OK.
- dataloader: `received 0 items of ancdata`. See the discussion; it's already fixed in train.py, so you should be free from this problem.
- cupy: `numpy.core._internal.AxisError: axis 1 out of bounds [0, 1)` is a bug in cupy; see the issue, fixed via a pull request. You don't need to build from source, modifying the related files would be OK.
- VGG is slow in construction: VGG16 takes a long time to construct (about 9 seconds); it can be sped up by this PR. You don't need to build from source, modifying the related files would be OK.
- About the speed: One strange thing is that even though the code doesn't use chainer, if I remove `from chainer import cuda` the speed drops a lot (train 6.5 -> 6.1, test 14.5 -> 10), because Chainer replaces CuPy's default allocator with its memory pool implementation. Ever since v4.0 cupy uses the memory pool by default, but you need to build from source if you want to use the latest version of cupy (uninstall cupy -> git clone -> git checkout v4.0 -> setup.py install). Another simple fix: add `from chainer import cuda` at the beginning of `train.py`; in that case you'll need to `pip install chainer` first (a chainer-free alternative is sketched after this list).
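If you would rather avoid the chainer dependency entirely, one possible alternative (my own assumption, not something this repo does) is to enable CuPy's pooled allocator explicitly at the top of `train.py`:

```python
import cupy

# switch CuPy to its memory-pool allocator; this is the same effect that
# `from chainer import cuda` has as a side effect of importing chainer
cupy.cuda.set_allocator(cupy.cuda.MemoryPool().malloc)
```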
This work builds on many excellent prior works.
Licensed under MIT; see the LICENSE file for more detail.
Contributions are welcome.
If you encounter any problem, feel free to open an issue.
Correct me if anything is wrong or unclear.