This is the official implementation of the paper "DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors", which jointly estimates scene depth and detects 3D objects in the 3D world. Given a binocular image pair as input, our model achieves 70+ AP on the KITTI val set.
DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors
Authors: Yilun Chen, Shijia Huang, Shu Liu, Bei Yu, Jiaya Jia
- 7/2022: We released the first vision-based model that achieved 70+ AP on the KITTI val set.
(1) Download the KITTI 3D object detection dataset, including the Velodyne point clouds, stereo images, calibration matrices, and road planes. Organize the folders as follows:
ROOT_PATH
├── data
│   ├── kitti
│   │   ├── ImageSets
│   │   ├── training
│   │   │   ├── calib & velodyne & label_2 & image_2 & image_3 & planes
│   │   ├── testing
│   │   │   ├── calib & velodyne & image_2 & image_3
├── pcdet
├── mmdetection-v2.22.0
(2) Generate the KITTI data lists and the joint Stereo-LiDAR Copy-Paste database for training.
python -m pcdet.datasets.kitti.lidar_kitti_dataset create_kitti_infos
python -m pcdet.datasets.kitti.lidar_kitti_dataset create_gt_database_only --image_crops
Remember to download the pre-computed road planes and place them in ./kitti/training/planes
for precise copy-paste augmentation.
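Before generating the data lists, it can help to confirm that the folders match the layout above. The following is a minimal sketch, not part of the repository: the helper name `missing_kitti_dirs` is hypothetical, and it only checks that the listed folders exist, not that their contents are complete.

```python
from pathlib import Path

# Sub-folders taken from the layout above, grouped by split.
EXPECTED_DIRS = {
    "training": ["calib", "velodyne", "label_2", "image_2", "image_3", "planes"],
    "testing": ["calib", "velodyne", "image_2", "image_3"],
}


def missing_kitti_dirs(root_path):
    """Return the expected KITTI folders that do not exist under root_path.

    Hypothetical helper: root_path plays the role of ROOT_PATH in the layout.
    """
    kitti = Path(root_path) / "data" / "kitti"
    missing = []
    if not (kitti / "ImageSets").is_dir():
        missing.append(str(kitti / "ImageSets"))
    for split, names in EXPECTED_DIRS.items():
        for name in names:
            folder = kitti / split / name
            if not folder.is_dir():
                missing.append(str(folder))
    return missing


if __name__ == "__main__":
    # Run from ROOT_PATH; prints nothing when every folder is present.
    for folder in missing_kitti_dirs("."):
        print("missing:", folder)
```

An empty result only means the folder structure is in place; a missing `planes` folder is the case most likely to go unnoticed, since it is downloaded separately.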
(1) Clone this repository.
git clone https://github.com/chenyilun95/DSGN2
cd DSGN2
(2) Install mmcv-1.4.0 library.
pip install pycocotools==2.0.2
pip install torch==1.7.1 torchvision==0.8.2
pip install -U mim
mim install mmcv-full==1.4.0
(3) Install the spconv library.
sudo apt install libboost-dev
git clone https://github.com/traveller59/spconv --recursive
cd spconv
git reset --hard f22dd9
git submodule update --recursive
python setup.py bdist_wheel
pip install ./dist/spconv-1.2.1-xxx.whl
(4) Install the included mmdetection-v2.22.0 (start from the DSGN2 repository root).
cd mmdetection-v2.22.0
pip install -e .
(5) Install the OpenPCDet library (run from the DSGN2 repository root).
pip install -e .
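Since the steps above pin specific versions, a quick check of the installed environment may save debugging time later. This is a rough sketch, assuming Python 3.8+ (for `importlib.metadata`) and assuming the distribution names match the package names used in the install commands; `find_mismatches` is a hypothetical helper, not something the repository provides.

```python
from importlib import metadata

# Versions pinned by the installation steps above.
# The distribution names are assumptions based on the pip/mim commands.
PINNED = {
    "pycocotools": "2.0.2",
    "torch": "1.7.1",
    "torchvision": "0.8.2",
    "mmcv-full": "1.4.0",
    "spconv": "1.2.1",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Map each package that is missing or off-pin to the version found (or None)."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        # Ignore local build tags such as "+cu110" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    for name, found in find_mismatches(PINNED).items():
        print(f"{name}: expected {PINNED[name]}, found {found}")
```

The `get_version` parameter exists only so the check can be exercised without the real packages installed.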
Train the model with:
python -m torch.distributed.launch --nproc_per_node=4 tools/train.py \
--launcher pytorch \
--fix_random_seed \
--workers 2 \
--sync_bn \
--save_to_file \
--cfg_file ./configs/stereo/kitti_models/dsgn2.yaml \
--tcp_port 12345 \
--continue_train
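If the number of GPUs or the port differs between machines, the command above can be assembled programmatically. The sketch below is an illustration only: `build_train_command` is a hypothetical helper that reuses exactly the flags shown above, and it assumes Python 3.8+ for `shlex.join`. Whether the training configuration needs other adjustments when the GPU count changes is not covered here.

```python
import shlex


def build_train_command(num_gpus=4,
                        cfg_file="./configs/stereo/kitti_models/dsgn2.yaml",
                        tcp_port=12345,
                        workers=2):
    """Build the training command from the README as an argument list."""
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        "tools/train.py",
        "--launcher", "pytorch",
        "--fix_random_seed",
        "--workers", str(workers),
        "--sync_bn",
        "--save_to_file",
        "--cfg_file", cfg_file,
        "--tcp_port", str(tcp_port),
        "--continue_train",
    ]


if __name__ == "__main__":
    # Print the command for inspection instead of launching it.
    print(shlex.join(build_train_command(num_gpus=2)))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root to start training.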
Evaluate the model with:
python -m torch.distributed.launch --nproc_per_node=4 tools/test.py \
--launcher pytorch \
--workers 2 \
--save_to_file \
--cfg_file ./configs/stereo/kitti_models/dsgn2.yaml \
--exp_name default \
--tcp_port 12345 \
--ckpt_id 60
The evaluation results can be found in the model's output folder.
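To compare several checkpoints, the evaluation command can be generated once per checkpoint id. This is a small sketch under stated assumptions: `build_eval_command` and `build_eval_sweep` are hypothetical helpers, the flags are copied from the command above, and the ids in the example are placeholders that must correspond to checkpoints your own training run actually saved.

```python
def build_eval_command(ckpt_id,
                       num_gpus=4,
                       cfg_file="./configs/stereo/kitti_models/dsgn2.yaml",
                       exp_name="default",
                       tcp_port=12345,
                       workers=2):
    """Build the evaluation command from the README for one checkpoint id."""
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        "tools/test.py",
        "--launcher", "pytorch",
        "--workers", str(workers),
        "--save_to_file",
        "--cfg_file", cfg_file,
        "--exp_name", exp_name,
        "--tcp_port", str(tcp_port),
        "--ckpt_id", str(ckpt_id),
    ]


def build_eval_sweep(ckpt_ids, **kwargs):
    """Return one evaluation command per checkpoint id, in the given order."""
    return [build_eval_command(ckpt_id, **kwargs) for ckpt_id in ckpt_ids]


if __name__ == "__main__":
    # Placeholder ids; replace with the checkpoints you want to evaluate.
    for command in build_eval_sweep([58, 59, 60]):
        print(" ".join(command))
```

Running the commands one after another keeps a single `--tcp_port` free for each run.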
We provide the pretrained models of DSGN2 evaluated on the KITTI val set.
Methods | Car | Ped. | Cyc. | Models |
---|---|---|---|---|
DSGN++ | 70.05 | 39.42 | 44.47 | GoogleDrive |
If you find our work useful in your research, please consider citing:
@ARTICLE{chen2022dsgn++,
title={DSGN++: Exploiting Visual-Spatial Relation for Stereo-Based 3D Detectors},
author={Chen, Yilun and Huang, Shijia and Liu, Shu and Yu, Bei and Jia, Jiaya},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2022}
}
Our code is based on several released code repositories. We thank the authors of LIGA-Stereo, OpenPCDet, and mmdetection for their great code.
If you run into problems or have suggestions for this repository, please feel free to contact me (chenyilun95@gmail.com).