This is the official repository for our CVPR 2024 paper: Robust Synthetic-to-Real Transfer for Stereo Matching.
Paper: [arXiv]
We aim to fine-tune stereo networks without compromising their robustness to unseen domains. We identify two factors that can degrade robustness during fine-tuning: learning new knowledge without sufficient regularization, and overfitting to ground-truth (GT) details. We propose the DKT framework, which improves fine-tuning by dynamically measuring what has been learned.
- Release Training Code.
- Release Checkpoint.
Fine-tuned checkpoints of DKT-Stereo can be downloaded from Google Drive.
The SceneFlow pre-trained checkpoints can be obtained from IGEV and RAFT-Stereo.
- NVIDIA RTX 3090
- Python 3.8
- PyTorch 1.12
conda create -n DKT_Stereo python=3.8
conda activate DKT_Stereo
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch -c nvidia
pip install opencv-python
pip install scikit-image
pip install tensorboard
pip install matplotlib
pip install tqdm
pip install timm==0.5.4
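After installation, it can help to confirm that every package is visible in the active environment. The snippet below is a minimal sketch rather than part of the repository: `installed_versions` is a hypothetical helper, and it assumes the conda-installed packages expose standard Python package metadata under the names listed.

```python
from importlib import metadata

# Distribution names matching the install commands above.
# Assumption: conda's pytorch package registers itself as "torch".
REQUIRED = [
    "torch", "torchvision", "torchaudio", "opencv-python",
    "scikit-image", "tensorboard", "matplotlib", "tqdm", "timm",
]


def installed_versions(names=REQUIRED):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:15s} {version or 'NOT INSTALLED'}")
```

A `None` entry (printed as `NOT INSTALLED`) points at the install command to rerun.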
To evaluate/train DKT-Stereo, you will need to download the required datasets.
By default, stereo_datasets.py will search for the datasets in these locations.
├── /data
    ├── KITTI
        ├── KITTI_2012
            ├── training
            ├── testing
        ├── KITTI_2015
            ├── training
            ├── testing
    ├── Booster_dataset
        ├── full
        ├── half
        ├── quarter
            ├── train
    ├── Middlebury
        ├── MiddEval3
            ├── trainingF
            ├── trainingH
            ├── trainingQ
    ├── ETH3D
        ├── two_view_training
        ├── two_view_training_gt
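Before launching evaluation or training, the layout can be checked with a short script. This is a sketch under stated assumptions, not code from the repository: `missing_dataset_dirs` is a hypothetical helper, the nesting of the folders is inferred from the listing above, and only the resolution folders of Booster_dataset are checked.

```python
from pathlib import Path

# Relative paths inferred from the listing above (nesting is an assumption).
EXPECTED_DIRS = [
    "KITTI/KITTI_2012/training",
    "KITTI/KITTI_2012/testing",
    "KITTI/KITTI_2015/training",
    "KITTI/KITTI_2015/testing",
    "Booster_dataset/full",
    "Booster_dataset/half",
    "Booster_dataset/quarter",
    "Middlebury/MiddEval3/trainingF",
    "Middlebury/MiddEval3/trainingH",
    "Middlebury/MiddEval3/trainingQ",
    "ETH3D/two_view_training",
    "ETH3D/two_view_training_gt",
]


def missing_dataset_dirs(root="/data", expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset folders:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected dataset folders were found.")
```

If the datasets live somewhere other than `/data`, pass that location as `root`; the search paths in stereo_datasets.py would then need to match.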
python tools/evaluate_stereo.py --config configs/raft_stereo/base.json --restore_ckpt ckpt/dkt-raft/booster_ft.pth --logdir output/eval/dkt-raft
python tools/evaluate_stereo.py --config configs/igev_stereo/base.json --restore_ckpt ckpt/dkt-igev/kitti_ft.pth --logdir output/eval/dkt-igev
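To run both evaluations in one go, the two commands above can be wrapped in a small driver. The following is an illustrative sketch: `build_eval_command` and `run_evaluations` are hypothetical helpers, and the only flags used are the ones shown in the commands above (`--config`, `--restore_ckpt`, `--logdir`).

```python
import subprocess
import sys

# (config, checkpoint, log directory) triples copied from the commands above.
RUNS = [
    ("configs/raft_stereo/base.json", "ckpt/dkt-raft/booster_ft.pth", "output/eval/dkt-raft"),
    ("configs/igev_stereo/base.json", "ckpt/dkt-igev/kitti_ft.pth", "output/eval/dkt-igev"),
]


def build_eval_command(config, ckpt, logdir, python=sys.executable):
    """Build the argument list for one call to tools/evaluate_stereo.py."""
    return [
        python, "tools/evaluate_stereo.py",
        "--config", config,
        "--restore_ckpt", ckpt,
        "--logdir", logdir,
    ]


def run_evaluations(runs=RUNS, dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    commands = [build_eval_command(*run) for run in runs]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first evaluation that fails.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_evaluations(dry_run=False)` from the repository root would execute both evaluations in sequence; the default dry run only prints the commands.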
Booster fine-tuning. The Booster fine-tuning code released here differs from the implementation used for the online submission checkpoints, which follow the cascade training strategy of PCVNet.
bash run_scripts/raft-stereo/ft_booster.sh gpus(0,1) output_dir(/output/raftstereo/booster_ft)
KITTI fine-tuning.
bash run_scripts/igev/ft_kitti.sh gpus(0,1,2,3) output_dir(/output/igevstereo/kitti_ft)
If you find our work useful in your research, please consider citing our paper:
@inproceedings{zhang2024robust,
title={Robust Synthetic-to-Real Transfer for Stereo Matching},
author={Zhang, Jiawei and Li, Jiahe and Huang, Lei and Yu, Xiaohan and Gu, Lin and Zheng, Jin and Bai, Xiao},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={20247--20257},
year={2024}
}
This project is based on IGEV and RAFT-Stereo. Thanks for these great projects!