A PyTorch implementation of the paper "Towards Interpretable Deep Networks for Monocular Depth Estimation" (ICCV 2021).
arXiv link: https://arxiv.org/abs/2108.05312
For the MFF models, we use the dataset they released here; their pretrained models, which serve as our baselines, can be downloaded here. The BTS models use a different set of NYUv2 training images (24,231 instead of 50,688), which you can download here. All of our models are available here.
In this project we use yacs to manage configurations. To evaluate a model, for example the MFF model with a SENet backbone trained with our assigning method, run
```
python eval.py MODEL_WEIGHTS_FILE [PATH_TO_MODEL/mff_senet_asn]
```
from the root directory.
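As a rough sketch of how yacs merges such `KEY value` command-line overrides into a config (the keys below are hypothetical placeholders, not necessarily the schema actually used by eval.py):

```python
# Minimal yacs sketch: the keys below are hypothetical placeholders,
# not necessarily the actual schema used by eval.py.
import sys
from yacs.config import CfgNode as CN

cfg = CN()
cfg.MODEL_WEIGHTS_FILE = ""   # path to the checkpoint to evaluate
cfg.LAYERS = ""               # layer names, joined by "_"
cfg.ON_TRAINING_DATA = False  # whether to run on the training split

# Command-line arguments come as "KEY value KEY value ..." pairs,
# e.g. ["MODEL_WEIGHTS_FILE", "models/mff_senet_asn"].
cfg.merge_from_list(sys.argv[1:])
cfg.freeze()
```

With this pattern, any option defined in the config can be overridden from the command line without dedicated argparse flags.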
To evaluate the depth selectivity, run
```
python dissect.py MODEL_WEIGHTS_FILE [PATH_TO_MODEL/mff_senet_asn] LAYERS D_MFF ON_TRAINING_DATA True
```
This produces the depth selectivity and the dissection result of each unit. Layer names in LAYERS are separated by _.
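For intuition, here is a minimal sketch of a selectivity-style index computed over depth bins; the binning and the exact formula in dissect.py may differ, so treat this as an illustration rather than the repo's implementation:

```python
# Illustrative depth-selectivity index for one unit; the binning and the
# exact definition in dissect.py may differ from this sketch.
import torch

def depth_selectivity(activations, depths, num_bins=8, max_depth=10.0):
    """activations, depths: flattened per-pixel tensors of equal length,
    assuming non-negative (post-ReLU) activations."""
    bins = torch.clamp((depths / max_depth * num_bins).long(), 0, num_bins - 1)
    means = torch.stack([
        activations[bins == b].mean() if (bins == b).any()
        else activations.new_zeros(())
        for b in range(num_bins)
    ])
    mu_max = means.max()                               # most responsive depth range
    mu_rest = (means.sum() - mu_max) / (num_bins - 1)  # mean over the other ranges
    # Near 0 for a depth-agnostic unit; near 1 for a unit that fires
    # almost exclusively within one depth range.
    return (mu_max - mu_rest) / (mu_max + mu_rest + 1e-8)
```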
To train a model from scratch, run
```
python train.py MODEL_NAME MFF_resnet
```
We currently provide four options for MODEL_NAME, and the training scheme is switched automatically to match the original one when a BTS model is used; see the sketch below.
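As a hypothetical sketch of what that switch could look like (the name prefixes, optimizer, and schedule values are placeholders, not taken from train.py; only the image counts come from the dataset description above):

```python
# Hypothetical sketch of the training-scheme switch; names and values
# below are placeholders, not the repo's actual code.
def training_scheme(model_name: str) -> dict:
    if model_name.startswith("BTS"):
        # BTS models keep the recipe of the original BTS codebase and
        # train on the smaller 24,231-image NYUv2 split.
        return {"optimizer": "adamw", "schedule": "poly", "images": 24231}
    # MFF models keep the recipe of Revisiting_Single_Depth_Estimation
    # and train on the 50,688-image split.
    return {"optimizer": "adam", "schedule": "step", "images": 50688}
```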
The model code is adapted from Revisiting_Single_Depth_Estimation and bts. Some snippets are adapted from monodepth2.
```
@inproceedings{you2021iccv,
  title     = {Towards Interpretable Deep Networks for Monocular Depth Estimation},
  author    = {Zunzhi You and Yi-Hsuan Tsai and Wei-Chen Chiu and Guanbin Li},
  booktitle = {International Conference on Computer Vision (ICCV)},
  year      = {2021}
}
```