Paper (arXiv) | Poster (TBA) | Video (TBA)
Authors: Muhammad Akhtar Munir, Salman Khan, Muhammad Haris Khan, Mohsen Ali, Fahad Shahbaz Khan
This paper has been accepted at NeurIPS 2023. This repository contains the PyTorch implementation of our proposed Cal-DETR.
Despite showing impressive predictive performance on several computer vision tasks, deep neural networks (DNNs) are prone to making overconfident predictions. This limits the adoption and wider utilization of DNNs in many safety-critical applications. There have been recent efforts toward calibrating DNNs; however, almost all of them focus on the classification task. Surprisingly, very little attention has been devoted to calibrating modern DNN-based object detectors, especially detection transformers, which have recently demonstrated promising detection performance and are influential in many decision-making systems. In this work, we address the problem by proposing a mechanism for calibrated detection transformers (Cal-DETR), particularly for Deformable-DETR, UP-DETR, and DINO. We pursue the train-time calibration route and make the following contributions. First, we propose a simple yet effective approach for quantifying uncertainty in transformer-based object detectors. Second, we develop an uncertainty-guided logit modulation mechanism that leverages the uncertainty to modulate the class logits. Third, we develop a logit mixing approach that acts as a regularizer with detection-specific losses and is also complementary to the uncertainty-guided logit modulation technique, further improving calibration performance. Finally, we perform extensive experiments on three in-domain and four out-domain scenarios. Results corroborate the effectiveness of Cal-DETR against competing train-time methods in calibrating both in-domain and out-domain detections while maintaining or even improving detection performance.
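The abstract does not spell out the formulas behind uncertainty quantification and logit modulation, so the following is only a minimal sketch under stated assumptions: uncertainty is taken to be the (population) variance of each class logit across the decoder layers, and the final-layer logits are damped by `1 - tanh(uncertainty)`. The function names are hypothetical and the code is plain Python for a single query so that it runs without dependencies; refer to the paper and the repository code for the actual formulation.

```python
import math


def uncertainty_from_layers(layer_logits):
    """Per-class variance of the logits across decoder layers (assumed uncertainty proxy).

    layer_logits: list over decoder layers; each entry is a list of C class
    logits predicted for the same query.
    """
    num_layers = len(layer_logits)
    num_classes = len(layer_logits[0])
    variances = []
    for c in range(num_classes):
        values = [layer[c] for layer in layer_logits]
        mean = sum(values) / num_layers
        # Population variance; an implementation might use the unbiased estimate instead.
        variances.append(sum((v - mean) ** 2 for v in values) / num_layers)
    return variances


def modulate_logits(final_logits, uncertainty):
    """Damp each final-layer logit by (1 - tanh(u)); assumed form of the modulation."""
    return [z * (1.0 - math.tanh(u)) for z, u in zip(final_logits, uncertainty)]


# Two decoder layers, two classes: the layers agree on class 0 and disagree on class 1.
layers = [[2.0, 1.0], [2.0, 3.0]]
uncertainty = uncertainty_from_layers(layers)
modulated = modulate_logits(layers[-1], uncertainty)
```

In this toy case the class-0 logit is left unchanged because the layers agree on it, while the class-1 logit is shrunk because the layers disagree, which is the intended effect of down-weighting uncertain predictions.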
Reliability Diagrams: Selected classes from MS-COCO.
Results report the Detection Expected Calibration Error (D-ECE, lower is better) and box AP (APbox, higher is better) for the in-domain (MS-COCO) and out-domain (CorCOCO) settings.
Methods | D-ECE (MS-COCO) | APbox (MS-COCO) | D-ECE (CorCOCO) | APbox (CorCOCO) | model |
---|---|---|---|---|---|
Baseline | 12.8 | 44.0 | 10.8 | 23.9 | link |
Temp. Scaling | 14.2 | 44.0 | 12.3 | 23.9 | - |
MDCA | 12.2 | 44.0 | 11.1 | 23.5 | link |
MbLS | 15.7 | 44.4 | 12.4 | 23.5 | link |
TCD | 11.8 | 44.1 | 10.4 | 23.8 | link |
Cal-DETR | 8.4 | 44.4 | 8.9 | 24.0 | link |
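To make the D-ECE columns above more concrete, here is a minimal sketch of a confidence-only variant of the metric: detections are binned by confidence, and the gap between precision and mean confidence in each bin is averaged with weights proportional to bin size. This is an illustration under assumptions, not the evaluation code used for the table; the function name `detection_ece`, the number of bins, and the rule for marking a detection as a true positive (for example, matching a ground-truth box at a chosen IoU threshold) are hypothetical, and the full metric may also bin over box properties.

```python
def detection_ece(confidences, is_true_positive, num_bins=10):
    """Confidence-only D-ECE: sum over bins of (bin size / N) * |precision - mean confidence|."""
    if len(confidences) != len(is_true_positive):
        raise ValueError("confidences and is_true_positive must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    conf_sum = [0.0] * num_bins   # sum of confidences per bin
    tp_count = [0] * num_bins     # number of true positives per bin
    count = [0] * num_bins        # number of detections per bin

    for conf, tp in zip(confidences, is_true_positive):
        # Equal-width bins on [0, 1]; a confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * num_bins), num_bins - 1)
        conf_sum[idx] += conf
        tp_count[idx] += 1 if tp else 0
        count[idx] += 1

    ece = 0.0
    for s, tp, n in zip(conf_sum, tp_count, count):
        if n == 0:
            continue  # empty bins contribute nothing
        precision = tp / n
        mean_conf = s / n
        ece += (n / total) * abs(precision - mean_conf)
    return ece


# Four detections at confidence 0.9, of which three are correct:
# precision is 0.75, so the detector is overconfident by 0.15.
overconfident = detection_ece([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
```

The returned value is a fraction in [0, 1]; multiply by 100 when comparing against percentage-style numbers.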
Requirements: This implementation runs on:
Linux, CUDA>=11.0
Python>=3.7
PyTorch>=1.7.0
Deformable-DETR: For complete installation and usage instructions, follow the guidelines here
DINO: Follow the guidelines here
UP-DETR: Follow the guidelines here
For more setup details (training, evaluation, etc.), refer here
Please cite the following if you find this work useful in your research:
@article{munir2023caldetr,
title={Cal-DETR: Calibrated Detection Transformer},
author={ Munir, Muhammad Akhtar and Khan, Salman and Khan, Muhammad Haris and Ali, Mohsen and Khan, Fahad},
journal={Neural Information Processing Systems (NeurIPS)},
year={2023}
}
@article{munir2023bridging,
title={Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection},
author={ Munir, Muhammad Akhtar and Khan, Muhammad Haris and Khan, Salman and Khan, Fahad},
journal={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2023}
}
For any queries, please create an issue or contact akhtar.munir@mbzuai.ac.ae
This codebase is built on Deformable-DETR and Detection Calibration.