This repository contains the TensorFlow implementation of Boosting Standard Classification Architectures Through a Ranking Regularizer
(formerly known as In Defense of the Triplet Loss for Visual Recognition).
This code employs the triplet loss as a feature-embedding regularizer to boost classification performance. It extends standard architectures, like ResNet and Inception, to support both losses with minimal hyper-parameter tuning. During inference, our network supports both classification and embedding tasks without any computational overhead. Quantitative evaluation highlights a steady improvement on five fine-grained recognition datasets. Further evaluation on an imbalanced video dataset achieves a significant improvement.
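For intuition, here is a minimal sketch of the two-head idea: a shared backbone feeds a softmax classification head and an L2-normalized embedding head, and the embedding head is trained with a batch-hard triplet loss added to the cross-entropy. This is an illustration written in TF 2.x eager style for brevity (the repository targets TF 1.x); the loss weight, margin, and tensor shapes below are assumptions, not the repository's exact values.

```python
import tensorflow as tf

def batch_hard_triplet_loss(labels, embeddings, margin=0.2):
    """Batch-hard triplet loss: for each anchor, use the hardest (farthest)
    positive and the hardest (closest) negative within the mini-batch."""
    # Pairwise squared Euclidean distances, shape [batch, batch].
    dot = tf.matmul(embeddings, embeddings, transpose_b=True)
    sq_norms = tf.linalg.diag_part(dot)
    dists = tf.maximum(sq_norms[:, None] - 2.0 * dot + sq_norms[None, :], 0.0)

    same = tf.cast(tf.equal(labels[:, None], labels[None, :]), tf.float32)
    pos_mask = same - tf.eye(tf.shape(labels)[0])  # same class, excluding self

    hardest_pos = tf.reduce_max(dists * pos_mask, axis=1)
    # Push same-class (and self) distances to a large value before the min.
    hardest_neg = tf.reduce_min(dists + 1e9 * same, axis=1)
    return tf.reduce_mean(tf.maximum(hardest_pos - hardest_neg + margin, 0.0))

# Toy two-head objective on random features; the 1.0 weight is illustrative.
labels = tf.constant([0, 0, 1, 1, 2, 2, 3, 3])
logits = tf.random.normal([8, 100])                           # classification head
emb = tf.nn.l2_normalize(tf.random.normal([8, 128]), axis=1)  # embedding head
ce = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=labels, logits=logits))
total_loss = ce + 1.0 * batch_hard_triplet_loss(labels, emb)
```

At inference time the classification head yields class scores while the embedding head yields a retrieval-ready feature, so both tasks come from a single forward pass.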
- Python 3+ [Tested on 3.4.7]
- TensorFlow 1+ [Tested on 1.8]
Update base_config._load_user_setup with your machine configuration:
- Where is the datasets dir?
- Where is the pre-trained models dir? I use TF-slim pretrained models.
- Where should TF checkpoints be saved?
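As a loose illustration, the kind of information _load_user_setup provides might look like the sketch below; every attribute name here is hypothetical, so check base_config.py for the actual ones.

```python
# Hypothetical sketch of a machine setup for base_config._load_user_setup;
# the attribute names are illustrative, not the module's actual API.
class UserSetup:
    datasets_dir = '/mnt/data/datasets'         # root dir containing all datasets
    pretrained_dir = '/mnt/models/tf_slim'      # TF-slim pretrained checkpoints
    checkpoints_dir = '/mnt/experiments/ckpts'  # where training checkpoints go
```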
The current main.py is configured to use the FGVC-Aircraft dataset. To run the code smoothly, your datasets dir should contain a directory named aircrafts with the following structure:
```
.
├── fgvc-aircraft-2013b
│   └── data
│       └── images
└── lists
```
This is the default directory structure when you download the FGVC-Aircraft dataset, except for the lists dir. The lists dir contains CSV files that define the train, validation, and test splits. For the aircraft dataset, the splits are defined here. A similar splits format for the other datasets is available in this repo.
Most of the dataset, pretrained-model, and checkpoint settings are handled in the base_config.py module. Once these configurations and parameters are set, you should be able to train the network with python main.py
This code achieves the following classification performance with ResNet-50:
| | Cars | Flowers | Dogs | Aircrafts | Birds |
|---|---|---|---|---|---|
| Softmax | 85.85 | 85.68 | 69.76 | 83.22 | 64.23 |
| Two-Head-Center | 88.23 | 85.00 | 70.45 | 84.48 | 65.50 |
| Two-Head-Semi-Hard Triplet | 88.22 | 85.52 | 70.69 | 85.08 | 65.20 |
| Two-Head-Hard Triplet | 89.44 | 86.61 | 72.70 | 87.33 | 66.19 |
The proposed two-head architecture is computationally very cheap: its training time increases over the single-head softmax baseline by approximately 2%. The following images show a quantitative timing analysis comparing the single-head and two-head variants across multiple standard architectures.
- 0.0.1
- CHANGE: Add quantitative results and timing analysis 7 Jan 2020
- CHANGE: Clean code & update README file 07 Sep 2019
- CHANGE: First commit 27 Aug 2019
- Add code comments
- Improve code documentation
- Report quantitative evaluation
If you find this code useful, please cite the following paper:
```bibtex
@inproceedings{taha2020boosting,
  title={Boosting Standard Classification Architectures Through a Ranking Regularizer},
  author={Taha, Ahmed and Chen, Yi-Ting and Misu, Teruhisa and Shrivastava, Abhinav and Davis, Larry},
  booktitle={The IEEE Winter Conference on Applications of Computer Vision},
  pages={758--766},
  year={2020}
}
```
Both tips to improve the code and pull requests to contribute are very welcome:
1. Support TensorFlow 1.4 & 2