Spot-adaptive Knowledge Distillation

Introduction

This repo contains the code for the paper Spot-adaptive Knowledge Distillation (IEEE TIP, 2022). We benchmark 11 state-of-the-art knowledge distillation methods combined with spot-adaptive KD in PyTorch, including:

  • (FitNet) - Fitnets: hints for thin deep nets
  • (AT) - Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer
  • (SP) - Similarity-Preserving Knowledge Distillation
  • (CC) - Correlation Congruence for Knowledge Distillation
  • (VID) - Variational Information Distillation for Knowledge Transfer
  • (RKD) - Relational Knowledge Distillation
  • (PKT) - Probabilistic Knowledge Transfer for deep representation learning
  • (FT) - Paraphrasing Complex Network: Network Compression via Factor Transfer
  • (FSP) - A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning
  • (NST) - Like what you like: knowledge distill via neuron selectivity transfer
  • (CRD) - Contrastive Representation Distillation
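
All of the methods above build on, or are combined with, the classical logit-based KD objective. For reference, below is a minimal PyTorch sketch of that vanilla KD loss; the temperature T and weight alpha are illustrative values only, and this is not code from this repo.

import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
    # Weighted sum of soft-target KL divergence and hard-label cross-entropy.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale by T^2 so gradient magnitudes stay comparable
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard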

Running

1. Obtain the pretrained teacher models by running:

sh train_single.sh 

which trains the teacher models and saves the checkpoints to ./run/$dataset/$seed/$model/ckpt.

The flags in train_single.sh are as follows:

  • seed: specify the random seed.
  • dataset: specify the training dataset.
  • num_classes: give the number of categories of the above dataset.
  • model: specify the model architecture; see 'models/__init__.py' for the available model types.

Note: the default settings are given in the config files at 'configs/$dataset/seed-$seed/single/$model.yml'.
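
Once step 1 has finished, the saved teacher checkpoint can be picked up from the directory above. The sketch below shows one way this could be done; the checkpoint file name ("best.pth") and the example argument values are assumptions for illustration, not names guaranteed by this repo, so check the ckpt directory produced on your machine for the actual files.

import os
import torch

def load_teacher_state(dataset, seed, model, ckpt_name="best.pth"):
    # Directory layout documented above: ./run/$dataset/$seed/$model/ckpt
    ckpt_path = os.path.join("run", dataset, str(seed), model, "ckpt", ckpt_name)
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    # Some training scripts wrap the weights in a dict; fall back to the raw object.
    if isinstance(checkpoint, dict) and "model" in checkpoint:
        return checkpoint["model"]
    return checkpoint

# Example call with hypothetical flag values:
# state = load_teacher_state(dataset="cifar100", seed=0, model="resnet56")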

2. Run our spot-adaptive KD by:

sh train.sh
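
Conceptually, spot-adaptive KD decides per training sample at which distillation spots (e.g., which intermediate layers) the student is distilled. The snippet below is a deliberately simplified, hypothetical illustration of such per-spot gating, meant only as a reading aid; the actual spot-selection policy and training procedure are described in the paper and implemented in this repo, and differ from this sketch.

import torch

def gated_distill_loss(spot_losses, gate_logits):
    # spot_losses: list of S tensors of shape (B,), the per-sample distillation
    #              loss at each candidate spot.
    # gate_logits: tensor of shape (B, S), per-sample scores controlling how
    #              strongly each spot is distilled.
    losses = torch.stack(spot_losses, dim=1)   # (B, S)
    gates = torch.sigmoid(gate_logits)         # soft per-sample, per-spot decisions
    return (gates * losses).sum(dim=1).mean()  # average over the batch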

Citation

@article{song2022spot,
  title={Spot-adaptive knowledge distillation},
  author={Song, Jie and Chen, Ying and Ye, Jingwen and Song, Mingli},
  journal={IEEE Transactions on Image Processing},
  volume={31},
  pages={3359--3370},
  year={2022},
  publisher={IEEE}
}