[NeurIPS 2022] VTC-LFC: Vision Transformer Compression with Low-Frequency Components

This repository contains PyTorch evaluation code, training code and pretrained models for the following projects:

  • VTC-LFC (Vision Transformer Compression with Low-Frequency Components), NeurIPS 2022

For details, see VTC-LFC: Vision Transformer Compression with Low-Frequency Components by Zhenyu Wang, Hao Luo, Pichao Wang, Feng Ding, Fan Wang, and Hao Li.

Pipeline

(Figure: overall framework of VTC-LFC)

Model Zoo

We provide compressed DeiT models fine-tuned on ImageNet 2012.

name        acc@1  acc@5  #FLOPs  #params  url
DeiT-tiny   71.6   90.7   0.7G    5.1M     model
DeiT-small  79.4   94.6   2.1G    17.7M    model
DeiT-base   81.3   95.3   7.5G    63.5M    model

We also provide the pruned models without fine-tuning.

name                        #FLOPs  #params  url
DeiT-tiny w/o fine-tuning   0.7G    5.1M     model
DeiT-small w/o fine-tuning  2.1G    17.7M    model
DeiT-base w/o fine-tuning   7.5G    63.5M    model
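
The format of the released checkpoints is not documented here; assuming they are standard torch.save files, a minimal sketch for inspecting one before wiring it into the training code (the filename is hypothetical):

import torch

# Load on CPU so no GPU is needed for inspection; the filename is hypothetical.
ckpt = torch.load("deit_tiny_pruned.pth", map_location="cpu")
# Checkpoints are often dicts wrapping the weights under a key such as "model".
state_dict = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt
for name, value in list(state_dict.items())[:5]:
    shape = tuple(value.shape) if hasattr(value, "shape") else value
    print(name, shape)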

Usage

First, clone the repository locally:

git clone https://github.com/Daner-Wang/VTC-LFC.git

Then, install requirements:

pip install -r requirements.txt

We use torch 1.7.1, torchvision 0.8.2, timm 0.3.4, fvcore 0.1.5, and CUDA 11.0, with 16G or 32G V100 GPUs for training and evaluation.
Note that we use torch.cuda.amp to accelerate training, which requires PyTorch >= 1.6.
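
A quick way to confirm the environment matches the pins above; a minimal sketch (the expected versions come from this README):

import torch
import torchvision
import timm
import fvcore  # FLOP-counting utilities listed in the requirements

print("torch:", torch.__version__)              # 1.7.1 expected
print("torchvision:", torchvision.__version__)  # 0.8.2 expected
print("timm:", timm.__version__)                # 0.3.4 expected
print("CUDA available:", torch.cuda.is_available())
# torch.cuda.amp is used to accelerate training and requires PyTorch >= 1.6.
assert hasattr(torch.cuda, "amp"), "PyTorch >= 1.6 required for torch.cuda.amp"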

Data preparation

Download and extract ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout expected by torchvision's datasets.ImageFolder: training images go in the train/ folder and validation images in the val/ folder:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
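
Since the layout follows torchvision's datasets.ImageFolder convention, it can be verified directly; a minimal sketch (the dataset path is a placeholder):

from torchvision import datasets, transforms

# ImageFolder maps each class subdirectory under train/ (or val/) to a label.
train_set = datasets.ImageFolder(
    "/path/to/imagenet/train",
    transform=transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
    ]),
)
print("classes:", len(train_set.classes))  # 1000 for ImageNet-1k
print("images:", len(train_set))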

Pretrained models

You can download the official pretrained models from here.

name        acc@1  acc@5  #params  url
DeiT-tiny   72.2   91.1   5M       model
DeiT-small  79.9   95.0   22M      model
DeiT-base   81.8   95.6   86M      model
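
Alternatively, the same official DeiT weights can be obtained through the hub entry points published by the facebookresearch/deit repository (this requires timm and is not part of this repository's code):

import torch

# Downloads the official DeiT-tiny weights from the facebookresearch/deit hub.
model = torch.hub.load("facebookresearch/deit:main", "deit_tiny_patch16_224", pretrained=True)
model.eval()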

Pruning

To prune a pre-trained DeiT model on ImageNet, run:

python -m torch.distributed.launch --master_port 29500 --nproc_per_node=2 --use_env ./pruning/pruning_py/pruning_deit.py \
--device cuda \
--data-set IMNET \
--batch-size 128 \
--num-classes 1000 \
--dist-eval \
--model deit_tiny_cfged_patch16_224 \
--data-path /path/to/imagenet \
--resume /path/to/pretrained_model \
--prune-part qkv,fc1 \
--prune-criterion lfs \
--prune-block-id 0,1,2,3,4,5,6,7,8,9,10,11 \
--cutoff-channel 0.1 \
--cutoff-token 0.85 \
--lfs-lambda 0.1 \
--num-samples 2000 \
--keep-qk-scale \
--prune-mode bcp \
--allowable-drop 9.5 \
--drop-for-token 0.56 \
--prune --prune-only \
--output-dir /path/to/save
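
To check that a pruned model lands at the reported FLOPs budget (e.g. ~0.7G for DeiT-tiny), fvcore, which is already in the requirements, can count FLOPs; a sketch using the official DeiT-tiny as a stand-in (substitute the pruned model built by this repository's code):

import torch
from fvcore.nn import FlopCountAnalysis

# Stand-in network: the official DeiT-tiny via torch.hub; replace with the
# pruned model loaded through this repository's model-building code.
model = torch.hub.load("facebookresearch/deit:main", "deit_tiny_patch16_224", pretrained=True).eval()
# A single 224x224 image as the probe input for FLOP counting.
dummy = torch.randn(1, 3, 224, 224)
flops = FlopCountAnalysis(model, dummy)
print(f"GFLOPs: {flops.total() / 1e9:.2f}")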

Fine-tuning

To fine-tune a pruned DeiT model on ImageNet, run:

python -m torch.distributed.launch --master_port 29500 --nproc_per_node=8 --use_env ./pruning/pruning_py/pruning_deit.py \
--device cuda \
--epochs 300 \
--lr 0.0001 \
--data-set IMNET \
--num-classes 1000 \
--dist-eval \
--model deit_tiny_cfged_patch16_224 \
--batch-size 512 \
--teacher-model deit_tiny_cfged_patch16_224 \
--distillation-alpha 0.1 \
--distillation-type hard \
--warmup-epochs 0 \
--keep-qk-scale \
--finetune-only \
--output-dir /path/to/save \
--data-path /path/to/imagenet \
--teacher-path /path/to/original_model \
--resume /path/to/pruned_model
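
--distillation-type hard follows the hard-label distillation introduced in the DeiT paper; a schematic of how such a loss is typically combined via --distillation-alpha (an illustration of the general recipe, not the exact code in this repository):

import torch
import torch.nn.functional as F

def hard_distillation_loss(student_logits, teacher_logits, targets, alpha=0.1):
    # Cross-entropy against the ground-truth labels.
    base_loss = F.cross_entropy(student_logits, targets)
    # Hard distillation: the teacher's argmax predictions act as pseudo-labels.
    teacher_labels = teacher_logits.argmax(dim=1)
    distill_loss = F.cross_entropy(student_logits, teacher_labels)
    # alpha balances the two terms (--distillation-alpha 0.1 in the command above).
    return (1 - alpha) * base_loss + alpha * distill_loss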

Citation

If you use this code for a paper, please cite:

@inproceedings{wang2022vtclfc,
  title={{VTC}-{LFC}: Vision Transformer Compression with Low-Frequency Components},
  author={Zhenyu Wang and Hao Luo and Pichao Wang and Feng Ding and Fan Wang and Hao Li},
  booktitle={Thirty-Sixth Conference on Neural Information Processing Systems},
  year={2022},
  url={https://openreview.net/forum?id=HuiLIB6EaOk}
}

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

Contributing

We actively welcome your pull requests!
