DiNAT - Classification

Make sure to set up your environment according to the classification README.

Training and evaluation on ImageNet-1K

Training and evaluation are identical to NAT.

Training on ImageNet-22K

Details will be released soon.

Checkpoints

DiNAT

DiNAT is identical to NAT in architecture, with every other layer replaced with Dilated NA. These variants provide similar or better classification accuracy than NAT (except for Tiny), and yield significantly better downstream performance.
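The alternation described above can be sketched as follows. This is a minimal illustration rather than code from this repository: the function name `attention_pattern` is hypothetical, and it assumes each stack starts with a plain NA layer, so that the odd-indexed layers are the ones using Dilated NA (DiNA).

```python
def attention_pattern(depth):
    """Return the attention type used by each layer in a stack of `depth` layers.

    Assumption (not stated in this README): the stack starts with plain NA,
    so every other layer (indices 1, 3, 5, ...) uses Dilated NA (DiNA).
    """
    return ["DiNA" if i % 2 else "NA" for i in range(depth)]


# Example: a four-layer stack alternates NA and DiNA.
print(attention_pattern(4))
```

The dilation value used by each DiNA layer is not covered here; the config files listed below remain the reference for the actual settings.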

| Model | Resolution | Kernel size | # of Params | FLOPs | Pre-training | Top-1 | Config file |
|---|---|---|---|---|---|---|---|
| DiNAT-Mini | 224x224 | 7x7 | 20M | 2.7G | - | 81.8% | dinat_mini.yml |
| DiNAT-Tiny | 224x224 | 7x7 | 28M | 4.3G | - | 82.7% | dinat_tiny.yml |
| DiNAT-Small | 224x224 | 7x7 | 51M | 7.8G | - | 83.8% | dinat_small.yml |
| DiNAT-Base | 224x224 | 7x7 | 90M | 13.7G | - | 84.4% | dinat_base.yml |
| DiNAT-Large | 224x224 | 7x7 | 200M | 30.6G | ImageNet-22K | 86.6% | |
| DiNAT-Large | 384x384 | 7x7 | 200M | 89.7G | ImageNet-22K | 87.4% | |
| DiNAT-Large | 384x384 | 11x11 | 200M | 92.4G | ImageNet-22K | 87.5% | |

DiNATs

The DiNATs variants are architecturally identical to Swin, with WSA replaced by NA and SWSA replaced by DiNA. They can provide better throughput on CUDA, at the expense of a slightly higher memory footprint and lower accuracy (for example, DiNATs-Tiny reaches 81.8% top-1 versus 82.7% for DiNAT-Tiny).
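The substitution can be pictured as a simple one-to-one mapping over Swin's attention modules. The sketch below is illustrative only: `SWIN_TO_DINAT_S` and `to_dinat_s` are hypothetical names, and the layer labels are plain strings rather than real modules from either codebase.

```python
# Hypothetical lookup table: each Swin attention type and its DiNATs replacement.
SWIN_TO_DINAT_S = {
    "WSA": "NA",     # window self-attention -> neighborhood attention
    "SWSA": "DiNA",  # shifted-window self-attention -> dilated neighborhood attention
}


def to_dinat_s(swin_layers):
    """Map a sequence of Swin attention labels to their DiNATs counterparts."""
    return [SWIN_TO_DINAT_S[name] for name in swin_layers]


# Example: a Swin stage that alternates WSA and SWSA.
print(to_dinat_s(["WSA", "SWSA", "WSA", "SWSA"]))
```

Everything else in the Swin architecture is left unchanged under this mapping, which is the sense in which the two are described as architecturally identical.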

| Model | Resolution | Kernel size | # of Params | FLOPs | Pre-training | Top-1 | Config file |
|---|---|---|---|---|---|---|---|
| DiNATs-Tiny | 224x224 | 7x7 | 28M | 4.5G | - | 81.8% | dinat_s_tiny.yml |
| DiNATs-Small | 224x224 | 7x7 | 50M | 8.7G | - | 83.5% | dinat_s_small.yml |
| DiNATs-Base | 224x224 | 7x7 | 88M | 15.4G | - | 83.8% | dinat_s_base.yml |
| DiNATs-Large | 224x224 | 7x7 | 197M | 34.5G | ImageNet-22K | 86.5% | dinat_s_large.yml |
| DiNATs-Large | 384x384 | 7x7 | 197M | 101.5G | ImageNet-22K | 87.4% | dinat_s_large_384.yml |

Isotropic variants

| Model | # of Params | FLOPs | Top-1 | Config file |
|---|---|---|---|---|
| NAT-iso-Small | 22M | 4.3G | 80.0% | nat_isotropic_small.yml |
| DiNAT-iso-Small | 22M | 4.3G | 80.8% | dinat_isotropic_small.yml |
| ViT-rpb-Small | 22M | 4.6G | 81.2% | vit_rpb_small.yml |
| NAT-iso-Base | 86M | 16.9G | 81.6% | nat_isotropic_base.yml |
| DiNAT-iso-Base | 86M | 16.9G | 82.1% | dinat_isotropic_base.yml |
| ViT-rpb-Base | 86M | 17.5G | 82.5% | vit_rpb_base.yml |