Make sure to set up your environment according to the classification README.
Training and evaluation are identical to NAT.
Details will be released soon.
DiNAT is identical to NAT in architecture, except that every other layer uses Dilated NA (DiNA) in place of NA. Compared with NAT, these variants reach similar or better classification accuracy (except for Tiny) and yield significantly better downstream performance.
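
As a rough illustration of "every other layer", the sketch below builds a per-layer dilation list for a single stage. The helper name `dinat_layer_dilations` is hypothetical and not part of this repository, and the choice that the first layer of a stage stays undilated is an assumption; check the config files for the values actually used.

```python
def dinat_layer_dilations(depth: int, dilation: int) -> list[int]:
    """Return one dilation value per layer for a stage with `depth` layers.

    Assumption (not confirmed by this README): layers at even indices keep
    plain NA (dilation 1), layers at odd indices use DiNA with `dilation`.
    """
    if depth < 1 or dilation < 1:
        raise ValueError("depth and dilation must be positive integers")
    return [1 if i % 2 == 0 else dilation for i in range(depth)]


# Example: a 4-layer stage whose dilated layers use a dilation of 8.
pattern = dinat_layer_dilations(depth=4, dilation=8)
print(pattern)
```

A dilation of 1 is ordinary NA, so a list of all ones would correspond to a NAT-style stage.
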
Model | Resolution | Kernel size | # of Params | FLOPs | Pre-training | Top-1 | Config file |
---|---|---|---|---|---|---|---|
DiNAT-Mini | 224x224 | 7x7 | 20M | 2.7G | - | 81.8% | dinat_mini.yml |
DiNAT-Tiny | 224x224 | 7x7 | 28M | 4.3G | - | 82.7% | dinat_tiny.yml |
DiNAT-Small | 224x224 | 7x7 | 51M | 7.8G | - | 83.8% | dinat_small.yml |
DiNAT-Base | 224x224 | 7x7 | 90M | 13.7G | - | 84.4% | dinat_base.yml |
DiNAT-Large | 224x224 | 7x7 | 200M | 30.6G | ImageNet-22K | 86.6% | |
DiNAT-Large | 384x384 | 7x7 | 200M | 89.7G | ImageNet-22K | 87.4% | |
DiNAT-Large | 384x384 | 11x11 | 200M | 92.4G | ImageNet-22K | 87.5% | |
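
The FLOPs reported for the two 7x7 DiNAT-Large rows can be sanity-checked with a little arithmetic. The sketch below assumes that compute grows linearly with the number of input pixels when the kernel size is fixed. This is an assumption made for illustration, not a statement from the authors.

```python
# Values copied from the DiNAT-Large rows (7x7 kernel) of the table above.
flops_224 = 30.6  # GFLOPs at 224x224
flops_384 = 89.7  # GFLOPs at 384x384

# Assumption: cost scales linearly with pixel (token) count at a fixed kernel size.
token_ratio = (384 / 224) ** 2
predicted_384 = flops_224 * token_ratio
relative_gap = abs(predicted_384 - flops_384) / flops_384

print(f"token ratio: {token_ratio:.3f}")
print(f"predicted: {predicted_384:.1f}G, reported: {flops_384}G")
print(f"relative gap: {relative_gap:.2%}")
```

Under that assumption the prediction lands within 1% of the reported figure. The 11x11 row is higher because a larger kernel adds attention cost at the same resolution.
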
DiNATs variants are architecturally identical to Swin, with WSA replaced by NA and SWSA replaced by DiNA. These variants can provide better throughput on CUDA, at the expense of a slightly higher memory footprint and lower performance.
Model | Resolution | Kernel size | # of Params | FLOPs | Pre-training | Top-1 | Config file |
---|---|---|---|---|---|---|---|
DiNATs-Tiny | 224x224 | 7x7 | 28M | 4.5G | - | 81.8% | dinat_s_tiny.yml |
DiNATs-Small | 224x224 | 7x7 | 50M | 8.7G | - | 83.5% | dinat_s_small.yml |
DiNATs-Base | 224x224 | 7x7 | 88M | 15.4G | - | 83.8% | dinat_s_base.yml |
DiNATs-Large | 224x224 | 7x7 | 197M | 34.5G | ImageNet-22K | 86.5% | dinat_s_large.yml |
DiNATs-Large | 384x384 | 7x7 | 197M | 101.5G | ImageNet-22K | 87.4% | dinat_s_large_384.yml |
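
The DiNATs substitution described above can be sketched as a lookup applied to a Swin-style stage. The labels and function names here are hypothetical and are not identifiers from this repository. The only outside fact used is that Swin alternates windowed (WSA) and shifted-window (SWSA) attention, starting with WSA.

```python
# Hypothetical labels for illustration; not identifiers from the repository.
SWIN_TO_DINAT_S = {"WSA": "NA", "SWSA": "DiNA"}


def swin_stage_blocks(depth: int) -> list[str]:
    """Alternating WSA / SWSA pattern of a Swin stage, starting with WSA."""
    return ["WSA" if i % 2 == 0 else "SWSA" for i in range(depth)]


def dinat_s_stage_blocks(depth: int) -> list[str]:
    """Apply the WSA -> NA and SWSA -> DiNA substitution to every block."""
    return [SWIN_TO_DINAT_S[block] for block in swin_stage_blocks(depth)]


print(dinat_s_stage_blocks(6))
```

Everything outside the attention module is left untouched in this sketch, matching the statement that the rest of the architecture follows Swin.
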
Model | # of Params | FLOPs | Top-1 | Config file |
---|---|---|---|---|
NAT-iso-Small | 22M | 4.3G | 80.0% | nat_isotropic_small.yml |
DiNAT-iso-Small | 22M | 4.3G | 80.8% | dinat_isotropic_small.yml |
ViT-rpb-Small | 22M | 4.6G | 81.2% | vit_rpb_small.yml |
NAT-iso-Base | 86M | 16.9G | 81.6% | nat_isotropic_base.yml |
DiNAT-iso-Base | 86M | 16.9G | 82.1% | dinat_isotropic_base.yml |
ViT-rpb-Base | 86M | 17.5G | 82.5% | vit_rpb_base.yml |