CSWinTransformer



1. Overview

CSWinTransformer is a vision Transformer that can serve as a general-purpose backbone for computer vision tasks. It computes self-attention within cross-shaped windows, which is computationally efficient and still yields a global receptive field after only two layers. It also introduces a new positional encoding scheme, LePE (Locally-enhanced Positional Encoding), which further improves model accuracy. Paper
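The two-layer global receptive field can be checked with a small reachability sketch. It is an illustration only, not the model's implementation: the function names `cross_window_mask` and `compose`, the 8x8 feature map, and the stripe width of 2 are all assumptions chosen for the example, and the horizontal and vertical stripes are merged into one mask here, whereas the real network handles them in separate groups of attention heads.

```python
def cross_window_mask(height, width, stripe):
    """Return mask[q][k] == True when key pixel k lies in the horizontal or
    vertical stripe (of thickness `stripe`) that contains query pixel q.
    Hypothetical helper, used only for this illustration."""
    coords = [(r, c) for r in range(height) for c in range(width)]
    return [
        [(qr // stripe == kr // stripe) or (qc // stripe == kc // stripe)
         for kr, kc in coords]
        for qr, qc in coords
    ]


def compose(a, b):
    """Reachability after one attention layer with mask `a` followed by
    another layer with mask `b`."""
    n = len(a)
    return [
        [any(a[q][m] and b[m][k] for m in range(n)) for k in range(n)]
        for q in range(n)
    ]


# Assumed toy setting: 8x8 feature map, stripe width 2.
mask = cross_window_mask(8, 8, 2)
two_layers = compose(mask, mask)

covered_one = all(all(row) for row in mask)        # global after one layer?
covered_two = all(all(row) for row in two_layers)  # global after two layers?
keys_per_query = sum(mask[0])                      # keys seen by one query

print(covered_one, covered_two, keys_per_query)    # False True 28
```

In this toy setting each query attends to 28 of the 64 positions in a single layer (16 in its horizontal stripe, 16 in its vertical stripe, minus the 4 they share), yet every position is reachable after two layers.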

2. Accuracy, FLOPs and Parameters

| Models | Top1 | Top5 | Reference top1 | Reference top5 | FLOPs (G) | Params (M) |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| CSWinTransformer_tiny_224 | 0.8281 | 0.9628 | 0.828 | - | 4.1 | 22 |
| CSWinTransformer_small_224 | 0.8358 | 0.9658 | 0.836 | - | 6.4 | 35 |
| CSWinTransformer_base_224 | 0.8420 | 0.9692 | 0.842 | - | 14.3 | 77 |
| CSWinTransformer_large_224 | 0.8643 | 0.9799 | 0.865 | - | 32.2 | 173.3 |
| CSWinTransformer_base_384 | 0.8550 | 0.9749 | 0.855 | - | 42.2 | 77 |
| CSWinTransformer_large_384 | 0.8748 | 0.9833 | 0.875 | - | 94.7 | 173.3 |