CSWinTransformer

Catalogue

1. Overview
2. Accuracy, FLOPs and Parameters

1. Overview

CSWinTransformer is a new visual Transformer network that can be used as a general backbone network in the field of computer vision. CSWinTransformer proposes to do self-attention through a cross-shaped window, which not only has a very high computational efficiency, but also can obtain a global receptive field through two-layer calculation. CSWinTransformer also proposed a new encoding method: LePE, which further improved the accuracy of the model. Paper

2. Accuracy, FLOPs and Parameters

Models	Top1	Top5	Reference top1	Reference top5	FLOPs (G)	Params (M)
CSWinTransformer_tiny_224	0.8281	0.9628	0.828	-	4.1	22
CSWinTransformer_small_224	0.8358	0.9658	0.836	-	6.4	35
CSWinTransformer_base_224	0.8420	0.9692	0.842	-	14.3	77
CSWinTransformer_large_224	0.8643	0.9799	0.865	-	32.2	173.3
CSWinTransformer_base_384	0.8550	0.9749	0.855	-	42.2	77
CSWinTransformer_large_384	0.8748	0.9833	0.875	-	94.7	173.3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CSWinTransformer_en.md

CSWinTransformer_en.md

CSWinTransformer

Catalogue

1. Overview

2. Accuracy, FLOPs and Parameters

Files

CSWinTransformer_en.md

Latest commit

History

CSWinTransformer_en.md

File metadata and controls

CSWinTransformer

Catalogue

1. Overview

2. Accuracy, FLOPs and Parameters