GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
Official Repo
Code Snippet
The Non-Local Network (NLNet) presents a pioneering approach for capturing long-range dependencies by aggregating query-specific global context to each query position. However, through a rigorous empirical analysis, we have found that the global contexts modeled by the non-local network are almost the same for different query positions within an image. In this paper, we take advantage of this finding to create a simplified network based on a query-independent formulation, which maintains the accuracy of NLNet but with significantly less computation. We further observe that this simplified design shares a similar structure with the Squeeze-Excitation Network (SENet). Hence we unify them into a three-step general framework for global context modeling. Within the general framework, we design a better instantiation, called the global context (GC) block, which is lightweight and can effectively model the global context. The lightweight property allows us to apply it to multiple layers in a backbone network to construct a global context network (GCNet), which generally outperforms both simplified NLNet and SENet on major benchmarks for various recognition tasks. The code and configurations are released at this https URL.
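To make the three-step framework concrete, here is a minimal, framework-free sketch of a GC-style block for a single image. It follows the steps as the paper describes them: query-independent attention pooling, a bottleneck transform, and fusion by broadcast addition. It is an illustration under assumptions, not the repository's implementation. The function name `gc_block`, the weight names `w_k`, `w_v1`, `w_v2`, and the flattened `C x P` list layout are hypothetical. Biases and the learnable LayerNorm scale and shift are omitted for brevity.

```python
import math


def gc_block(x, w_k, w_v1, w_v2, eps=1e-5):
    """Sketch of a GC-style block on one image (names are illustrative).

    x    : C lists of P floats (C channels, P = H*W flattened positions)
    w_k  : C floats, stands in for the 1x1 conv producing attention logits
    w_v1 : (C/r) x C weights, bottleneck "squeeze" projection
    w_v2 : C x (C/r) weights, bottleneck "expand" projection
    """
    c, p = len(x), len(x[0])

    # Step 1, context modeling: one softmax attention map over positions,
    # shared by every query position (query-independent).
    logits = [sum(w_k[ch] * x[ch][pos] for ch in range(c)) for pos in range(p)]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    attn = [e / total for e in exps]
    context = [sum(x[ch][pos] * attn[pos] for pos in range(p)) for ch in range(c)]

    # Step 2, transform: bottleneck projection -> layer norm -> ReLU -> projection.
    hidden = [sum(w * v for w, v in zip(row, context)) for row in w_v1]
    mean = sum(hidden) / len(hidden)
    var = sum((h - mean) ** 2 for h in hidden) / len(hidden)
    hidden = [max((h - mean) / math.sqrt(var + eps), 0.0) for h in hidden]
    delta = [sum(w * v for w, v in zip(row, hidden)) for row in w_v2]

    # Step 3, fusion: add the same per-channel vector to every position.
    return [[x[ch][pos] + delta[ch] for pos in range(p)] for ch in range(c)]


# Toy input: C = 4 channels, P = 3 positions, bottleneck ratio r = 2.
x = [[0.1, 0.2, 0.3], [1.0, -1.0, 0.5], [0.0, 0.4, -0.2], [2.0, 1.0, 0.0]]
w_k = [0.5, -0.3, 0.8, 0.1]
w_v1 = [[0.2, -0.1, 0.4, 0.3], [-0.5, 0.6, 0.1, 0.2]]
w_v2 = [[0.3, -0.2], [0.1, 0.4], [-0.6, 0.5], [0.2, 0.2]]
y = gc_block(x, w_k, w_v1, w_v2)
```

Because the attention map does not depend on the query position, the context vector is computed once per image and the added term `delta` is identical at every position. That is where the savings over a per-query non-local block come from.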
@inproceedings{cao2019gcnet,
title={Gcnet: Non-local networks meet squeeze-excitation networks and beyond},
author={Cao, Yue and Xu, Jiarui and Lin, Stephen and Wei, Fangyun and Hu, Han},
booktitle={Proceedings of the IEEE International Conference on Computer Vision Workshops},
pages={0--0},
year={2019}
}

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | config | download |
| ------ | -------- | --------- | ------: | -------: | -------------: | ---: | ------------: | ------ | -------- |
| GCNet | R-50-D8 | 512x1024 | 40000 | 5.8 | 3.93 | 77.69 | 78.56 | config | model \| log |
| GCNet | R-101-D8 | 512x1024 | 40000 | 9.2 | 2.61 | 78.28 | 79.34 | config | model \| log |
| GCNet | R-50-D8 | 769x769 | 40000 | 6.5 | 1.67 | 78.12 | 80.09 | config | model \| log |
| GCNet | R-101-D8 | 769x769 | 40000 | 10.5 | 1.13 | 78.95 | 80.71 | config | model \| log |
| GCNet | R-50-D8 | 512x1024 | 80000 | - | - | 78.48 | 80.01 | config | model \| log |
| GCNet | R-101-D8 | 512x1024 | 80000 | - | - | 79.03 | 79.84 | config | model \| log |
| GCNet | R-50-D8 | 769x769 | 80000 | - | - | 78.68 | 80.66 | config | model \| log |
| GCNet | R-101-D8 | 769x769 | 80000 | - | - | 79.18 | 80.71 | config | model \| log |

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | config | download |
| ------ | -------- | --------- | ------: | -------: | -------------: | ---: | ------------: | ------ | -------- |
| GCNet | R-50-D8 | 512x512 | 80000 | 8.5 | 23.38 | 41.47 | 42.85 | config | model \| log |
| GCNet | R-101-D8 | 512x512 | 80000 | 12 | 15.20 | 42.82 | 44.54 | config | model \| log |
| GCNet | R-50-D8 | 512x512 | 160000 | - | - | 42.37 | 43.52 | config | model \| log |
| GCNet | R-101-D8 | 512x512 | 160000 | - | - | 43.69 | 45.21 | config | model \| log |

| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | mIoU | mIoU(ms+flip) | config | download |
| ------ | -------- | --------- | ------: | -------: | -------------: | ---: | ------------: | ------ | -------- |
| GCNet | R-50-D8 | 512x512 | 20000 | 5.8 | 23.35 | 76.42 | 77.51 | config | model \| log |
| GCNet | R-101-D8 | 512x512 | 20000 | 9.2 | 14.80 | 77.41 | 78.56 | config | model \| log |
| GCNet | R-50-D8 | 512x512 | 40000 | - | - | 76.24 | 77.63 | config | model \| log |
| GCNet | R-101-D8 | 512x512 | 40000 | - | - | 77.84 | 78.59 | config | model \| log |