Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection
GFocalV2 (GFLV2) is a next generation of GFocalV1 (GFLV1), which utilizes the statistics of learned bounding box distributions to guide the reliable localization quality estimation. For details see GFocalV2.
Backbone | Lr schd | Multi-scale training | box AP | Inf time (fps) | Download |
---|---|---|---|---|---|
R-50 | 1x | No | 41.0 | 19.4 | model |
R-50 | 2x | Yes | 43.9 | 19.4 | model |
R-101 | 2x | Yes | 45.8 | 14.6 | model |
R-101-dcn | 2x | Yes | 48.0 | 12.7 | model |
X-101-dcn | 2x | Yes | 48.8 | 10.7 | model |
R2-101-dcn | 2x | Yes | 49.9 | 10.9 | model |
[1] The reported numbers here are from new experimental trials (in the cleaned repo), which may be slightly different from the original paper.
[2] Note that the 1x performance may be slightly unstable due to insufficient training. In practice, the 2x results are considerably stable between multiple runs.
[3] All results are obtained with a single model and without any test time data augmentation such as multi-scale, flipping and etc..
[4] dcn
denotes deformable convolutional networks.
[5] FPS is tested with a single GeForce RTX 2080Ti GPU, using a batch size of 1.
@article{li2020gfl,
title={Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection},
author={Li, Xiang and Wang, Wenhai and Wu, Lijun and Chen, Shuo and Hu, Xiaolin and Li, Jun and Tang, Jinhui and Yang, Jian},
journal={arXiv preprint arXiv:2006.04388},
year={2020}
}
@article{li2020gflv2,
title={Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection},
author={Li, Xiang and Wang, Wenhai and Hu, Xiaolin and Li, Jun and Tang, Jinhui and Yang, Jian},
journal={arXiv preprint arXiv:2011.12885},
year={2020}
}