Add ASFF (three fused feature layers) in the Head for V5(s,m,l,x) #2348

positive666 · 2021-03-03T06:04:00Z

🚀 Feature

Add ASFF feature fusion layers to the Head: the level 1 to level 3 scale maps are each fused into 3 feature maps of the corresponding scales, and the fusion weights are adjusted adaptively.

Motivation

See the feature fusion approach of yolov3_asff (paper).
Add four optional yolov5_asff model structures (as yaml files).
The ASFF method is well suited to the YOLO series, and from reading the paper I found that it has a reasonable explanation behind it. It can be incorporated into V5 as an alternative structure.
Integrate the ASFF functions into the project, in the hope of contributing to the YOLOv5 project.

Pitch

I added the ASFFV5 class at line 310 of https://github.com/positive666/yolov5/blob/master/models/common.py.
This adds an ASFF layer structure for yolov5 (s, m, l, x), integrated into the YOLOv5 code base. It differs from v3_asff and adds an RFB block. For example, in yolov5s.yaml:

head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, ASFFV5, [0, 512, 0.5]],  # 24 (ASFF level 0)
   [[17, 20, 23], 1, ASFFV5, [1, 256, 0.5]],  # 25 (ASFF level 1)
   [[17, 20, 23], 1, ASFFV5, [2, 128, 0.5]],  # 26 (ASFF level 2)
   # [[17, 20, 23], 1, Detect, [nc, anchors]],  # original Detect(P3, P4, P5)
   [[26, 25, 24], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
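To make the layer numbering in this head easier to follow, here is a minimal sketch that resolves each `from` entry to an absolute layer index. It assumes the backbone occupies indices 0 to 9, which is consistent with the `# 13`, `# 17`, `# 20` and `# 23` comments in the yaml; the `resolve` helper is hypothetical and is not part of the YOLOv5 code.

```python
# Head layer specs copied from the yaml above as (from, module name) pairs.
# ASSUMPTION: the backbone occupies layer indices 0-9, so the head starts at 10.
BACKBONE_LAYERS = 10

head = [
    (-1, "Conv"), (-1, "nn.Upsample"), ([-1, 6], "Concat"), (-1, "C3"),   # 10-13
    (-1, "Conv"), (-1, "nn.Upsample"), ([-1, 4], "Concat"), (-1, "C3"),   # 14-17
    (-1, "Conv"), ([-1, 14], "Concat"), (-1, "C3"),                       # 18-20
    (-1, "Conv"), ([-1, 10], "Concat"), (-1, "C3"),                       # 21-23
    ([17, 20, 23], "ASFFV5"),                                             # 24
    ([17, 20, 23], "ASFFV5"),                                             # 25
    ([17, 20, 23], "ASFFV5"),                                             # 26
    ([26, 25, 24], "Detect"),                                             # 27
]


def resolve(head_layers, offset):
    """Return (absolute index, module name, absolute input indices) per layer."""
    resolved = []
    for index, (source, name) in enumerate(head_layers, start=offset):
        sources = source if isinstance(source, list) else [source]
        # A negative "from" is relative to the current layer (-1 = previous layer).
        absolute = [index + s if s < 0 else s for s in sources]
        resolved.append((index, name, absolute))
    return resolved


layers = resolve(head, BACKBONE_LAYERS)
asff_indices = [index for index, name, _ in layers if name == "ASFFV5"]
detect_inputs = layers[-1][2]

print("ASFFV5 layers:", asff_indices)
print("Detect inputs:", detect_inputs)
```

Under that assumption the three ASFFV5 layers land at 24, 25 and 26, so `Detect` reads them in reverse order. That would match the `Detect(P3, P4, P5)` comment if ASFF level 0 is the coarsest map, which the channel counts (512, 256, 128) suggest but the thread does not state.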

ASFF Interpretability

The paper also explains why the fusion weights are computed from the output features by a convolution: the fusion weights and the features are closely related.
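As a rough illustration of that idea, the sketch below fuses three feature vectors at a single spatial position with softmax-normalised weights that are themselves computed from the features. It is a pure-Python simplification and not the ASFFV5 class from the fork: the three maps are assumed to be already resized to a common scale, and a single dot product per level stands in for the 1x1 convolutions (the `kernels` argument is hypothetical).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of scalars."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weight_logits_from_features(features, kernels):
    """One scalar logit per level, computed from that level's own feature vector.

    ASSUMPTION: a dot product per level replaces the 1x1 convolutions.
    """
    return [sum(w * x for w, x in zip(kernel, feat))
            for kernel, feat in zip(kernels, features)]


def fuse_at_position(features, kernels):
    """Weighted sum of three same-length feature vectors at one position."""
    alpha, beta, gamma = softmax(weight_logits_from_features(features, kernels))
    level0, level1, level2 = features
    fused = [alpha * a + beta * b + gamma * c
             for a, b, c in zip(level0, level1, level2)]
    return fused, (alpha, beta, gamma)


features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
zero_kernels = [[0.0, 0.0]] * 3  # equal logits, so every level gets weight 1/3
fused, weights = fuse_at_position(features, zero_kernels)
print(fused, weights)
```

Because the weights pass through a softmax, they are positive and sum to 1 at every position, so the fused value is always a convex combination of the three levels.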

COCO

System	test-dev mAP	Time (V100)	Time (2080ti)
YOLOv3 608	33.0	20ms	26ms
YOLOv3 608+ BoFs	37.0	20ms	26ms
YOLOv3 608 (our baseline)	38.8	20ms	26ms
YOLOv3 608+ ASFF	40.6	22ms	30ms
YOLOv3 608+ ASFF*	42.4	22ms	30ms
YOLOv3 800+ ASFF*	43.9	34ms	38ms
YOLOv3 MobileNetV1 416 + BoFs	28.6	-	22 ms
YOLOv3 MobileNetV2 416 (our baseline)	29.0	-	22 ms
YOLOv3 MobileNetV2 416 +ASFF	30.6	-	24 ms

I also plan to add other tricks, such as aware IOU and some transformer ideas, and I will run more experiments and make further changes in the future.

github-actions · 2021-03-03T06:04:43Z

👋 Hello @positive666, thank you for your interest in 🚀 YOLOv5! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Google Colab and Kaggle notebooks with free GPU
Google Cloud Deep Learning VM. See GCP Quickstart Guide
Amazon Deep Learning AMI. See AWS Quickstart Guide
Docker Image. See Docker Quickstart Guide

Status

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.

cszer · 2021-03-03T14:53:41Z

Hello, check the issues in the yolov4 repo: the authors of ASFF used the full bag of freebies, and standalone ASFF adds only 0.5 mAP.

positive666 · 2021-03-03T15:22:10Z

> Hello, check the issues in the yolov4 repo: the authors of ASFF used the full bag of freebies, and standalone ASFF adds only 0.5 mAP.

I am very happy to receive your reply. Yes, I have verified similar conclusions on some datasets. I want to integrate this module into V5 for the convenience of later research, and my first change is to add ASFF after PANet. The output of this ASFFV5 layer differs from the V3 version, and I still need to study and understand it further. I originally wanted to add BiFPN, but I think the extra feature layers and dense connections would increase training time. Thank you for your reply.

glenn-jocher · 2021-03-05T22:35:38Z

@positive666 thanks for the idea! I see you submitted a PR, I will take a look there.

I experimented with ASFF with YOLOv3 before, but had difficulty implementing it as we used to build our pytorch models from the darknet cfg files, which placed the output layers in very different places in the model.

I think now with all the output layers located in the Detect() layer, an ASFF implementation should be a bit easier to do.

positive666 · 2021-03-07T05:15:16Z

@glenn-jocher Thank you for your reply. I'm now verifying this on COCO.
I have another question. My first test used a cigarette detection dataset of 5000 images, trained for 300 epochs, and the mAP stayed at about 0.7. I didn't add any extra training data; I just wanted to check the effect, and adding ASFF did not improve the result significantly. One thought I have is that even an identical mAP can't guarantee the same inference performance later on. I have now added some lightweight attention modules, which are not yet in the PR, and I will continue with more experiments.

glenn-jocher · 2021-03-08T02:24:58Z

@positive666 I think what you're mentioning is generalization of your results to the wider world. Typically this is why COCO is used as a benchmark, as it overlaps many common use cases. It takes a long time to train though, so if you want to prototype results quickly I would recommend VOC, which still generalizes somewhat, but is much smaller and faster to train. You can train VOC in Colab in less than a day, especially the smaller models:

https://colab.research.google.com/github/ultralytics/yolov5/blob/master/tutorial.ipynb?hl=en#scrollTo=BSgFCAcMbk1R

# VOC
for b, m in zip([64, 48, 32, 16], ['yolov5s', 'yolov5m', 'yolov5l', 'yolov5x']):  # zip(batch_size, model)
  !python train.py --batch {b} --weights {m}.pt --data voc.yaml --epochs 50 --cache --img 512 --nosave --hyp hyp.finetune.yaml --project VOC --name {m}
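The loop above uses notebook `!` syntax, which only works inside Colab or IPython. As a sketch of the same sweep for a plain Python script, the commands can be built as argument lists; the flags are copied from the notebook cell, while the `voc_command` helper is hypothetical and the actual launch is left commented out because it assumes the script runs from the yolov5 repository root.

```python
batch_sizes = [64, 48, 32, 16]
models = ["yolov5s", "yolov5m", "yolov5l", "yolov5x"]


def voc_command(batch, model):
    """Build the train.py argument list for one model (flags from the cell above)."""
    return [
        "python", "train.py",
        "--batch", str(batch),
        "--weights", f"{model}.pt",
        "--data", "voc.yaml",
        "--epochs", "50",
        "--cache",
        "--img", "512",
        "--nosave",
        "--hyp", "hyp.finetune.yaml",
        "--project", "VOC",
        "--name", model,
    ]


commands = [voc_command(b, m) for b, m in zip(batch_sizes, models)]

for command in commands:
    print(" ".join(command))
    # To actually train, run from the yolov5 repo root:
    # import subprocess; subprocess.run(command, check=True)
```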

cszer · 2021-03-08T22:35:19Z

I have tried your modification together with my own modifications, which are aimed at small-object detection. I achieved 30.5 mAP@0.5:0.95 on small objects (COCO) with 75 GFLOPs (this module adds 20 GFLOPs), but an ablation is needed to verify its impact.

cszer · 2021-03-11T17:18:20Z

I have done the ablation: this module is useless. It adds only 0.4 mAP to the small-object mAP@0.5:0.95 for 20 GFLOPs.
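To put those figures side by side, here is a small back-of-the-envelope calculation. It assumes the 75 GFLOPs total and the 30.5 small-object mAP were both measured with the module included, which the comments above do not state explicitly.

```python
# Figures quoted in the two comments above.
total_gflops = 75.0      # ASSUMED to include the ASFF module
module_gflops = 20.0     # cost attributed to the module
map_small_with = 30.5    # small-object mAP@0.5:0.95, ASSUMED measured with the module
map_gain = 0.4           # gain attributed to the module by the ablation

baseline_gflops = total_gflops - module_gflops
compute_increase = module_gflops / baseline_gflops
relative_map_gain = map_gain / (map_small_with - map_gain)
gain_per_gflop = map_gain / module_gflops

print(f"baseline compute: {baseline_gflops:.0f} GFLOPs")
print(f"extra compute:    {compute_increase:.1%}")
print(f"relative mAP gain: {relative_map_gain:.1%}")
print(f"mAP per added GFLOP: {gain_per_gflop:.3f}")
```

Under these assumptions the module costs roughly 36% more compute for a relative gain of a little over 1% in small-object mAP.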

cszer · 2021-03-11T17:19:55Z

I think the best target to study now is replacing convolutions with involutions.

glenn-jocher · 2021-03-11T20:39:46Z

@cszer involutions?

cszer · 2021-03-11T22:12:52Z

> @cszer involutions?

Yes, check this paper https://arxiv.org/abs/2103.06255

glenn-jocher · 2021-03-11T22:20:46Z

@cszer wow! Just out yesterday. Thanks for the link.

cszer · 2021-03-11T22:32:03Z

> @cszer wow! Just out yesterday. Thanks for the link.

10 telegram channels help me a lot))

glenn-jocher · 2021-03-11T22:44:39Z

@cszer what 10 telegram channels?

Paper seems interesting, a nice bridge between attention (across channels) and convolutions (across image space). AP increase is slight, but it's also accompanied by slight size and FLOPS reductions.
https://github.com/d-li14/involution#object-detection-and-instance-segmentation-on-coco
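For readers unfamiliar with the operator, the following is a simplified 1-D, single-group sketch of an involution in pure Python. It is not the module from the involution repo: the kernel at each position is generated from the feature vector at that position (here by one linear map, assumed in place of the paper's bottleneck), and the same kernel taps are then shared across all channels. The function and parameter names are hypothetical.

```python
def involution_1d(x, kernel_weights):
    """Simplified 1-D involution with a single group.

    x: list of positions, each a list of C channel values.
    kernel_weights: K rows of C weights; row k produces kernel tap k from the
        feature vector at the current position (ASSUMED single linear map).
    """
    taps = len(kernel_weights)
    half = taps // 2
    channels = len(x[0])
    zero = [0.0] * channels
    out = []
    for i, center in enumerate(x):
        # The kernel is generated from the feature at this position,
        # so it differs from position to position.
        kernel = [sum(w * v for w, v in zip(row, center)) for row in kernel_weights]
        fused = [0.0] * channels
        for k, tap in enumerate(kernel):
            j = i + k - half
            neighbour = x[j] if 0 <= j < len(x) else zero  # zero padding
            # The same tap is applied to every channel.
            for c in range(channels):
                fused[c] += tap * neighbour[c]
        out.append(fused)
    return out


x = [[1.0, 2.0], [3.0, 4.0]]
# Only the centre tap is non-zero, and it equals channel 0 of the centre feature.
kernel_weights = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]]
y = involution_1d(x, kernel_weights)
print(y)
```

This is the reverse of an ordinary convolution, whose kernel is shared across positions but specific to each channel.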

glenn-jocher · 2021-03-11T23:09:01Z

@cszer I've raised issue #1 on the involutions repo (yay): d-li14/involution#1

The straightforward implementation seems to be to use this involution() module here, replacing the MMDetection Conv modules with the local YOLOv5 Conv() module:
https://github.com/d-li14/involution/blob/main/det/mmdet/models/utils/involution_naive.py

glenn-jocher · 2021-03-12T00:21:11Z

@cszer I've created an Involution PR #2435 to experiment.

positive666 · 2021-03-12T03:34:43Z

> I have tried your modification together with my own modifications, which are aimed at small-object detection. I achieved 30.5 mAP@0.5:0.95 on small objects (COCO) with 75 GFLOPs (this module adds 20 GFLOPs), but an ablation is needed to verify its impact.

> I have done the ablation: this module is useless. It adds only 0.4 mAP to the small-object mAP@0.5:0.95 for 20 GFLOPs.

positive666 · 2021-03-12T03:35:55Z

> I have done the ablation: this module is useless. It adds only 0.4 mAP to the small-object mAP@0.5:0.95 for 20 GFLOPs.

@glenn-jocher @cszer Hello, I previously trained the v5 small model on VOC and ran some related ablation comparisons, and the AP improvement on the test set is indeed small (adding CBAM on its own without pretrained weights, evaluated on the VOC2007 test set, using an already trained yolov5):

cbam_v5s: mAP@.5: 0.56, mAP@.5:.95: 0.3, 16.6 GFLOPs (without loading weights)
asff_v5s: mAP@.5: 0.56, mAP@.5:.95: 0.38, 20 GFLOPs

But I feel that my own experiments on V5s are not sufficient, and these simple experiments cannot explain why the attention mechanism fails to help. I have been busy recently, but I will continue the verification; so far I have only added ASFF and CBAM once each as a simple ablation. This attempt prompted some exploration and thinking. I have started to pay attention to some of the difficulties in anchor-based object detection: the introduction of positive samples, and the fact that classification and regression are independent yet interfere with each other. My thinking concerns the weak correlation between classification and regression in detection. I plan to use these attention mechanisms to improve the classification and regression loss, for example with Aware-IOU. Thank you for your feedback.

glenn-jocher · 2021-03-12T05:34:41Z

@positive666 those mAPs seem pretty low. The baseline VOC training script (below) will train YOLOv5s to about 0.85 mAP@0.5 (and YOLOv5x to about 0.92 mAP@0.5):

https://colab.research.google.com/github/ultralytics/yolov5/blob/master/tutorial.ipynb?hl=en#scrollTo=BSgFCAcMbk1R

# VOC
for b, m in zip([64, 48, 32, 16], ['yolov5s', 'yolov5m', 'yolov5l', 'yolov5x']):  # zip(batch_size, model)
  !python train.py --batch {b} --weights {m}.pt --data voc.yaml --epochs 50 --cache --img 512 --nosave --hyp hyp.finetune.yaml --project VOC --name {m}

glenn-jocher · 2021-03-12T05:36:41Z

@positive666 BTW, you can see these VOC training logs here:
https://wandb.ai/glenn-jocher/VOC

developer0hye · 2021-05-16T06:25:46Z

@positive666 @glenn-jocher

How about the attention layer proposed in ECANet?

Someone has already checked its performance with yolov3-tiny.

Look at these results.

positive666 · 2021-05-17T01:40:20Z

In general, the improvement that attention modules give over the baseline on public datasets with YOLOv5 is almost negligible. You can try it. There is indeed ECA code in my fork, but I have not registered it or tried it. I tried CBAM and COORD; the latter may behave a little more normally, but there is no improvement. My personal view is that YOLOv5's backbone has already been trained to generalize well. You can also train it yourself! Good luck.

phunix9 · 2021-05-24T10:31:16Z

> In general, the improvement that attention modules give over the baseline on public datasets with YOLOv5 is almost negligible. You can try it. There is indeed ECA code in my fork, but I have not registered it or tried it. I tried CBAM and COORD; the latter may behave a little more normally, but there is no improvement. My personal view is that YOLOv5's backbone has already been trained to generalize well. You can also train it yourself! Good luck.

@positive666 Hello, thank you for your contribution. I have a question after adding the CBAM layer according to your code. When I start training, the loss becomes NaN after some number of epochs (for example 10 or 100). However, when I use yolov5s.yaml without the CBAM layer, training succeeds. I wonder if you know the reason. Thanks!

farajist · 2021-10-10T14:08:33Z

@phunix9 did you find a solution to the NaN loss issue?

positive666 added the enhancement New feature or request label Mar 3, 2021

positive666 mentioned this issue Mar 3, 2021

Add ASFF(Adaptively Spatial Feature Fusion) layers in Head for YoloV5 and some attention modules #2349

Closed

positive666 closed this as completed Mar 12, 2021

This was referenced Apr 11, 2021

YOLOv5 v5.0 Release #2762

Merged

YOLOv5 v5.0 release compatibility update for YOLOv3 ultralytics/yolov3#1737

Merged