About [sam] layer. #3708

Closed
ChenCong7375 opened this issue Aug 5, 2019 · 52 comments
ChenCong7375 commented Aug 5, 2019

I noticed that you added the [sam] layer in darknet. How can we use it?

cfg file with [sam]: yolov3-tiny-sam.cfg.txt

LukeAI commented Aug 6, 2019

I think it's for ThunderNet: #3702

WongKinYiu (Collaborator) commented Aug 6, 2019

(attached image)

Notice that the number of filters must be equal to the number of filters of the from= layer, since the [sam] layer multiplies the two feature maps element-wise.
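
For illustration, a [sam] block in a cfg typically looks like the sketch below; the layer offsets and filter count here are only an example, not copied from the attached yolov3-tiny-sam.cfg:

# 1x1 convolution producing the spatial attention map;
# its filter count must match the layer referenced by from=
[convolutional]
filters=128
size=1
stride=1
activation=logistic

# element-wise multiplication of the attention map (previous layer)
# with the feature map of the from= layer
[sam]
from=-2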

ChenCong7375 (Author):

@WongKinYiu could you please share the cfg file?

WongKinYiu (Collaborator):

yolov3-tiny-sam.cfg.txt
Here you are.

LukeAI commented Aug 8, 2019

@WongKinYiu Thanks for sharing another novel architecture! Would you be kind enough to explain a little about the design? I notice it contains only a single Yolo layer. What is the rough COCO AP / inference time on an RTX?

WongKinYiu (Collaborator) commented Aug 8, 2019

#3380 (comment)

You can compare it with EfficientNet-B0:
#3380 (comment)

By the way, ThunderNet is a two-stage detector.
You may need to do some modifications to make it suitable for YOLO.

LukeAI commented Aug 8, 2019

Oh, I see, this is the CEM + SAM + Yolov3 with 42.0% mAP@0.5 at 2.90 BFLOPs? Sounds great, I'll see how it goes and report back. Have you done any other experimental architectures that you would be happy to share? Do you think it might be improved by trying a PAN-like head?

AlexeyAB (Owner) commented Aug 8, 2019

@LukeAI If you have time, try to train this model (CEM + SAM + Yolov3 with 42.0% mAP@0.5 at 2.90 BFLOPs) on this dataset: #3114 (comment)

Then the result (Loss & mAP chart, BFLOPs) can be added to that table.

ChenCong7375 (Author):

@AlexeyAB Is there a cfg file for CEM + SAM + Yolov3?
I will give it a try.

WongKinYiu (Collaborator) commented Aug 8, 2019

enetb0-cemsam.cfg.txt

Because there is no parameter that lets the up-sampling layer restore the feature maps to the size they had before the global average pooling layer, I use a max-pooling layer instead of the global average pooling layer in CEM.
(#3380 (comment) uses SPP instead of the global average pooling layer.)

If you get an error while training the model, try setting random=0 in the [yolo] layer.
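
In cfg terms, that just means setting the multi-scale key in each [yolo] section to 0 (everything else in the section stays as it is):

[yolo]
# ... mask, anchors, classes, etc. unchanged ...
random=0    # disable multi-scale training; at the time the [sam] layer could not be resized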

LukeAI commented Aug 9, 2019

> yolov3-tiny-sam.cfg.txt
> Here you are.

I try to train with:
./darknet detector train my_stuff/bdd100k.data my_stuff/yolov3-tiny-sam.cfg my_stuff/yolov3-tiny.conv.15 -dont_show -mjpeg_port 8090 -map -i 1

But it immediately aborts with:

...
[yolo] params: iou loss: mse, iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
Total BFLOPS 4.887 
 Allocate additional workspace_size = 1245.71 MB 
Loading weights from my_stuff/yolov3-tiny.conv.15...
 seen 64 
Done!
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
Resizing
608 x 608 
Resizing type 15 
Cannot resize this type of layer: File exists
darknet: ./src/utils.c:293: error: Assertion `0' failed.
....

UPDATE: it works if I set random=0

LukeAI commented Aug 9, 2019

Training now, looking good so far.
What am I missing out on with random=0?
How could I add scale_x_y to this model?

WongKinYiu (Collaborator):

@LukeAI
For using scale_x_y, please see #3114 (comment)

LukeAI commented Aug 11, 2019

@WongKinYiu I mean, I know that the scale models have "scale_x_y = 1.05" or something like that in the Yolo layers, I just don't really understand what an appropriate value would be. I could try with 1.05 and just see how that works? or 1.1?

WongKinYiu (Collaborator):

@LukeAI To set an appropriate value, please see #3293 (comment)
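
For reference, scale_x_y is a per-[yolo]-layer key in the cfg; a minimal sketch, with 1.05 only as an illustrative value (see the linked comment for how to choose it):

[yolo]
# ... mask, anchors, classes, etc. unchanged ...
scale_x_y = 1.05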

LukeAI commented Aug 30, 2019

Hi all,
Here are some experiments I ran a while back using the Berkeley DeepDrive dataset (with a slightly reduced number of classes).

Baseline:
CEM1.cfg.txt
(chart: CEM1)

With anchors generated from the dataset:
CEM_with_anchors.cfg.txt
(chart: CEM_with_anchors)

Using scale_x_y=1.05:
CEM_with_scale.cfg.txt
(chart: CEM_with_scale)

Using swish activations:
CEM_with_swish.cfg.txt
(chart: CEM_with_swish)

For comparison, the same dataset trained with tiny_3l:
(chart: tiny_3l)

and with tiny_pan2:
(chart: tiny_pan2_swish_3)

AlexeyAB (Owner):

@LukeAI So CEM, scale and swish don't give significant improvements?

Is tiny_pan2 the most accurate network?

LukeAI commented Aug 30, 2019

Yeah, tiny_pan2 is a good one; here's hoping for a full-sized pan2 network. I didn't measure the inference time. I guess the point of the CEM network is that it is very fast whilst still being reasonably accurate?

AlexeyAB (Owner):

@LukeAI Just add a comparison table with final accuracy, FLOPS, and inference time.

WongKinYiu (Collaborator):

I think the main improvement is from more anchors/yolo layers.
In my experiments, yolo-v3-tiny-3l gets 5.7% higher mAP@0.5 than yolo-v3-tiny (2l) on a pedestrian detection task.

WongKinYiu (Collaborator) commented Sep 2, 2019

Here are some results for my backbone (evaluated on the COCO test-dev set):

  1. model A with 2l (6 anchors): 45.0% mAP@0.5, 4.04 BFLOPs.
  2. model A with 3l (9 anchors): 46.3% mAP@0.5, 5.03 BFLOPs.
  3. model B with 2l (6 anchors): 46.8% mAP@0.5, 4.76 BFLOPs.
  4. model B with cem (6 anchors): 45.2% mAP@0.5, 4.81 BFLOPs.
  5. model B with cem sam (6 anchors): 46.1% mAP@0.5, 4.90 BFLOPs.
  6. model B with modified cem sam (9 anchors): 48.0% mAP@0.5, 4.95 BFLOPs.

AlexeyAB (Owner) commented Sep 2, 2019

@WongKinYiu

> 6. model B with modified cem sam (9 anchors): 48.0% mAP@0.5, 4.95 BFLOPs.

Thanks!
What modifications did you make in model 6?

WongKinYiu (Collaborator) commented Sep 3, 2019

@AlexeyAB Hello, I am on a business trip; I will share the modified cem sam tonight.

WongKinYiu (Collaborator) commented Sep 3, 2019

@AlexeyAB modified-cem-sam-head.txt

  1. I use SPP instead of global average pooling, because currently this repo cannot support multi-scale training when global average pooling is used as an intermediate layer (see the SPP sketch below).
  2. Since YOLO is a one-stage object detector, I add a [sam] layer for each feature pyramid level.
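
The darknet SPP pattern mentioned in point 1 (parallel stride-1 max-pooling plus a [route] that concatenates the results) looks roughly like the block below; the 5/9/13 kernel sizes are the usual yolov3-spp.cfg choice, not necessarily what modified-cem-sam-head.txt uses:

[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

# concatenate the three pooled maps with the original feature map
[route]
layers=-1,-3,-5,-6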

AlexeyAB (Owner) commented Sep 3, 2019

@WongKinYiu Thanks!
Did you compare inference time (sec) for "2. model A with 3l (9 anchors): 46.3% mAP@0.5, 5.03 BFLOPs" and "6. model B with modified cem sam (9 anchors): 48.0% mAP@0.5, 4.95 BFLOPs"?

WongKinYiu (Collaborator):

@AlexeyAB Hello,
the sam layer is similar to the scale_channels layer: although it adds less than 1% computation, it increases inference time by 20%~30% on GPU. On CPU, they take similar inference time.

AlexeyAB (Owner) commented Sep 3, 2019

@WongKinYiu When you find the best cfg-file, please share it and I will add it to this repository.

WongKinYiu (Collaborator):

@AlexeyAB For best inference speed, I may share this model after discussing it with my team.
(attached image)
It reduces the number of parameters by 45%, computation by 38%, CPU computation time by 37%, GPU computation time by 19%, and TX2 computation time by 25%, while maintaining the same mAP@0.5 as yolo-v3-tiny.
This model achieves 485 fps on a GTX 1080 Ti (batch size = 1).

AlexeyAB (Owner) commented Sep 5, 2019

@WongKinYiu

> If I train the model using the old repo, then validate the model using the new repo, they get worse results, too.

Maybe only the new accuracy checking function is different, and the training is just as good?
I fixed a little in the mAP function.

> GIoU improves mAP@0.5:0.95, but drops mAP@0.5. For some cases, mAP@0.5 is more important.
> PAN2 reduces computation by 13% compared to PAN and reduces mAP@0.5 by 0.5% in my experiment.
> Mixup can not benefit a lightweight model in my experiments.

Did you test it on the MS COCO dataset?

I will add the PAN3 block and a new tiny model today there: #3114 (comment)

WongKinYiu (Collaborator) commented Sep 5, 2019

@AlexeyAB
I upload the predicted bounding boxes to CodaLab.
And when I train the same model several times, the old repo always gets better results.

Yes, all of my experiment results are tested on the MS COCO test-dev set.

AlexeyAB (Owner) commented Sep 5, 2019

@WongKinYiu
I added one more model: #3114 (comment)

cfg: https://github.com/AlexeyAB/darknet/files/3580764/yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou.cfg.txt

It seems it is the best cfg-file for this small dataset: #3114 (comment)

You can try to train it on MS COCO and check the mAP if you have time.

AlexeyAB (Owner) commented Sep 5, 2019

@WongKinYiu Also, can you attach your entire best SAM_CEM model (not only the head)?
I will attach it here and close the issue: #3702

> @AlexeyAB modified-cem-sam-head.txt
>
> 1. I use SPP instead of global average pooling, because currently this repo cannot support multi-scale training when global average pooling is used as an intermediate layer.
> 2. Since YOLO is a one-stage object detector, I add a [sam] layer for each feature pyramid level.

WongKinYiu (Collaborator):

@AlexeyAB

Thank you for sharing a good model (yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou).

After discussing with my team, I cannot share the backbone of #3708 (comment) currently.
I will add the modified-cem-sam head to yolo-v3-tiny and share the cfg later.

WongKinYiu (Collaborator):

@AlexeyAB
I am now training yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou (no sgdr) on the COCO dataset.
I will report the result after training finishes.

AlexeyAB (Owner) commented Sep 7, 2019

@WongKinYiu Try to increase assisted_excitation=4000 to assisted_excitation=20000 or 50000

WongKinYiu (Collaborator) commented Sep 17, 2019

COCO test-dev

| Model | Size | AP@.5:.95 | AP@.5 | AP@.75 |
| --- | --- | --- | --- | --- |
| yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou (no sgdr).txt | 416x416 | 18.8% | 36.8% | 17.5% |

AlexeyAB (Owner):

COCO test-dev

| Model | Size | BFLOPS | Inference time, ms | AP@.5:.95 | AP@.5 | AP@.75 |
| --- | --- | --- | --- | --- | --- | --- |
| yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou (no sgdr).txt | 416x416 | 8.4 | 6.4 | 18.8% | 36.8% | 17.5% |
| yolov3-tiny-prn.cfg.txt | 416x416 | 3.5 | 3.8 | - | 33.1% | - |
| enet-coco.cfg.txt | 416x416 | 3.7 | 22.7 | - | 45.5% | - |

nyj-ocean:

@AlexeyAB
I downloaded the latest repo and set the following in the Makefile:

GPU=1
CUDNN=1
CUDNN_HALF=1
OPENCV=1
AVX=0
OPENMP=0
LIBSO=0
ZED_CAMERA=0

Then I used yolov3-tiny-sam.cfg.txt from #3708 (comment) to train on my own dataset,
but I get the following error:

Total BFLOPS 4.883
Allocate additional workspace_size = 52.43 MB
Loading weights from /home/gc/4-images/9.18/darknet/yolov3-tiny.conv.15...
seen 64
Done! Loaded 23 layers from weights-file
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
If error occurs - run training with flag: -dont_show
Resizing
608 x 608
Resizing type 16
Cannot resize this type of layer:
darknet: ./src/utils.c:297:error:

WongKinYiu (Collaborator):

@nyj-ocean

Yes, you should modify the resize function of the sam layer; otherwise you can only train it with random=0.

nyj-ocean commented Jan 9, 2020

@WongKinYiu

  • sam layers cannot be trained with multi-scale (random=1), is that right?

  • How do I modify the resize function of the sam layer so that it can train with random=1?

WongKinYiu (Collaborator):

@nyj-ocean

Yes.

Just add a case for the sam layer to the resize function in network.c.
It is already defined in sam_layer.c,
so you can simply include it; only a small modification is needed.
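
As a rough sketch of that change (assuming resize_sam_layer() in src/sam_layer.c keeps its existing (layer*, int w, int h) signature and that the per-layer if/else chain in resize_network() follows the same pattern as the other layer types), the addition to src/network.c would look something like:

/* at the top of src/network.c, if not already included */
#include "sam_layer.h"

/* inside the per-layer if/else chain of resize_network(): */
else if (l.type == SAM) {
    /* dispatch to the resize routine already provided by sam_layer.c */
    resize_sam_layer(&l, w, h);
}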

nyj-ocean:

@WongKinYiu
Thanks a lot

nyj-ocean commented Jan 9, 2020

@AlexeyAB
I noticed another module: CBAM (Convolutional Block Attention Module).
(attached images: cbam1, cbam2)

Is there a need to add the CBAM module to this repo?

CBAM: Convolutional Block Attention Module.pdf
code: https://github.com/Jongchan/attention-module

WongKinYiu (Collaborator) commented Jan 9, 2020

The kernel functions of the CAM module and the SAM module are SE (squeeze-and-excitation) and SAM, respectively, which are already supported by this repo.

AlexeyAB (Owner) commented Jan 9, 2020

I added resizing (random=1) for [sam] layers.

924175302:

@nyj-ocean
Hello, have you tried the CBAM module in YOLOV4?

924175302:

@WongKinYiu
You mentioned that this repo already supports the SE module, but I can't find the relevant code or how to use it. Could you help me with that? Thank you.

WongKinYiu (Collaborator):

Squeeze-and-Excitation blocks (layers: [avgpool]->[conv]->[conv]->[scale_channels])

AlexeyAB (Owner):

@924175302 Example of SE
https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/enet-coco.cfg

#squeeze-n-excitation
[avgpool]

# squeeze ratio r=16 (recommended r=16)
[convolutional]
filters=24
size=1
stride=1
activation=swish

# excitation
[convolutional]
filters=384
size=1
stride=1
activation=logistic

# multiply channels
[scale_channels]
from=-4
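
Reading the block above: the squeeze convolution has 384/16 = 24 filters (the r=16 ratio in the comment), and the excitation convolution restores 384 filters, so that [scale_channels] from=-4 can multiply each of the 384 channels of the feature map four layers back by its learned per-channel weight.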

cenit closed this as completed on Jan 23, 2021
tony71200:

> @924175302 Example of SE https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/enet-coco.cfg
> (SE block quoted above: [avgpool] -> [convolutional] -> [convolutional] -> [scale_channels])

(attached chart: chart_modified_yolo_se_20211017)

Hi everyone, I have a problem when I train a model that uses SE in its architecture: the program cannot calculate mAP, and I don't know why. Please help me.
