
shufflenetV2: an extremely light-weight architecture | Implementation #3750

Open
gmayday1997 opened this issue Aug 11, 2019 · 69 comments
Labels: want enhancement (Want to improve accuracy, speed or functionality)

@gmayday1997

shufflenetV2: Practical Guidelines for Efficient CNN Architecture Design
paper: https://arxiv.org/abs/1807.11164
source code(caffe): https://github.com/miaow1988/ShuffleNet_V2_pytorch_caffe
I have implemented the channel_shuffle and channel_slice layers. Anyone who is interested in this work can try them.

How to use

basicunit darknetcfg
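
For reference, here is a minimal sketch of one ShuffleNetV2 basic unit written as a darknet.CG cfg fragment. It is pieced together from the channel_slice cfg example and the network printout posted later in this thread; the filter counts and activations are illustrative, and the exact fields accepted by [channel_shuffle] are not shown anywhere in the thread, so that block is only a placeholder.

# Sketch of one basic unit on a 32-channel input (illustrative values).
# Identity branch: channels 0..16 of the previous layer.
[channel_slice]
from=-1
axis=1
start=0
end=16

# Processed branch: channels 16..32 of the same layer.
[channel_slice]
from=-2
axis=1
start=16
end=32

# 1x1 conv on the processed branch.
[convolutional]
batch_normalize=1
filters=16
size=1
stride=1
pad=0
activation=relu

# 3x3 depthwise conv, no activation.
[convolutional]
batch_normalize=1
filters=16
groups=16
size=3
stride=1
pad=1
activation=linear

# 1x1 conv.
[convolutional]
batch_normalize=1
filters=16
size=1
stride=1
pad=0
activation=relu

# Concatenate the identity branch (-5) with the processed branch (-1).
[route]
layers = -5,-1

# Mix channels across the two halves; the fields this layer takes are not shown in this thread.
[channel_shuffle]
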
@AlexeyAB added the "want enhancement" label on Aug 11, 2019
@WongKinYiu
Collaborator

WongKinYiu commented Aug 15, 2019

@gmayday1997 hello,

I have tried several models with channel_shuffle layers; all of them got NaN after training for 30~80k epochs.
Could you provide the learning rate schedule for training on the ImageNet dataset?
(Models with only channel_split layers are sometimes OK, but their training speed is quite a bit slower than models without channel_split layers.)

thanks a lot.

@gmayday1997
Author

Hi @WongKinYiu, I have trained a model with channel_shuffle and channel_slice for 100k epochs; it got 56% top-5 precision. Training is still in progress.

   layer   filters  size/strd(dil)      input                output
   0 conv     16       3 x 3/ 1    224 x 224 x   3 ->  224 x 224 x  16 0.043 BF
   1 max               2 x 2/ 2    224 x 224 x  16 ->  112 x 112 x  16 0.001 BF
   2 conv     16       1 x 1/ 1    112 x 112 x  16 ->  112 x 112 x  16 0.006 BF
   3 max               2 x 2/ 2    112 x 112 x  16 ->   56 x  56 x  16 0.000 BF
   4 conv     32       1 x 1/ 1     56 x  56 x  16 ->   56 x  56 x  32 0.003 BF
   5 conv     32/  32  3 x 3/ 1     56 x  56 x  32 ->   56 x  56 x  32 0.002 BF
   6 conv     32       1 x 1/ 1     56 x  56 x  32 ->   56 x  56 x  32 0.006 BF
   7 channel_slice             56 x  56 x  32   ->    56 x  56 x  16 
   8 channel_slice             56 x  56 x  32   ->    56 x  56 x  16 
   9 conv     16       1 x 1/ 1     56 x  56 x  16 ->   56 x  56 x  16 0.002 BF
  10 conv     16/  16  3 x 3/ 1     56 x  56 x  16 ->   56 x  56 x  16 0.001 BF
  11 conv     16       1 x 1/ 1     56 x  56 x  16 ->   56 x  56 x  16 0.002 BF
  12 route  7 11
  13 channel_shuffle                56 x  56 x  32   ->    56 x  56 x  32 
  14 channel_slice             56 x  56 x  32   ->    56 x  56 x  16 
  15 channel_slice             56 x  56 x  32   ->    56 x  56 x  16 
  16 conv     16       1 x 1/ 1     56 x  56 x  16 ->   56 x  56 x  16 0.002 BF
  17 conv     16/  16  3 x 3/ 1     56 x  56 x  16 ->   56 x  56 x  16 0.001 BF
  18 conv     16       1 x 1/ 1     56 x  56 x  16 ->   56 x  56 x  16 0.002 BF
  19 route  14 18
  20 channel_shuffle                56 x  56 x  32   ->    56 x  56 x  32 
  21 channel_slice             56 x  56 x  32   ->    56 x  56 x  16 
  22 channel_slice             56 x  56 x  32   ->    56 x  56 x  16 
  23 conv     16       1 x 1/ 1     56 x  56 x  16 ->   56 x  56 x  16 0.002 BF
  24 conv     16/  16  3 x 3/ 1     56 x  56 x  16 ->   56 x  56 x  16 0.001 BF
  25 conv     16       1 x 1/ 1     56 x  56 x  16 ->   56 x  56 x  16 0.002 BF
  26 route  21 25
  27 channel_shuffle                56 x  56 x  32   ->    56 x  56 x  32 
  28 channel_slice             56 x  56 x  32   ->    56 x  56 x  16 
  29 channel_slice             56 x  56 x  32   ->    56 x  56 x  16 
  30 conv     16       1 x 1/ 1     56 x  56 x  16 ->   56 x  56 x  16 0.002 BF
  31 conv     16/  16  3 x 3/ 1     56 x  56 x  16 ->   56 x  56 x  16 0.001 BF
  32 conv     16       1 x 1/ 1     56 x  56 x  16 ->   56 x  56 x  16 0.002 BF
  33 route  28 32
  34 conv     32       1 x 1/ 1     56 x  56 x  32 ->   56 x  56 x  32 0.006 BF
  35 conv     32/  32  3 x 3/ 1     56 x  56 x  32 ->   56 x  56 x  32 0.002 BF
  36 conv     32       1 x 1/ 1     56 x  56 x  32 ->   56 x  56 x  32 0.006 BF
  37 route  36 6
  38 max               2 x 2/ 2     56 x  56 x  64 ->   28 x  28 x  64 0.000 BF
  39 conv     64       1 x 1/ 1     28 x  28 x  64 ->   28 x  28 x  64 0.006 BF
  40 conv     64/  64  3 x 3/ 1     28 x  28 x  64 ->   28 x  28 x  64 0.001 BF
  41 conv     64       1 x 1/ 1     28 x  28 x  64 ->   28 x  28 x  64 0.006 BF
  42 channel_slice             28 x  28 x  64   ->    28 x  28 x  32 
  43 channel_slice             28 x  28 x  64   ->    28 x  28 x  32 
  44 conv     32       1 x 1/ 1     28 x  28 x  32 ->   28 x  28 x  32 0.002 BF
  45 conv     32/  32  3 x 3/ 1     28 x  28 x  32 ->   28 x  28 x  32 0.000 BF
  46 conv     32       1 x 1/ 1     28 x  28 x  32 ->   28 x  28 x  32 0.002 BF
  47 route  42 46
  48 channel_shuffle                28 x  28 x  64   ->    28 x  28 x  64 
  49 channel_slice             28 x  28 x  64   ->    28 x  28 x  32 
  50 channel_slice             28 x  28 x  64   ->    28 x  28 x  32 
  51 conv     32       1 x 1/ 1     28 x  28 x  32 ->   28 x  28 x  32 0.002 BF
  52 conv     32/  32  3 x 3/ 1     28 x  28 x  32 ->   28 x  28 x  32 0.000 BF
  53 conv     32       1 x 1/ 1     28 x  28 x  32 ->   28 x  28 x  32 0.002 BF
  54 route  49 53
  55 channel_shuffle                28 x  28 x  64   ->    28 x  28 x  64 
  56 channel_slice             28 x  28 x  64   ->    28 x  28 x  32 
  57 channel_slice             28 x  28 x  64   ->    28 x  28 x  32 
  58 conv     32       1 x 1/ 1     28 x  28 x  32 ->   28 x  28 x  32 0.002 BF
  59 conv     32/  32  3 x 3/ 1     28 x  28 x  32 ->   28 x  28 x  32 0.000 BF
  60 conv     32       1 x 1/ 1     28 x  28 x  32 ->   28 x  28 x  32 0.002 BF
  61 route  56 60
  62 channel_shuffle                28 x  28 x  64   ->    28 x  28 x  64 
  63 channel_slice             28 x  28 x  64   ->    28 x  28 x  32 
  64 channel_slice             28 x  28 x  64   ->    28 x  28 x  32 
  65 conv     32       1 x 1/ 1     28 x  28 x  32 ->   28 x  28 x  32 0.002 BF
  66 conv     32/  32  3 x 3/ 1     28 x  28 x  32 ->   28 x  28 x  32 0.000 BF
  67 conv     32       1 x 1/ 1     28 x  28 x  32 ->   28 x  28 x  32 0.002 BF
  68 route  63 67
  69 conv     64       1 x 1/ 1     28 x  28 x  64 ->   28 x  28 x  64 0.006 BF
  70 conv     64/  64  3 x 3/ 1     28 x  28 x  64 ->   28 x  28 x  64 0.001 BF
  71 conv     64       1 x 1/ 1     28 x  28 x  64 ->   28 x  28 x  64 0.006 BF
  72 route  71 41
  73 max               2 x 2/ 2     28 x  28 x 128 ->   14 x  14 x 128 0.000 BF
  74 conv    128       1 x 1/ 1     14 x  14 x 128 ->   14 x  14 x 128 0.006 BF
  75 conv    128       3 x 3/ 1     14 x  14 x 128 ->   14 x  14 x 128 0.058 BF
  76 conv    128       1 x 1/ 1     14 x  14 x 128 ->   14 x  14 x 128 0.006 BF
  77 channel_slice             14 x  14 x 128   ->    14 x  14 x  64 
  78 channel_slice             14 x  14 x 128   ->    14 x  14 x  64 
  79 conv     64       1 x 1/ 1     14 x  14 x  64 ->   14 x  14 x  64 0.002 BF
  80 conv     64/  64  3 x 3/ 1     14 x  14 x  64 ->   14 x  14 x  64 0.000 BF
  81 conv     64       1 x 1/ 1     14 x  14 x  64 ->   14 x  14 x  64 0.002 BF
  82 route  77 81
  83 channel_shuffle                14 x  14 x 128   ->    14 x  14 x 128 
  84 channel_slice             14 x  14 x 128   ->    14 x  14 x  64 
  85 channel_slice             14 x  14 x 128   ->    14 x  14 x  64 
  86 conv     64       1 x 1/ 1     14 x  14 x  64 ->   14 x  14 x  64 0.002 BF
  87 conv     64/  64  3 x 3/ 1     14 x  14 x  64 ->   14 x  14 x  64 0.000 BF
  88 conv     64       1 x 1/ 1     14 x  14 x  64 ->   14 x  14 x  64 0.002 BF
  89 route  84 88
  90 channel_shuffle                14 x  14 x 128   ->    14 x  14 x 128 
  91 channel_slice             14 x  14 x 128   ->    14 x  14 x  64 
  92 channel_slice             14 x  14 x 128   ->    14 x  14 x  64 
  93 conv     64       1 x 1/ 1     14 x  14 x  64 ->   14 x  14 x  64 0.002 BF
  94 conv     64/  64  3 x 3/ 1     14 x  14 x  64 ->   14 x  14 x  64 0.000 BF
  95 conv     64       1 x 1/ 1     14 x  14 x  64 ->   14 x  14 x  64 0.002 BF
  96 route  91 95
  97 channel_shuffle                14 x  14 x 128   ->    14 x  14 x 128 
  98 channel_slice             14 x  14 x 128   ->    14 x  14 x  64 
  99 channel_slice             14 x  14 x 128   ->    14 x  14 x  64 
 100 conv     64       1 x 1/ 1     14 x  14 x  64 ->   14 x  14 x  64 0.002 BF
 101 conv     64/  64  3 x 3/ 1     14 x  14 x  64 ->   14 x  14 x  64 0.000 BF
 102 conv     64       1 x 1/ 1     14 x  14 x  64 ->   14 x  14 x  64 0.002 BF
 103 route  98 102
 104 conv    128       1 x 1/ 1     14 x  14 x 128 ->   14 x  14 x 128 0.006 BF
 105 conv    128/ 128  3 x 3/ 1     14 x  14 x 128 ->   14 x  14 x 128 0.000 BF
 106 conv    128       1 x 1/ 1     14 x  14 x 128 ->   14 x  14 x 128 0.006 BF
 107 route  106 76
 108 max               2 x 2/ 2     14 x  14 x 256 ->    7 x   7 x 256 0.000 BF
 109 conv    256       1 x 1/ 1      7 x   7 x 256 ->    7 x   7 x 256 0.006 BF
 110 conv    256/ 256  3 x 3/ 1      7 x   7 x 256 ->    7 x   7 x 256 0.000 BF
 111 conv    256       1 x 1/ 1      7 x   7 x 256 ->    7 x   7 x 256 0.006 BF
 112 channel_slice              7 x   7 x 256   ->     7 x   7 x 128 
 113 channel_slice              7 x   7 x 256   ->     7 x   7 x 128 
 114 conv    128       1 x 1/ 1      7 x   7 x 128 ->    7 x   7 x 128 0.002 BF
 115 conv    128/ 128  3 x 3/ 1      7 x   7 x 128 ->    7 x   7 x 128 0.000 BF
 116 conv    128       1 x 1/ 1      7 x   7 x 128 ->    7 x   7 x 128 0.002 BF
 117 route  112 116
 118 channel_shuffle                 7 x   7 x 256   ->     7 x   7 x 256 
 119 channel_slice              7 x   7 x 256   ->     7 x   7 x 128 
 120 channel_slice              7 x   7 x 256   ->     7 x   7 x 128 
 121 conv    128       1 x 1/ 1      7 x   7 x 128 ->    7 x   7 x 128 0.002 BF
 122 conv    128/ 128  3 x 3/ 1      7 x   7 x 128 ->    7 x   7 x 128 0.000 BF
 123 conv    128       1 x 1/ 1      7 x   7 x 128 ->    7 x   7 x 128 0.002 BF
 124 route  119 123
 125 channel_shuffle                 7 x   7 x 256   ->     7 x   7 x 256 
 126 channel_slice              7 x   7 x 256   ->     7 x   7 x 128 
 127 channel_slice              7 x   7 x 256   ->     7 x   7 x 128 
 128 conv    128       1 x 1/ 1      7 x   7 x 128 ->    7 x   7 x 128 0.002 BF
 129 conv    128/ 128  3 x 3/ 1      7 x   7 x 128 ->    7 x   7 x 128 0.000 BF
 130 conv    128       1 x 1/ 1      7 x   7 x 128 ->    7 x   7 x 128 0.002 BF
 131 route  126 130
 132 channel_shuffle                 7 x   7 x 256   ->     7 x   7 x 256 
 133 channel_slice              7 x   7 x 256   ->     7 x   7 x 128 
 134 channel_slice              7 x   7 x 256   ->     7 x   7 x 128 
 135 conv    128       1 x 1/ 1      7 x   7 x 128 ->    7 x   7 x 128 0.002 BF
 136 conv    128/ 128  3 x 3/ 1      7 x   7 x 128 ->    7 x   7 x 128 0.000 BF
 137 conv    128       1 x 1/ 1      7 x   7 x 128 ->    7 x   7 x 128 0.002 BF
 138 route  133 137
 139 conv    256       1 x 1/ 1      7 x   7 x 256 ->    7 x   7 x 256 0.006 BF
 140 conv    256/ 256  3 x 3/ 1      7 x   7 x 256 ->    7 x   7 x 256 0.000 BF
 141 conv    256       1 x 1/ 1      7 x   7 x 256 ->    7 x   7 x 256 0.006 BF
 142 route  141 110
 143 conv    512       1 x 1/ 1      7 x   7 x 512 ->    7 x   7 x 512 0.026 BF
 144 conv    512/ 512  3 x 3/ 1      7 x   7 x 512 ->    7 x   7 x 512 0.000 BF
 145 conv    512       1 x 1/ 1      7 x   7 x 512 ->    7 x   7 x 512 0.026 BF
 146 conv   1000       1 x 1/ 1      7 x   7 x 512 ->    7 x   7 x1000 0.050 BF
 147 avg                             7 x   7 x1000 ->   1000
 148 softmax                                        1000
 149 cost                                           1000
Total BFLOPS 0.375 
 Allocate additional workspace_size = 1.64 MB

Here are the cfg and weights.
shuffle_imagenet.cfg.txt
shuffle.weights[google] OR [baidupan]

@WongKinYiu
Collaborator

WongKinYiu commented Aug 15, 2019

@gmayday1997 Thank you for sharing the cfg file.

I checked my cfgs; it seems all models with the leaky ReLU activation function fail, while all models with the swish activation function can converge.
I will do more experiments to confirm the reason.

The main differences between your cfg and mine are:

  1. I use the leaky ReLU activation function.
  2. I use warm-up for the first 2000 epochs.
  3. There is no SSE cost layer in my cfgs.
  4. I use the down-sampling module proposed in ShuffleNetV2.

@gmayday1997
Author

@WongKinYiu Yes, you are right. In fact, I tried to implement the proposed down-sampling module, but it seems hard to converge. Do you mind sharing your cfg file?

@WongKinYiu
Collaborator

WongKinYiu commented Aug 15, 2019

@gmayday1997 Here is the cfg file: SNet49.cfg.txt
I implemented the SNet49 of ThunderNet. #3380 (comment)
It gets NaN after training for 80k epochs.

@gmayday1997
Author

Hi @WongKinYiu,
Thank you for sharing.
I found that there is no activation function in some layers (activation=linear). I am not sure, but it may hurt gradient propagation in the absence of a shortcut layer.

[convolutional]
filters=30
groups=30
size=3
stride=1
pad=1
batch_normalize=1
activation=linear

@WongKinYiu
Collaborator

WongKinYiu commented Aug 15, 2019

@gmayday1997 Hello,

The depthwise convolutional layers of ShuffleNetV2 do not have an activation function; only the 1x1 convolutional layers use the ReLU activation function.

@gmayday1997
Author

@WongKinYiu You are right, there is indeed no activation function in the depthwise convolution module. Thank you for pointing this out.
Can you share the top-5 precision before the model crashed?

@WongKinYiu
Collaborator

WongKinYiu commented Aug 15, 2019

@gmayday1997 I am sorry, but I cannot provide that information.
I deleted all of the weight files after it got NaN,
and the loss never went down during training.

@gmayday1997
Author

@WongKinYiu So the model never learned anything during training.
What is your opinion on whether a route layer is fully equivalent to the split layer used in the dw module?
I really doubt that; maybe I missed some details.

@WongKinYiu
Collaborator

WongKinYiu commented Aug 15, 2019

@gmayday1997
Yes, I previously used an equivalent architecture composed of route layers instead of channel split layers, and it works fine. (Also, the training speed is quite a bit faster than for the model using channel split layers.)

module using route:
[conv * 16]
[route -2]
[conv * 16]

module using channel split:
[conv * 32]
[channel split from -1, 0~16]
[channel split from -2, 16~32]

Maybe I will check the code of the channel split and channel shuffle layers after my busy weeks.
Or, if you update the code in the coming days, I can train it and check whether the performance is normal.
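
A minimal cfg sketch of that route-based pattern, with illustrative filter counts and activation (WongKinYiu's actual cfg is not shared in this thread): instead of producing 32 channels with one conv and then splitting them, each 16-channel branch gets its own 1x1 conv applied to the same shared input.

# Branch A: 16 channels computed directly from the shared input.
[convolutional]
batch_normalize=1
filters=16
size=1
stride=1
pad=0
activation=leaky

# Jump back to the shared input feature map.
[route]
layers=-2

# Branch B: another 16 channels from the same input (the rest of the unit would follow here).
[convolutional]
batch_normalize=1
filters=16
size=1
stride=1
pad=0
activation=leaky

# Concatenate branch A (-3) and branch B (-1): 32 channels total.
[route]
layers=-3,-1

Two independent 16-filter convs on the same input can express the same functions as one 32-filter conv whose output is split in half, which is why the two forms are treated as roughly equivalent in this discussion.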

@gmayday1997
Author

gmayday1997 commented Aug 15, 2019

@WongKinYiu Sorry for the late reply.
I meant: is the split layer (in Caffe) equivalent to route?
If you find any errors in the channel_shuffle or channel_slice code, please don't hesitate to tell me. Thanks in advance!

@WongKinYiu
Collaborator

I do not use Caffe, but the behavior of split and route is almost the same.
OK.

@dexception

(quoting @gmayday1997's earlier comment: 100k epochs of training with channel_shuffle and channel_slice, 56% top-5 precision, with the full architecture printout and the shuffle_imagenet.cfg.txt / shuffle.weights links)

I am assuming the weights are trained on ImageNet.
The original ShuffleNetV2 model scores about:

69.06% Top 1 Accuracy
88.77% Top 5 Accuracy

@WongKinYiu
Collaborator

@gmayday1997 I have finished training your provided cfg and some alternatives.

  1. shufflenet with swish activation function
    shuffle_swish.cfg.txt
    it got 31.8% top-1 acc and 55.6% top-5 acc.
  2. shufflenet with swish activation function + warm up learning rate scheduler
    shuffle_swish_warmup.txt
    it got 33.4% top-1 acc and 57.4% top-5 acc.
  3. shufflenet with leaky relu activation function
    shuffle_leaky.cfg.txt
    it got 29.0% top-1 acc and 52.1% top-5 acc.
  4. shufflenet with leaky relu activation function + warm up learning rate scheduler
    shuffle_leaky_warmup.txt
    it got 31.5% top-1 acc and 55.1% top-5 acc.

@gmayday1997
Author

@WongKinYiu Wow, thank you for sharing such valuable experimental comparisons. It seems that warm-up learning gives the major improvement.
Since I have only one GPU, I will post some results after my other projects are finished. Thank you again.

@AlexeyAB
Owner

@WongKinYiu Will you try to train a detector with a ShuffleNet backbone?

@WongKinYiu
Collaborator

@AlexeyAB Hello,
Currently, models with channel_split or channel_shuffle layers take a very long time to train;
on my machine they take almost five times as long as models without channel_split or channel_shuffle layers.
It may take 40 days to train an ImageNet pre-trained model.

So I won't train a detector with a ShuffleNetV2 backbone now,
but if the training speed issue can be solved, I'd like to do it.

Thanks.

@beHappy666

@gmayday1997 Hi, I have a question about the slice layer: is it the same as the Slice layer in Caffe, which is used to cut channels?

@gmayday1997
Author

@beHappy666 Yes. Since darknet doesn't support multiple outputs per layer, we need to pay attention to propagating the gradients correctly.

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=0
activation=swish

[channel_slice]
from=-1
axis=1
start=0
end=16

[channel_slice]
from=-2
axis=1
start=16
end=32

We use "from" to indicate which feature blob is sliced, and "start"/"end" to set the slice points.
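
As a side note, in darknet builds whose [route] layer supports the groups/group_id fields (an assumption about the reader's darknet version; these fields are not used anywhere in this thread), a similar two-way split can be written without a dedicated channel_slice layer; each route then picks one half of the channels of the referenced layer:

# Second half of the channels of the previous layer.
[route]
layers=-1
groups=2
group_id=1

# First half of the channels of the same layer (the previous route is now -1, so the source is -2).
[route]
layers=-2
groups=2
group_id=0

Whether this behaves identically to darknet.CG's channel_slice, including the backward pass, would still need to be verified.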

@LukeAI

LukeAI commented Aug 20, 2019

(quoting @WongKinYiu's results above for shuffle_swish, shuffle_swish_warmup, shuffle_leaky, and shuffle_leaky_warmup)

What is the inference time on these guys?

@WongKinYiu
Collaborator

@LukeAI
You can just download and examine them;
the inference time will differ from machine to machine.

@beHappy666

@gmayday1997 OK, thank you.

@pcorner

pcorner commented Aug 21, 2019 via email

@WongKinYiu
Collaborator

@pcorner Do you mean https://youtu.be/sY4tLRI6pYc ?

@pcorner

pcorner commented Aug 21, 2019 via email

@jamessmith90

(quoting @gmayday1997's earlier comment: 100k epochs of training with channel_shuffle and channel_slice, 56% top-5 precision, with the full architecture printout and the shuffle_imagenet.cfg.txt / shuffle.weights links)

Can you update this information?

@dexception

@AlexeyAB
Will this be merged?

@AlexeyAB
Owner

@dexception @gmayday1997 @WongKinYiu

We just need to understand whether it is necessary.

What Top-1/Top-5 accuracy or mAP can be achieved with ShuffleNet?

@WongKinYiu
Collaborator

WongKinYiu commented Sep 20, 2019

@AlexeyAB

framework    type              top-1 acc.
Darknet.CG   split+shuffle     60.4%
Darknet      equivalent split  69.2%
Pytorch      split+shuffle     69.54%
Pytorch      equivalent split  69.48%

So I think there are some problems in Darknet.CG,
but I have not found the problems in the C code of the channel split and channel shuffle layers in Darknet.CG.

Update:
channel split works fine (but slowly);
maybe the problem is in the channel shuffle layer.

@AlexeyAB
Owner

@WongKinYiu Yes, it seems something is wrong with Darknet.CG.

Can you provide the model (cfg + weights) for 'Darknet | equivalent split | 69.2%' there? #3874
Since there is only 52% Top-1:

Model BFLOPs Inference Time (ms) Top1, % URL
shufflenetv2 and weights 0.375 32 52% URL

@WongKinYiu
Collaborator

I cannot share the cfg file currently.
The BFLOPs of the model are about 0.8, so 69% top-1 accuracy is normal.

@AlexeyAB
Owner

@WongKinYiu
Collaborator

Yes, it is different.
I do not use depth-wise convolutional layers.
The model is modified from the paper at https://github.com/WongKinYiu/PartialResidualNetworks.

@AlexeyAB
Owner

@WongKinYiu

Do you have a plan to measure inference time or FPS on CPU and GPU, in addition to the BFLOPs, for all of these https://github.com/WongKinYiu/PartialResidualNetworks models?

@WongKinYiu
Collaborator

@AlexeyAB OK, I have updated the FPS information.

@dexception

@WongKinYiu

Have you integrated YOLO with ShuffleNetV2?
If yes, can you share what FPS you are getting with YOLO-ShuffleNetV2?

@WongKinYiu
Collaborator

@dexception

No, I haven't.
If you could help make the accuracy of ShuffleNetV2 on ImageNet normal,
I would like to integrate it.

@dexception

We haven't looked at operator fusion and are still dependent on TVM or TensorRT for that. It wouldn't be a bad idea to look into it; the benefits would apply to almost everything we are doing.

@AlexeyAB
Is it possible to implement this in this repo?
Is it too much of an effort?

@AlexeyAB
Owner

@dexception
I don't know; should we do this?

If the model 'Darknet | equivalent split | 69.2%' (which already works with this repo) gives us the same accuracy and the same speed, then why should we implement channel_split + channel_shuffle for ShuffleNet?

@WongKinYiu
Can you provide FPS on GPU and CPU for these models?
#3750 (comment)

@WongKinYiu
Collaborator

@AlexeyAB

GPU: ~120 FPS; CPU: ~7 FPS,
with the PRN head applied and an input size of 416x416.

@AlexeyAB
Owner

@WongKinYiu Thanks.

But what GPU/CPU FPS do the other models get?
Is 'Darknet | equivalent split | 69.2%' faster or slower than the other models from this table?

@WongKinYiu
Collaborator

@AlexeyAB
Yes, it is the fastest model in the table.
GPU: GTX 1080 Ti, CPU: i7-6700.

I am at ICIP now, so I cannot provide the exact FPS of those models immediately.

@AlexeyAB
Owner

@dexception
@WongKinYiu In this case, we should not implement the channel_split + channel_shuffle layers. We will just wait until you put this model in open access, since, as I understand it, it already works in this repository without any changes to the source code.

@dexception

@AlexeyAB
At a minimum we should be able to get a 1.5x increase in FPS.

https://github.com/NVIDIA/TensorRT is now open source, so it wouldn't be a bad idea.

@deimsdeutsch

@WongKinYiu
Can you share the cfg file? I would like to try it on my dataset.

@WongKinYiu
Collaborator

@deimsdeutsch

Sorry, I cannot share the cfg file for #3750 (comment).

For the cfg of ShuffleNetV2, you can check #3750 (comment).

But currently, I do not suggest you train these models.
The channel split and channel shuffle layers seem to have some problems. #3750 (comment)

Maybe MobileNetV2 is more stable on darknet right now:
https://github.com/WePCf/darknet-mobilenet-v2 for your reference.

@deimsdeutsch

@WongKinYiu
MobileNetV2 seems to suffer from the same issue with the grouped convolution implementation.

@WongKinYiu
Collaborator

WongKinYiu commented Sep 26, 2019

@dexception Yes.

For general GPUs, ResNet-18 is a good choice;
it can run at >140 FPS on a GPU and gets 28.1 AP@.5:.95 on COCO using CenterNet.
https://github.com/xingyizhou/CenterNet

I am also training ResNet-18-based models now.

@deimsdeutsch

Since you mentioned ResNet-18: NVIDIA is using ResNet-10 with DeepStream 4.
On a Tesla T4 they can manage 35-68 streams, all of them 1080p, running at 30 FPS.

Here is the model they are using:
https://ngc.nvidia.com/catalog/models/nvidia:tlt_iva_object_detection_resnet10

@WongKinYiu
Collaborator

Thank you for sharing the information.

@dexception

@deimsdeutsch
Nice share. I just ran my demo, and this is where all the magic is happening.
Custom plugins for TensorRT are where you should all dig in and stay up all night.

@spaul13

spaul13 commented Jan 31, 2020

@AlexeyAB @gmayday1997 @WongKinYiu @dexception @beHappy666 While using shuffle_swish.cfg (provided by @WongKinYiu) as my configuration file, I am getting the following error.

setting up CUDA Devices compute_capability = 610, cudnn_half = 0
layer filters size/strd(dil) input output
0 conv 16 3 x 3/ 1 224 x 224 x 3 -> 224 x 224 x 16 0.043 BF
1 max 2x 2/ 2 224 x 224 x 16 -> 112 x 112 x 16 0.001 BF
2 conv 16 1 x 1/ 1 112 x 112 x 16 -> 112 x 112 x 16 0.006 BF
3 max 2x 2/ 2 112 x 112 x 16 -> 56 x 56 x 16 0.000 BF
4 conv 32 1 x 1/ 1 56 x 56 x 16 -> 56 x 56 x 32 0.003 BF
5 conv 32/ 32 3 x 3/ 1 56 x 56 x 32 -> 56 x 56 x 32 0.002 BF
6 conv 32 1 x 1/ 1 56 x 56 x 32 -> 56 x 56 x 32 0.006 BF
7 Type not recognized: [channel_slice]
Unused field: 'from = -1'
Unused field: 'axis = 1'
Unused field: 'start = 0'
Unused field: 'end = 16'
8 Type not recognized: [channel_slice]
Unused field: 'from = -2'
Unused field: 'axis = 1'
Unused field: 'start = 16'
Unused field: 'end = 32'
9 Layer before convolutional layer must output image.: No error
Assertion failed: 0, file c:\yolo\darknet\src\utils.c, line 293

Can anyone please tell me how to get rid of this error?
shuffle_swish.cfg.txt

@WongKinYiu
Collaborator

Hello, this cfg seems to be for https://github.com/gmayday1997/darknet.CG, not for this repository.

@spaul13

spaul13 commented Jan 31, 2020

@WongKinYiu, @AlexeyAB is there a shuffle_swish.cfg file for this darknet repo?

@AlexeyAB
Owner

@spaul13 No, since I have not seen evidence that this is better than SOTA models.
