shufflenetV2: an extremely light-weight architecture | Implementation #3750
Comments
@gmayday1997 Hello, I've tried several models with channel_shuffle layers, and all of them got NaN after training for 30~80k epochs. Thanks a lot. |
hi, @WongKinYiu , I have trained model with channel_shuffle and channel_slice for 100k epochs, it got 56% top5 precision. Training is still in progress.
Here are the cfg and weights. |
@gmayday1997 Thank you for sharing the cfg file. I checked my cfgs, and it seems all models with the lrelu activation function failed. The main differences between your cfg and mine are: |
@WongKinYiu Yes, you are right. In fact, I tried to implement the proposed downsampling module, but it seems hard to get it to converge. Would you mind sharing your cfg file? |
@gmayday1997 here is the cfg file. SNet49.cfg.txt |
Hi @WongKinYiu
|
@gmayday1997 Hello, depthwise convolutional layers of shufflenetv2 do not have activation function. |
@WongKinYiu There is indeed no activation function in the depthwise convolution module. Thank you for pointing this out. |
@gmayday1997 I'm sorry, but I cannot provide that information. |
@WongKinYiu So the model never learns anything from training. |
@gmayday1997 module using route: module using channel split: maybe i will check the code of channel split layer and channel shuffle layer after my busy weeks. |
@WongKinYiu sorry for late reply. |
i do not use caffe, but the behavior of split and route are almost same. |
I am assuming the weights are trained on imagenet. 69.06% Top 1 Accuracy |
@gmayday1997 I have finished training of ur provided cfg and some alternatives.
|
@WongKinYiu Wow, thank you for sharing such valuable experiment comparisons. It seems that learning rate warm-up gives a clear improvement (about 1.6 to 2.5 points of top-1 accuracy in your runs). |
@WongKinYiu Will you try to train Detector with ShuffleNet backbone? |
@AlexeyAB Hello, So i won't train a detector with shufflenetv2 backbone now. Thanks. |
@gmayday1997 Hi, I have a question about the slice layer: is it the same as the slice layer in Caffe, which is used to cut the channels? |
@beHappy666 Yes. Since darknet doesn't support multi-outputs, we need to pay attention to propagating gradients correctly.
We use "from" to indicate which feature blob is sliced, and "start" and "end" to set the slice points. |
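A rough sketch of the slice semantics described above, in plain Python rather than darknet's C. The `layer_outputs` / `layer_deltas` dictionaries, the function names, and the absolute layer index passed as `from_idx` are assumptions made for illustration; this is not the darknet.CG code. The backward pass accumulates (`+=`) into the source layer's gradient, since the same blob is read by more than one slice.

```python
def slice_forward(layer_outputs, from_idx, start, end):
    """Copy channels [start, end) of layer `from_idx` (hypothetical layout:
    a list of channels, each channel a flat list of floats)."""
    source = layer_outputs[from_idx]
    if not 0 <= start < end <= len(source):
        raise ValueError("need 0 <= start < end <= number of channels")
    return [list(channel) for channel in source[start:end]]


def slice_backward(layer_deltas, from_idx, start, end, grad):
    """Add the slice's gradient back into channels [start, end) of the source."""
    source_delta = layer_deltas[from_idx]
    for channel_idx, channel_grad in zip(range(start, end), grad):
        for i, g in enumerate(channel_grad):
            source_delta[channel_idx][i] += g  # accumulate, never overwrite


# Split a 4-channel blob into two halves, as a ShuffleNetV2 unit would.
outputs = {0: [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]}
left = slice_forward(outputs, 0, 0, 2)
right = slice_forward(outputs, 0, 2, 4)

# Gradients from both branches land in disjoint channel ranges of the source.
deltas = {0: [[0.0, 0.0] for _ in range(4)]}
slice_backward(deltas, 0, 0, 2, [[1.0, 1.0], [1.0, 1.0]])
slice_backward(deltas, 0, 2, 4, [[2.0, 2.0], [2.0, 2.0]])
```

If two slices overlapped, or another layer also wrote into the same source delta, accumulation would still give the summed gradient, which is why the sketch never assigns with `=`.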
What is the inference time on these guys? |
@LukeAI |
@gmayday1997 Ok,thank you. |
Dear Yolo friends.
I'm looking for advice on training for a job where I'll need to locate crops (one type), like tomatoes sitting on the plant. I'll easily be able to collect a training set and a test set with all the required characteristics to ensure solid variance in all dimensions, but I'm somewhat in doubt about what the best settings and workflow would be.
By the way, I have an NVIDIA CUDA backbone in place.
Kind regards
@gmayday1997 I have finished training of ur provided cfg and some alternatives.
1. shufflenet with swish activation function
[shuffle_swish.cfg.txt](https://github.com/AlexeyAB/darknet/files/3511851/shuffle_swish.cfg.txt)
it got 31.8% top-1 acc and 55.6% top-5 acc.
2. shufflenet with swish activation function + warm up learning rate scheduler
[shuffle_swish_warmup.txt](https://github.com/AlexeyAB/darknet/files/3511852/shuffle_swish_warmup.txt)
it got 33.4% top-1 acc and 57.4% top-5 acc.
3. shufflenet with leaky relu activation function
[shuffle_leaky.cfg.txt](https://github.com/AlexeyAB/darknet/files/3511854/shuffle_leaky.cfg.txt)
it got 29.0% top-1 acc and 52.1% top-5 acc.
4. shufflenet with leaky relu activation function + warm up learning rate scheduler
[shuffle_leaky_warmup.txt](https://github.com/AlexeyAB/darknet/files/3511856/shuffle_leaky_warmup.txt)
it got 31.5% top-1 acc and 55.1% top-5 acc.
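
The attached cfg files are not reproduced in this thread, so the exact warm-up settings behind these numbers are unknown. As a minimal sketch of what a warm-up scheduler does, assuming a polynomial ramp (darknet's `burn_in` option ramps the rate in a similar way) and hypothetical values for `base_lr`, `burn_in`, and `power`:

```python
def warmup_lr(base_lr, iteration, burn_in, power=4.0):
    """Learning rate with a polynomial warm-up over the first `burn_in` iterations.

    All parameter values used below are illustrative, not taken from the cfgs.
    """
    if burn_in > 0 and iteration < burn_in:
        # Ramp from 0 up to base_lr; a larger `power` keeps the rate low for longer.
        return base_lr * (iteration / burn_in) ** power
    return base_lr


# Rate at a few points of a hypothetical 1000-iteration warm-up.
schedule = [warmup_lr(0.1, it, burn_in=1000, power=1.0) for it in (0, 500, 1000, 5000)]
```

Starting from a small rate is one common way to avoid early divergence, which may also be relevant to the NaN issues reported earlier in the thread.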
|
@pcorner Do you mean https://youtu.be/sY4tLRI6pYc ? |
Yes, similar to this. I'm just looking for 2D detection, as I have other means to achieve the 3D location in post-processing.
|
Can you update this information. |
@AlexeyAB |
@dexception @gmayday1997 @WongKinYiu We should first understand whether it is necessary. What Top-1/Top-5 or mAP can be achieved with ShuffleNet? |
so i think there are some problems in Darknet.CG update |
@WongKinYiu Yes, it seems something is wrong with Darknet.CG. Can you provide the model (cfg + weights) for
|
i can not share the cfg file currently. |
@WongKinYiu Is it different than this file https://github.com/AlexeyAB/darknet/files/3619916/shuffle_imagenet.cfg.txt ? |
Yes, it is different. |
Do you plan to measure inference time or FPS on CPU and GPU, in addition to the BFLOPs, for all these https://github.com/WongKinYiu/PartialResidualNetworks models? |
@AlexeyAB OK, i have updated fps information. |
Have you integrated Yolo with Shufflenetv2 ? |
No, I haven't. |
We haven't looked at operator fusion and are still dependent on TVM or TensorRT for that. It wouldn't be a bad idea to look into it. The benefits would apply to almost everything we are doing. @AlexeyAB |
@dexception If the model @WongKinYiu |
GPU: ~120 FPS; CPU: ~7 FPS |
@WongKinYiu Thanks. But what are the GPU/CPU FPS for the other models? |
@AlexeyAB I am at ICIP now, so I cannot provide the exact FPS of those models immediately. |
@dexception |
@AlexeyAB https://github.com/NVIDIA/TensorRT is now open source, so it wouldn't be a bad idea. |
@WongKinYiu |
Sorry, I cannot share the cfg file for #3750 (comment). For the cfg of shufflenetv2, you can check #3750 (comment). But currently, I do not suggest you train these models. Maybe mobilenetv2 is more stable on darknet now. |
@WongKinYiu |
@dexception Yes. For general GPUs, resnet18 is a good choice. I'm also training resnet18-based models now. |
Since you mentioned Resnet18. Nvidia is using Resnet10 with deepstream4. Here is the model they are using: |
Thank you for sharing the information |
@deimsdeutsch |
@AlexeyAB @gmayday1997 @WongKinYiu @dexception @beHappy666 While using shuffle_swish.cfg (provided by @WongKinYiu) as my configuration file, I am getting the following error. setting up CUDA Devices compute_capability = 610, cudnn_half = 0 Can anyone please tell me how to get rid of this error? |
Hello, this cfg seems to be for https://github.com/gmayday1997/darknet.CG. |
@WongKinYiu, @AlexeyAB Is there any shufflenet_swish.cfg file for this darknet repo? |
@spaul13 No, since I have not seen evidence that this is better than SOTA models. |
shufflenetV2: Practical Guidelines for Efficient CNN Architecture Design
paper: https://arxiv.org/abs/1807.11164
source code(caffe): https://github.com/miaow1988/ShuffleNet_V2_pytorch_caffe
I have implemented channel shuffle layer and channel_slice layer. Everyone who is interested in this work can try it.
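For readers unfamiliar with the operation, here is a small sketch of channel shuffle as it is usually defined for ShuffleNet: view the channels as `groups` rows, transpose, and flatten. It is written in plain Python over a list of channels purely for illustration; the function name and data layout are assumptions and do not mirror the layer code in this repository.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups.

    `channels` is a list with one entry per channel (any per-channel payload).
    Equivalent to reshaping to (groups, C // groups), transposing, and flattening.
    """
    num_channels = len(channels)
    if groups <= 0 or num_channels % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = num_channels // groups
    shuffled = []
    for i in range(per_group):          # position inside each group
        for g in range(groups):         # take that position from every group in turn
            shuffled.append(channels[g * per_group + i])
    return shuffled


# Six channels in two groups: [0, 1, 2 | 3, 4, 5] become interleaved.
example = channel_shuffle(list(range(6)), groups=2)
```

After the shuffle, each group of consecutive channels contains channels from every original group, which is what lets information flow between the two branches of a ShuffleNetV2 unit.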
How to use