Not getting a good accuracy #4

leochli · 2017-07-24T01:47:22Z

I ran your deploy prototxt on ImageNet this weekend but still got bad accuracy (with exactly the same prototxt).

I'd appreciate it if you could share your solver file so I can check.

Many thanks!

farmingyard · 2017-07-24T02:18:02Z

Here is an example. The batch size is 64; you can try it!

net: "train_val.prototxt"
#test_initialization: false
#test_iter: 100
#test_interval: 5000
display: 40
average_loss: 40
base_lr: 0.01
lr_policy: "poly"
power: 1.0
max_iter: 1000000
momentum: 0.9
weight_decay: 0.0001
snapshot: 5000
snapshot_prefix: "shufflenet"
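For readers unfamiliar with the "poly" policy used above: Caffe computes the rate as base_lr * (1 - iter/max_iter)^power, so with power 1.0 it decays linearly to zero at max_iter. A minimal Python sketch of that formula, using the values from the solver above as defaults (the function name `poly_lr` is made up for illustration):

```python
def poly_lr(iteration, base_lr=0.01, max_iter=1_000_000, power=1.0):
    """Learning rate under Caffe's "poly" policy.

    Defaults mirror the solver posted above; poly_lr itself is a
    hypothetical helper, not part of Caffe.
    """
    return base_lr * (1.0 - iteration / max_iter) ** power


if __name__ == "__main__":
    # Print the rate at a few points of the 1,000,000-iteration run.
    for it in (0, 250_000, 500_000, 750_000, 1_000_000):
        print(it, poly_lr(it))
```

With power 1.0, the rate at the halfway point is half of base_lr, which makes it easy to check a training log against the expected schedule.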

leochli · 2017-07-24T02:28:02Z

@farmingyard
Thanks, man! By the way, what accuracy did you get with this? I only got 54% top-1 and 79% top-5 accuracy. According to the paper, the error rate should be only around 34.1%.

I tested on two GPUs, which might cause problems if the ShuffleChannel layer doesn't support multi-GPU training. I'm not sure, though. I'll try your solver and see.

Thanks a lot!

farmingyard · 2017-07-24T03:57:33Z

@LeoLee96

I got 62.8% top-1 accuracy and 84.7% top-5 accuracy. The result is not as good as the paper's; it still needs tuning...

KeyKy · 2017-07-24T08:01:26Z

mark

zimenglan-sysu-512 · 2017-08-01T13:09:15Z

Hi @farmingyard, I'm wondering how you wrote the prototxt. Did you generate it with code? If so, could you share it? Thanks.

farmingyard · 2017-08-02T04:53:00Z

@zimenglan-sysu-512
You can find it here: https://github.com/farmingyard/Caffe-Net-Generator

leochli · 2017-08-11T09:40:14Z

Hi @farmingyard ,

Did you eventually reach the paper's 65.9% top-1 accuracy?

I trained with:
batch size 256,
100 epochs in total,
base lr 0.1,
learning rate multiplied by 0.1 every 30 epochs.

Yet I only got around 64% accuracy at the end.

I'd appreciate it if you could share some tricks from your training process.

Thanks a lot!
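Caffe solvers count iterations rather than epochs, so a recipe stated in epochs (batch size 256, 100 epochs, step every 30 epochs) has to be converted. A rough Python sketch of that conversion, assuming the ILSVRC-2012 training set of 1,281,167 images; the helper name `step_schedule` is hypothetical, while the returned keys are standard Caffe solver fields:

```python
import math

IMAGENET_TRAIN_IMAGES = 1_281_167  # ILSVRC-2012 training set size (assumed dataset)


def step_schedule(batch_size=256, epochs=100, epochs_per_step=30,
                  base_lr=0.1, gamma=0.1):
    """Convert an epoch-based step schedule into Caffe solver fields.

    batch_size is the total batch size across all GPUs.
    """
    iters_per_epoch = math.ceil(IMAGENET_TRAIN_IMAGES / batch_size)
    return {
        "base_lr": base_lr,
        "lr_policy": "step",
        "gamma": gamma,
        "stepsize": epochs_per_step * iters_per_epoch,
        "max_iter": epochs * iters_per_epoch,
    }


if __name__ == "__main__":
    for key, value in step_schedule().items():
        print(f"{key}: {value}")
```

Note that the iteration counts depend on the total batch size, so the same epoch-based recipe yields different `stepsize` and `max_iter` values on a different number of GPUs.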

farmingyard · 2017-08-12T02:49:21Z

@LeoLee96
Your model is better than mine. I didn't continue training, so my result is still the same as above.

7oud · 2017-08-24T07:07:14Z

Hi @farmingyard, @LeoLee96,
I trained ShuffleNet on our own data but got worse results than AlexNet.
I'd appreciate it if you could share your training loss curve.
Thanks!

zhangleiedu · 2017-09-05T12:45:01Z

Hi @LeoLee96, could you share your pre-trained model?
Thanks.

xiaomr · 2017-09-06T02:22:05Z

Hi @LeoLee96, you said that training ShuffleNet on two GPUs might cause problems because the ShuffleChannel layer may not support multi-GPU. How did you solve it? I got the error "Multi-GPU execution not available - rebuild with USE_NCCL". Could you give me some advice?

leochli · 2017-09-06T02:58:33Z

@xiaomr
Hi, I'm not sure, though. The depthwise conv layer is not designed for every multi-GPU setup, so if you have your own multi-GPU system, you may need to modify these layers to fit it. I didn't get this USE_NCCL error, even before the modification. Anyway, try running ShuffleNet on a single GPU first.

xiaomr · 2017-09-07T09:05:03Z

Thank you for your advice! I have fixed the problem. It seems that the depthwise layer can support multi-GPU; the problem was that I had chosen the wrong branch of Caffe.

adapt-image-models · 2017-09-09T03:26:10Z

Hi @LeoLee96, did you eventually reach 65.9% val accuracy?
I trained for 90 epochs with a batch_size of 256 on 4 GPUs, base_lr=0.1 divided by 10 every 30 epochs, and wd=4e-5, but I only get 63.3% val accuracy. Can you give me some advice?

anlongstory · 2017-09-25T10:23:35Z

@LeoLee96 Hi, I am new to deep learning. I want to use Caffe to train ShuffleNet on my own data, but with only a single .prototxt file I have no idea how to start. Could you give me some direction or advice?

ppwwyyxx · 2017-10-10T01:33:50Z

I can reproduce the paper's accuracy for a 40 MFLOPs ShuffleNet with TensorFlow (https://github.com/tensorpack/tensorpack/tree/master/examples/ImageNetModels#shufflenet). You can use the configuration there as a reference.

andeyeluguo · 2017-10-16T02:05:09Z

I only get 43% val accuracy at iteration 400000. I used your solver.prototxt and converted the deploy.prototxt into a train_val.prototxt. Was the training not long enough, or is the data preprocessing wrong?
Mine is:
transform_param {
mirror: true
crop_size: 224
scale: 0.017
mean_value: [103.94,116.78,123.68]
}
Should I change the preprocessing to:
transform_param {
mirror: false
crop_size: 224
mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
}
or anything else?
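The two transform_param blocks above are not interchangeable. Caffe's data transformer subtracts the mean first and then multiplies by scale (which defaults to 1), so the first block feeds the network values roughly 60 times smaller than the second. A small Python sketch of the per-pixel arithmetic, assuming BGR channel order; `caffe_transform` is a made-up name for illustration:

```python
def caffe_transform(pixel_bgr, mean_bgr=(103.94, 116.78, 123.68), scale=0.017):
    """Apply Caffe-style preprocessing to one BGR pixel: (value - mean) * scale."""
    return tuple((p - m) * scale for p, m in zip(pixel_bgr, mean_bgr))


if __name__ == "__main__":
    white = (255.0, 255.0, 255.0)
    # First block: per-channel mean_value with scale 0.017.
    print(caffe_transform(white))
    # Second block has no scale, so Caffe uses 1.0. The per-channel means stand
    # in for the mean_file here, which is only an approximation.
    print(caffe_transform(white, scale=1.0))
```

Whichever variant is chosen, the same mean and scale need to be used at training and at test time, otherwise the accuracy numbers are not comparable.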

wang5566 · 2018-09-17T06:05:36Z

@VectorYYYY
Do you mean a batch size of 256 per GPU, or a total batch size of 256 across 4 GPUs?

ppwwyyxx · 2018-09-17T06:10:43Z

According to the paper, the batch size is 256 on each GPU, making a total batch size of 1024. Other settings, such as the learning rate schedule, are also clear, so I don't know why people would invent their own settings if the goal is to reproduce the result.

wang5566 · 2018-09-17T06:51:04Z

A 1080 Ti can only fit a batch size of 64, and I used 4 GPUs for training. But I found that the loss stalls around 2.1 and the model's top-1 accuracy is around 53%.

ppwwyyxx · 2018-09-17T07:01:56Z

According to https://arxiv.org/abs/1706.02677, you can use 1/4 of the learning rate together with 1/4 of the batch size and train for 4x more steps to get roughly the same results.

Besides that, my implementation can actually train a ShuffleNet 1x with batch size 128 on a 1080 Ti, and a ShuffleNet 0.5x with batch size 256.
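The scaling suggestion above can be written as a small helper: scale the learning rate by the same factor as the total batch size, and scale the number of steps by the inverse so the number of epochs stays the same. This is only a sketch; `scale_recipe` is a hypothetical name, and the reference learning rate and iteration count in the example are placeholders, not values taken from the paper.

```python
def scale_recipe(ref_lr, ref_batch, ref_iters, new_batch):
    """Rescale a training recipe to a different total batch size.

    Learning rate scales linearly with batch size; iterations scale
    inversely, so the number of images seen is unchanged.
    """
    k = new_batch / ref_batch
    return {
        "batch_size": new_batch,
        "base_lr": ref_lr * k,
        "max_iter": round(ref_iters / k),
    }


if __name__ == "__main__":
    # Total batch 1024 (256 per GPU on 4 GPUs) reduced to 256 (64 per GPU).
    # ref_lr and ref_iters are placeholder values for illustration only.
    print(scale_recipe(ref_lr=0.4, ref_batch=1024, ref_iters=100_000, new_batch=256))
```

Any milestones in the learning rate schedule (step sizes, decay end point) should be stretched by the same factor as `max_iter`.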
