Would PENCIL decrease the label accuracy? #4
Hi, I ran into the same problem as you: my experimental results were far from the paper's. Did you ever solve it?
Here is my guess. The total loss is weighted differently in the code than in the paper. The paper divides by the class number, which increases the relative weight of Lo, so yd stays closer to the original noisy label. In the code, the class number is not used in the total loss. Any feedback on this guess is welcome.
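To make the guess concrete, here is a hedged sketch of the PENCIL compound loss as described in the paper: a KL-divergence compatibility term Lc, a term Lo anchoring the learnable label distribution yd to the noisy labels, and an entropy regularizer Le. The 1/c factor on Lc and Le (c = class number) is exactly what raises the relative weight of Lo; the alpha/beta values below are illustrative, not the paper's tuned settings.

```python
import torch
import torch.nn.functional as F

def pencil_loss(logits, y_tilde_logits, noisy_onehot,
                alpha=0.1, beta=0.4, num_classes=10):
    """Sketch of the PENCIL total loss (assumed form, per the paper).

    logits:         network outputs f(x), shape (B, C)
    y_tilde_logits: learnable per-sample label variables, shape (B, C)
    noisy_onehot:   original (possibly noisy) labels as one-hot, shape (B, C)
    """
    yd = F.softmax(y_tilde_logits, dim=1)       # corrected label distribution
    pred = F.softmax(logits, dim=1)
    log_pred = F.log_softmax(logits, dim=1)

    # Lc: KL(f(x) || yd) -- compatibility between prediction and yd.
    lc = F.kl_div(F.log_softmax(y_tilde_logits, dim=1), pred,
                  reduction='batchmean')
    # Lo: cross entropy between the noisy labels and yd -- keeps yd from
    # drifting arbitrarily far from the given labels.
    lo = -(noisy_onehot * torch.log(yd.clamp_min(1e-8))).sum(dim=1).mean()
    # Le: entropy of the prediction -- pushes it away from the uniform output.
    le = -(pred * log_pred).sum(dim=1).mean()

    # Dividing Lc and Le by the class number c, as in the paper, effectively
    # increases the relative weight of Lo; dropping the 1/c (as the guess
    # suggests the code does) weakens Lo's pull toward the noisy labels.
    return lc / num_classes + alpha * lo + beta * le / num_classes
```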
I ran the PENCIL model on the CIFAR-10 dataset, with "--arch" set to resnet-18 as the backbone network. While running I hit an error at File "PENCIL-master/PENCIL.py", line 355, in train: output size is (128, 1000) but last_y_var size is (128, 10). So, is this code missing a classifier head? How did you deal with this problem?
I recommend reading the paper. Training has three stages: the first is warm-up, the second is label correction, and the third is fine-tuning.
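The three stages above are selected by epoch. A minimal sketch of the schedule follows; the boundary epochs here are illustrative placeholders, not the paper's exact values.

```python
def pencil_stage(epoch, warmup_end=70, correction_end=200):
    """Which PENCIL training phase a given epoch belongs to (sketch).

    warmup_end / correction_end are assumed boundaries, not the paper's.
    """
    if epoch < warmup_end:
        return "warmup"       # plain cross-entropy on the noisy labels
    elif epoch < correction_end:
        return "correction"   # jointly update network weights and label variables
    else:
        return "finetune"     # labels frozen; only the network is updated
```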
It seems there is a certain fraction of samples that PENCIL cannot correct properly. That depends on both the loss function and the classifier's predictions. Another possible reason is wrong labels in the ground truth itself. I'm working on solving this problem.
You can set the backbone's output dimension to 10 to solve this problem.
I think |
Hi.
The label-correction accuracy does decrease in the third stage, but we don't care about that, because the labels are no longer updated in that stage. It is normal; you can ignore the label-correction accuracy in the third stage, or simply stop reporting it there. Good luck. The code has no problem in my view. I have also been studying PENCIL recently. Nice work.
I think Lo keeps the label distribution from drifting too far from the noisy one, but it would limit performance at high noise levels.
Hi, I'm trying to re-implement your work.
I'm using ResNet-32 as the backbone, following the paper, but without pretrained model parameters. The other hyper-parameters follow the paper's suggestions in Section 4.2 and are listed below. The network was trained on CIFAR-10 with 10% asymmetric noise. However, the experimental results were far from the paper's: the final model's best top-1 accuracy is 84.02%. The label accuracy grew very rapidly but soon began to decline, falling below the accuracy of the original labels.
Are there any tricks that didn't make it into the paper, or that I overlooked? Thanks.
Here is the plot of how the label accuracy changes:
And the hyper-parameters are as follows:
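For reproducing the setup above, here is a hedged sketch of 10% asymmetric noise on CIFAR-10 as it is commonly defined (class-dependent flips such as truck→automobile, bird→airplane, deer→horse, cat↔dog). The flip mapping is an assumption based on the usual convention; check it against the paper's Section 4.2.

```python
import numpy as np

# Assumed class-dependent flip mapping (CIFAR-10 indices):
# truck(9)->automobile(1), bird(2)->airplane(0), deer(4)->horse(7), cat(3)<->dog(5)
FLIP = {9: 1, 2: 0, 4: 7, 3: 5, 5: 3}

def add_asymmetric_noise(labels, noise_rate=0.1, seed=0):
    """Flip `noise_rate` of each affected class to its paired class (sketch)."""
    rng = np.random.default_rng(seed)
    noisy = labels.copy()
    for src, dst in FLIP.items():
        idx = np.where(labels == src)[0]
        flip = rng.choice(idx, size=int(noise_rate * len(idx)), replace=False)
        noisy[flip] = dst
    return noisy
```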