make auxiliary heads in pretrained models optional #828
Conversation
Codecov Report

@@            Coverage Diff             @@
##           master     #828      +/-   ##
==========================================
- Coverage   52.96%   52.65%   -0.31%
==========================================
  Files          35       35
  Lines        3389     3405      +16
  Branches      538      543       +5
==========================================
- Hits         1795     1793       -2
- Misses       1464     1480      +16
- Partials      130      132       +2

Continue to review the full report at Codecov.
I think this story around aux-logits is not great.
Given that we only load the model weights after having defined the model structure, I think that we should unconditionally have the branches.
The forward should then decide whether to use the branches or not (which in general should depend only on whether the model is in training mode).
Thoughts?
    original_aux_logits = kwargs['aux_logits']
    kwargs['aux_logits'] = True
else:
    original_aux_logits = True
This won't work if the model was trained without aux_logits, because load_state_dict will not match, right?
Yes, I assumed that the pretrained models all have aux_logits and that those heads are trained, but based on #821 it seems that they are actually not trained (though still included in the pretrained models).
So doesn't it make sense to disable the aux heads in the pretrained models and update the pretrained .pth files?
It breaks backward compatibility though, and maybe it makes sense to set strict to False in load_state_dict?
edit: I see it still breaks backward compatibility
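For reference, a sketch of the strict=False idea; checkpoint_url is a placeholder for the real pretrained .pth location, and note that strict=False also silences genuinely missing keys, not just the unexpected aux ones.

```python
import torch.utils.model_zoo as model_zoo
from torchvision.models.inception import Inception3

checkpoint_url = 'https://example.com/inception_v3.pth'  # placeholder URL

# Build the model without the aux head and ignore checkpoint keys that have
# no destination in the module.
model = Inception3(aux_logits=False)
state_dict = model_zoo.load_url(checkpoint_url)
model.load_state_dict(state_dict, strict=False)  # AuxLogits.* keys are simply dropped
```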
So, one of my ideas was to completely disable .train(True) mode, and not have the aux branches at all.
Did you train a checkpoint on top of inception v3 without the aux heads?
Completely disabling train mode (for pretrained models) is a bit harsh, don't you think?
I guess most of the time people freeze the net, train just the fc layer, and then unfreeze the whole thing with a small lr; disabling train would break that code (a rough sketch of that workflow follows).
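A rough sketch of that fine-tuning workflow, with example values for the number of classes and learning rates:

```python
import torch
from torchvision import models

num_classes = 10  # example value
model = models.inception_v3(pretrained=True)

# 1) freeze the backbone and train only a new classifier head
for p in model.parameters():
    p.requires_grad = False
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-2, momentum=0.9)

# 2) later, unfreeze everything and continue with a small learning rate
for p in model.parameters():
    p.requires_grad = True
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
```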
No no, by reading #821 I thought that maybe the aux paths are not trained at all in the .pth file and that's the reason they are discarded, am I mistaken?
@Separius only the aux heads of GoogLeNet don't have pretrained weights; the aux branch of Inception v3, however, is trained.
In this case, I think we should leave GoogLeNet as is, and then this is good to be merged?
@TheCodez so I guess we have two options then: either delete the aux branch after loading it in the inception model (like my code does), or do not allow aux_logits=False in the load_pretrained function.
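A rough sketch of the first option, close in spirit to the diff excerpt above; the checkpoint URL is a placeholder and details are illustrative rather than the exact final code.

```python
import torch.utils.model_zoo as model_zoo
from torchvision.models.inception import Inception3

pretrained_url = 'https://example.com/inception_v3.pth'  # placeholder URL

def inception_v3(pretrained=False, **kwargs):
    if pretrained:
        if 'aux_logits' in kwargs:
            original_aux_logits = kwargs['aux_logits']
            kwargs['aux_logits'] = True   # build the aux branch so the checkpoint keys match
        else:
            original_aux_logits = True
        model = Inception3(**kwargs)
        model.load_state_dict(model_zoo.load_url(pretrained_url))
        if not original_aux_logits:
            model.aux_logits = False
            del model.AuxLogits           # drop the branch the caller did not ask for
        return model
    return Inception3(**kwargs)
```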
LGTM. @TheCodez do you have any further thoughts?
My question is: why should we delete the aux branch for inception v3 but not for GoogLeNet?
OK, based on our discussion, here is what I propose: both pretrained models should accept aux_logits=False and drop the aux branches after loading the weights. It won't break BC, it won't take extra space for the unused parameters, and more importantly it allows fine-tuning either with or without the auxiliary heads. What do you think, @TheCodez @fmassa? @TheCodez regarding breaking existing models, right now
@Separius your reasoning seems good 👍
It is important to document that they aren't pretrained.
Yeah, my idea of always creating the aux branch for inception v3 as well would cause BC issues with fine-tuned models.
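A sketch of how that caveat could also be surfaced at load time for GoogLeNet; the default handling, warning text, and checkpoint URL are illustrative, not necessarily what was merged.

```python
import warnings
import torch.utils.model_zoo as model_zoo
from torchvision.models.googlenet import GoogLeNet

googlenet_url = 'https://example.com/googlenet.pth'  # placeholder URL

def googlenet(pretrained=False, **kwargs):
    if pretrained:
        original_aux_logits = kwargs.get('aux_logits', False)  # assume aux heads off by default
        if original_aux_logits:
            warnings.warn('auxiliary heads in the pretrained GoogLeNet model are NOT '
                          'pretrained, so make sure to train them before relying on them')
        kwargs['aux_logits'] = True   # build them so the checkpoint keys still match
        model = GoogLeNet(**kwargs)
        model.load_state_dict(model_zoo.load_url(googlenet_url))
        if not original_aux_logits:
            model.aux_logits = False
            del model.aux1, model.aux2   # GoogLeNet's two aux classifiers
        return model
    return GoogLeNet(**kwargs)
```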
@TheCodez I also changed the ordering of the returned values from googlenet when aux is on, to make it consistent with inception. I know it's a breaking change, but I believe googlenet is not in the latest torchvision release, so we are safe to change it, right? We could also return a namedtuple.
@Separius I think that’s fine.
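If the suggestion is a namedtuple return, a minimal illustration could look like the following; the field names are hypothetical.

```python
from collections import namedtuple

GoogLeNetOutputs = namedtuple('GoogLeNetOutputs', ['logits', 'aux_logits2', 'aux_logits1'])

# Inside GoogLeNet.forward, when self.training and self.aux_logits:
#     return GoogLeNetOutputs(x, aux2, aux1)
# Callers can then use out.logits instead of relying on tuple ordering,
# which sidesteps the concern about reordering the returned values.
```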
Should I also add some tests to the test suite?
@Separius let's not download pre-trained models in the tests. This can make the tests flaky due to connection errors.
LGTM, thanks!
It is related to pytorch/pytorch#18668.