Initialize random numbers generator at start #408

Open: wants to merge 1 commit into base: master

VaKonS commented May 31, 2017

When the model is loaded and rebuilt, Torch's random number generator has not yet been initialized with the manual seed, so the NIN ImageNet model produces a different image on every run, even with the "-seed" option.

This happens because the NIN ImageNet model (https://github.com/BVLC/caffe/wiki/Model-Zoo#network-in-network-model) uses a dropout layer with random values.

Initializing the RNG at the start allows repeatable results (with a manual seed) with the NIN ImageNet model, as with VGG models.
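
The ordering issue described above can be sketched as follows. This is a minimal illustration, not the actual patch: the `params` table and the `dropout_output` helper are hypothetical, and only `torch.manualSeed` and `nn.Dropout` are real Torch calls. It is meant to show that a Dropout layer in training mode draws its mask from the global RNG, so the seed has to be set before the first forward pass for runs to be repeatable.

```lua
require 'torch'
require 'nn'

-- Hypothetical stand-in for the parsed command-line options.
local params = { seed = 123 }

-- Hypothetical helper: seed the RNG first, then run one forward pass
-- through a Dropout layer (train = true by default, so a mask is drawn).
local function dropout_output(seed)
  if seed >= 0 then
    torch.manualSeed(seed)  -- must happen before any forward pass
  end
  local layer = nn.Dropout(0.5)
  return layer:forward(torch.ones(1, 8)):clone()
end

local a = dropout_output(params.seed)
local b = dropout_output(params.seed)

-- The same seed was set before each forward pass, so the masks match.
local max_diff = (a - b):abs():max()
print(max_diff)
```

If the seed were set only after the forward passes, the two masks would generally differ between runs.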

htoyryla commented May 31, 2017

Good find. It makes me think further that dropout layers probably do not do any good in style transfer. Shouldn't we omit them altogether, like I did in neural-mirage, by adding

 if layer_type ~= "nn.Dropout" then

here https://github.com/jcjohnson/neural-style/blob/master/neural_style.lua#L130
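
A rough sketch of that idea, assuming the model is a flat `nn.Sequential` that is rebuilt layer by layer. The `strip_dropout` helper and the toy network are hypothetical and are not taken from neural_style.lua or neural-mirage:

```lua
require 'torch'
require 'nn'

-- Hypothetical helper: copy every layer except nn.Dropout into a new container.
local function strip_dropout(cnn)
  local net = nn.Sequential()
  for i = 1, cnn:size() do
    local layer = cnn:get(i)
    local layer_type = torch.type(layer)
    if layer_type ~= "nn.Dropout" then
      net:add(layer)
    end
  end
  return net
end

-- Toy network standing in for a loaded model (assumption for illustration).
local toy = nn.Sequential()
toy:add(nn.Linear(4, 4))
toy:add(nn.ReLU())
toy:add(nn.Dropout(0.5))
toy:add(nn.Linear(4, 2))

local stripped = strip_dropout(toy)
print(stripped:size())  -- number of layers left after dropping nn.Dropout
```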

Perhaps this is the reason that NIN has been considered to give poor results?

An alternative to removing the Dropout layer(s) would be to call evaluate() on the model after rebuilding it; this sets train = false for the model. I checked the code of Dropout; there, train is initialized to true.
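
The evaluate() alternative could look roughly like this. It is only a sketch on a hypothetical toy network, not the neural-style code. It relies on `evaluate()` propagating train = false to every module in the container, so that the Dropout layer stops drawing random masks:

```lua
require 'torch'
require 'nn'

-- Toy network standing in for the rebuilt model (assumption for illustration).
local net = nn.Sequential()
net:add(nn.Linear(4, 4))
net:add(nn.Dropout(0.5))

net:evaluate()  -- sets train = false on the container and its modules

local x = torch.rand(2, 4)
local y1 = net:forward(x):clone()
local y2 = net:forward(x):clone()

-- With train = false no mask is drawn, so repeated passes agree.
local eval_diff = (y1 - y2):abs():max()
print(eval_diff)
```

Calling `net:training()` would switch the layers back to drawing random masks.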

PS. I made a quick test with and without the Dropout layer. I did not use a manual seed, so the results are not exactly comparable. The effect of the Dropout layer is not so drastic. In fact, what it does is mask out features randomly, which is a bit like the masking of channels we experimented with recently.

NIN, no dropout:
content_layers relu0,relu3,relu7
style_layers relu0,relu3,relu7
style_weight 1e5
image_size 960
[image: out]

NIN, with dropout, same settings:
[image: nin_dropout]

VaKonS (Author) commented Jun 1, 2017

@htoyryla, this dropout layer seems to be a way to additionally randomize an image.

When the process is initialized with the source image (not random noise):

  • VGG models (which do not use a dropout layer) produce identical images with any seed number.
  • NIN ImageNet without the dropout layer produces identical images too.
  • NIN ImageNet with the dropout layer produces different images for different seeds.

Here is an example with and without the dropout layer (NIN ImageNet, L-BFGS optimizer), at 150, 600 and ~2400 iterations.
Without dropout, some features appear and other features vanish. It looks like dropout doesn't affect optimization speed or improve the image; it simply produces a different variation of the style:

[image: drop-nodrop_1024]

htoyryla commented Jun 1, 2017

I know. The Dropout layer is a random mask. It is usually used during training to prevent overfitting. That is why I suggested that leaving it out might improve the quality when using NIN or any other model with Dropout between conv layers.

Then, as I also noted in my previous comment, using the Dropout layer produces variations for the same reason as using only selected channels, which I experimented with recently (as Dropout masks away random outputs in each channel).

So I was not really arguing against using Dropout; both using dropout and leaving it out make sense. And so does adding a dropout layer to a VGG model. As long as one can choose.

PS. Torch.nn now also has a SpatialDropout layer; it might be interesting to try it and see how it affects the results.

PPS. I had a wrong impression of how SpatialDropout works... "extends this dropout value across the entire feature map" might not be a good idea here.

VaKonS added a commit to VaKonS/neural-style that referenced this pull request Jul 2, 2017
After discussion with @htoyryla (jcjohnson#408 (comment)), I think that processing fragments with exactly the same sequence, without random variations, could make overlapping parts conform better to each other.
Or maybe not. Anyway, it doesn't seem to make the results worse, and it's simply faster.