
Use torch instead of scipy for random initialization of inception and googlenet weights #4256

Merged
merged 9 commits into from
Aug 6, 2021

Conversation

vmoens
Contributor

@vmoens vmoens commented Aug 5, 2021

Follow-up to #4250 (comment)

Weights generated with `truncnorm` from scipy are conditioned on NumPy's global seed (or on a `RandomState`). We want to avoid setting `np.random.seed` in our tests, and we also don't want to pass a `RandomState` object in the model args. We therefore move to generating weights with `torch.nn.init.trunc_normal_` instead, which is conditioned on `torch.manual_seed`.
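To illustrate the point, here is a minimal sketch of the kind of init helper the PR describes. The `init_weights` name, layer types, and the choice to truncate at two standard deviations are assumptions for illustration, not the exact torchvision code; the point is only that the draw is reproducible via `torch.manual_seed` alone, with no NumPy state involved.

```python
import torch
import torch.nn as nn

def init_weights(module, stddev=0.01):
    # Hypothetical helper: draw weights from N(0, stddev^2), truncated at
    # two standard deviations (an assumed choice for this sketch).
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.trunc_normal_(
            module.weight, mean=0.0, std=stddev, a=-2 * stddev, b=2 * stddev
        )

# Same torch seed -> identical weights, regardless of NumPy's global state.
torch.manual_seed(0)
conv_a = nn.Conv2d(3, 8, kernel_size=3)
init_weights(conv_a)

torch.manual_seed(0)
conv_b = nn.Conv2d(3, 8, kernel_size=3)
init_weights(conv_b)

assert torch.equal(conv_a.weight, conv_b.weight)
```

This is exactly the property the tests need: determinism controlled by `torch.manual_seed`, so no `np.random.seed` call or `RandomState` argument is required.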

Member

@NicolasHug NicolasHug left a comment


Thanks for the PR @vmoens,

I realize this is still WIP, but I'm seeing lots of changed pkl files. According to https://app.circleci.com/pipelines/github/pytorch/vision/9719/workflows/9ad29d07-4475-418c-88e9-54e8f09b8c5c/jobs/721968 only one test was failing, so we probably don't need to update all of them.

Also, it seems we still have a typing error for `float(m.stddev)`.

@vmoens vmoens force-pushed the trunc_normal_torch branch from ebdde0a to 80f7c10 Compare August 5, 2021 12:58
@vmoens vmoens force-pushed the trunc_normal_torch branch from 4d3bf4e to e3a84ef Compare August 5, 2021 16:21
Member

@NicolasHug NicolasHug left a comment


Thanks @vmoens !

I tagged it as BC-breaking because the results might change slightly, but that's fine since we make no guarantees w.r.t. randomness.

@vmoens vmoens changed the title WIP using nn.init.trunc_normal_ instead of scipy.stats.truncnorm using nn.init.trunc_normal_ instead of scipy.stats.truncnorm Aug 6, 2021
@vmoens vmoens changed the title using nn.init.trunc_normal_ instead of scipy.stats.truncnorm Use torch instead of scipy for random initialization of inception and googlenet weights Aug 6, 2021
@vmoens vmoens marked this pull request as ready for review August 6, 2021 07:54
@vmoens vmoens merged commit 7e987bf into pytorch:master Aug 6, 2021
@datumbox
Contributor

@vmoens Thanks for the PR. The change looks useful. Though it's correctly marked as BC-breaking, hopefully the two methods don't produce significantly different results, and the average user should be OK.

@NicolasHug It might be worth confirming that a model initialized under the new scheme does not diverge. A rudimentary check in this case would be to run the model for 1-2 epochs and confirm that the loss decreases on the new branch. Is this something we ran already, or plan to run in the future?
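The suggested sanity check can be sketched as a short smoke test: initialize a model, run a handful of optimization steps on random data, and assert the loss goes down. The tiny `nn.Sequential` stand-in, data shapes, and hyperparameters below are assumptions for illustration; the real check would use `googlenet`/`inception_v3` on an actual dataset for 1-2 epochs.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in model for the sketch (assumed; the real check would load
# torchvision's googlenet or inception_v3 with random init).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Random inputs/labels just to exercise the optimization loop.
x = torch.randn(32, 3, 8, 8)
y = torch.randint(0, 10, (32,))

losses = []
for _ in range(20):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
    losses.append(loss.item())

# The check proposed above: loss should decrease from the initial value.
assert losses[-1] < losses[0]
```

A passing run gives some confidence that the new initialization scheme produces trainable weights, though it is of course weaker evidence than a full 1-2 epoch training run.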

@NicolasHug
Member

@vmoens would you mind trying @datumbox 's suggestion, just to make sure?

@NicolasHug NicolasHug mentioned this pull request Aug 16, 2021
facebook-github-bot pushed a commit that referenced this pull request Aug 20, 2021
…ption and googlenet weights (#4256)

Summary:
using nn.init.trunc_normal_ instead of scipy.stats.truncnorm

Reviewed By: NicolasHug

Differential Revision: D30417203

fbshipit-source-id: 6b04f6bf7f6d30dfbc65980a4036a9dc539e4651

Co-authored-by: Vincent Moens <vmoens@fb.com>