Improving BatchNorm #633

Merged 2 commits on Mar 6, 2019


sklan
Contributor

@sklan sklan commented Feb 20, 2019

This PR makes two changes:

  1. Uses ϵ, which was declared but never used.
  2. Makes the BatchNorm notation follow that of the original paper, with similar performance.
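
For readers unfamiliar with the notation, the paper's transform is x̂ = (x - μ) / sqrt(σ² + ϵ) followed by y = γ x̂ + β. Below is a minimal sketch of that computation in plain Julia. It is not the code in src/layers/normalise.jl: the function name `batchnorm_sketch`, the default ϵ, and the width × height × channel × batch layout are assumptions made for illustration.

```julia
using Statistics

# Hypothetical helper, NOT Flux's implementation: normalise a
# width × height × channel × batch array per channel, in the paper's notation.
function batchnorm_sketch(x::AbstractArray{T,4}, γ::AbstractVector{T},
                          β::AbstractVector{T}; ϵ = 1f-5) where {T<:AbstractFloat}
    dims = (1, 2, 4)                        # reduce over width, height and batch
    μ  = mean(x; dims = dims)               # per-channel mean, size (1, 1, C, 1)
    σ² = mean((x .- μ) .^ 2; dims = dims)   # biased per-channel variance
    x̂  = (x .- μ) ./ sqrt.(σ² .+ T(ϵ))      # ϵ keeps the division well defined
    shape = (1, 1, length(γ), 1)            # broadcast γ and β along channels
    return reshape(γ, shape) .* x̂ .+ reshape(β, shape)
end

x = randn(Float32, 8, 8, 3, 2)
y = batchnorm_sketch(x, ones(Float32, 3), zeros(Float32, 3))
```

With γ = 1 and β = 0, each channel of `y` should have a mean close to zero and a variance close to one. An ϵ that is declared but left out of the `sqrt` would only show up as a problem for channels whose variance is near zero.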

Ubuntu 18.04 GPU K-80

Test

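using Flux  # assumes a CUDA backend is also loaded so that gpu() moves data to the device
bn = gpu(BatchNorm(32))
testimg = gpu(randn(Float32, 416, 416, 32, 1))
@time gpu(bn(testimg))  # first call: includes compilation time
@time gpu(bn(testimg))  # second call: steady-state timing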

Old Code

3.327136 seconds (3.25 M allocations: 167.726 MiB, 1.83% gc time)
0.001245 seconds (155 allocations: 5.391 KiB)

New Code

3.542283 seconds (3.46 M allocations: 178.192 MiB, 2.74% gc time)
0.001185 seconds (149 allocations: 5.281 KiB)

OSX CPU

Test

using Flux
bn = BatchNorm(32)
testimg = randn(Float32, 416, 416, 32, 1)
@time bn(testimg)
@time bn(testimg)

Old Code

14.939357 seconds (225.12 M allocations: 3.799 GiB, 2.11% gc time)
12.857950 seconds (215.97 M allocations: 3.342 GiB, 1.32% gc time)

New Code

2.701686 seconds (8.37 M allocations: 487.830 MiB, 9.61% gc time)
0.077704 seconds (135 allocations: 63.381 MiB, 24.40% gc time)
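
In each pair of timings above, the first `@time` line includes JIT compilation, so the second line is the one to compare. A minimal sketch of separating the two measurements in plain Julia is shown here; `work` is a hypothetical stand-in workload, not BatchNorm, and the numbers it prints will vary by machine.

```julia
# Hypothetical stand-in workload; substitute the call being measured.
work(x) = sum(abs2, x)

x = randn(Float32, 416, 416, 32, 1)

first_call  = @elapsed work(x)    # includes compiling work for this argument type
second_call = @elapsed work(x)    # runs the already compiled code
bytes       = @allocated work(x)  # bytes allocated by a warmed-up call

println("first: ", first_call, " s, second: ", second_call,
        " s, allocated: ", bytes, " bytes")
```

Comparing only warmed-up calls makes the CPU result stand out: the old code stays slow on the second call as well, so compilation time alone does not explain it.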

@DhairyaLGandhi
Member

Ref: #624
Thanks for the PR :)

@KristofferC
Contributor

KristofferC commented Feb 20, 2019

Locally I get

julia> @time bn(testimg);
  0.032135 seconds (170 allocations: 63.381 MiB, 15.82% gc time)

for the old code on CPU. I'm not sure why you still have the super-slow version even after #586. All CI runs also pass the allocation test, so something seems odd with your installation. Anyway, if this fixes it, we might as well go with this.

(Outdated, resolved review comment on src/layers/normalise.jl)
@staticfloat
Contributor

This seems fine to me, but I'm not sure how it would have such a performance difference. @sklan what version are you testing when you say "old"?

@sklan
Contributor Author

sklan commented Feb 27, 2019

@staticfloat I am using Pkg's dev command (Pkg.develop), which clones the master branch.
The performance difference appears only on the Mac CPU, and I can't figure out why this change fixes it either.

@dhpollack dhpollack mentioned this pull request Mar 6, 2019
@MikeInnes
Member

GPU tests pass locally, so I think we can take this. Thanks a lot @sklan!

@MikeInnes MikeInnes merged commit fc6232b into FluxML:master Mar 6, 2019
6 participants