Broadcast views to avoid allocations. #353
Can the zero allocations be tested? |
This unfortunately doesn't seem to do the trick. Here's a simple benchmark (it only tests one of these):
using ForwardDiff, BenchmarkTools
x = rand(1000);
cfg = ForwardDiff.GradientConfig(nothing, x);
@btime ForwardDiff.seed!($(cfg.duals), $x, $(cfg.seeds));
Before #351:
julia> @btime ForwardDiff.seed!($(cfg.duals), $x, $(cfg.seeds));
12.895 ns (0 allocations: 0 bytes)
After #351:
julia> @btime ForwardDiff.seed!($(cfg.duals), $x, $(cfg.seeds));
1.078 μs (16 allocations: 3.80 KiB)
After this PR:
julia> @btime ForwardDiff.seed!($(cfg.duals), $x, $(cfg.seeds));
1.032 μs (16 allocations: 3.69 KiB)
Unfortunately, I think the best thing to do at the moment is to revert (EDIT: this part of) #351 so we can tag ForwardDiff v0.9.0 (which is needed for people to get a version of ForwardDiff working on Julia v1.0). Then we can take our time trying to get this working without regressions. Sorry again, I really should have caught this when reviewing #351 originally... |
This reverts commit 19003af. See #353 (comment)
Turns out the overhead came from the tuple slicing. Avoiding that, I can get the timings down most of the way, but there's still some overhead due to broadcast + view allocation:
https://gist.github.com/maleadt/d794ff3de39c4e5a94e98370288fcf09
How performance sensitive is |
Nice!
It gets called in the "inner loop" of the perturbation chunking algorithms for API functions e.g. |
Then this can't work until JuliaLang/julia#14955 |
Would it be possible to specialize this implementation on CuArrays' types so they can at least get the desired behavior? The big downside there would obviously be that the overloads would need to live in a separate companion package (unless we'd make ForwardDiff depend on CuArrays, or CuArrays depend on ForwardDiff, neither of which is desirable). |
That seems like needlessly duplicating functionality that's actually pretty generic. |
Sounds good to me. That still requires a companion package though, right? Or do you mean a trait like this in Base? If the latter, we could avoid having to wait for a |
Now that JuliaLang/julia#14955 is resolved, can we revive this work? It seems that the solution in #406 yields zero allocations:
julia> using ForwardDiff, BenchmarkTools
[ Info: Precompiling ForwardDiff [f6369f11-7733-5829-9624-2563aa707210]
julia> x = rand(1000);
julia> cfg = ForwardDiff.GradientConfig(nothing, x);
julia> @btime ForwardDiff.seed!($(cfg.duals), $x, $(cfg.seeds));
55.533 ns (0 allocations: 0 bytes) |
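On the earlier question of whether zero allocations can be tested: a minimal sketch of a regression test, assuming the same `seed!` call and `GradientConfig` setup as in the benchmarks in this thread. The `seed_allocs` helper is hypothetical (it is not part of ForwardDiff), and whether the assertion passes depends on the ForwardDiff and Julia versions in use.

```julia
using ForwardDiff, Test

x = rand(1000)
cfg = ForwardDiff.GradientConfig(nothing, x)

# Hypothetical helper (not part of ForwardDiff): measuring inside a function
# keeps non-constant globals from showing up as spurious allocations.
seed_allocs(duals, x, seeds) = @allocated ForwardDiff.seed!(duals, x, seeds)

seed_allocs(cfg.duals, x, cfg.seeds)             # warm-up call, so compilation is not measured
@test seed_allocs(cfg.duals, x, cfg.seeds) == 0  # fails if seed! starts allocating again
```

Unlike `@btime`, this does not need BenchmarkTools, so it could sit in the regular test suite; it only checks allocations, not timing.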
I think we can close this now that #472 has merged. |
Improves #351, ref JuliaLang/METADATA.jl#17515 (comment)