WIP/RFC optionally use faster cache #99

Closed
wants to merge 2 commits

Conversation

KristofferC
Collaborator

This is a very early WIP that is more of a proof of concept; it would be nice to get some comments before I continue.

In #92 I noticed that the dict used to cache work arrays is sometimes a bit slow. There are many levels of indirection: an anonymous function, then a double key lookup in a dict, just to get back an array of maybe size 10. The point of this PR is that a user could optionally declare the input size and input type when creating the function that computes the derivative. If that information is given, we know the exact data types and sizes that are needed and can therefore use a specialized type to cache the arrays instead of the dict. This can lead to significantly faster run times.
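A sketch of the idea, with made-up names rather than ForwardDiff's actual types: a dict-backed cache pays for hashing on every access and returns an abstractly typed value, while a cache that knows the element type and length at construction can hold concretely typed arrays as plain fields.

# Sketch only -- hypothetical names, not the PR's actual types.

immutable DictCache
    arrays::Dict{Tuple{DataType,Int},Any}
end
DictCache() = DictCache(Dict{Tuple{DataType,Int},Any}())

function get_workvec!{T}(cache::DictCache, ::Type{T}, n::Int)
    key = (T, n)
    if !haskey(cache.arrays, key)
        cache.arrays[key] = Array(T, n)      # allocate on first use
    end
    return cache.arrays[key]::Vector{T}      # hash lookup + type assertion on every call
end

immutable SpecializedCache{T}
    workvec::Vector{T}
end
SpecializedCache{T}(::Type{T}, n::Int) = SpecializedCache{T}(Array(T, n))

get_workvec!(cache::SpecializedCache) = cache.workvec   # just a field load

With the specialized cache, the element type and length are fixed when the derivative function is created, so the compiler can infer everything downstream.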

Right now, what is implemented is an extra method of gradient that takes the input size and input type. A new cache type, GradientCache, is then used instead of ForwardDiffCache.

For a fairly simple function with an input of length 10, this gives about a 2x speedup (benchmarked on Julia 0.4.3):

function f{T}(x::Vector{T})
    s = zero(T)
    @inbounds for i in eachindex(x)
        s += sin(x[i]) + tan(x[i]) + sqrt(x[i])
    end
    return s
end

x = rand(10)
J = zeros(10)
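# g! and g_PR! are not defined in the original snippet; they are assumed to be
# mutating gradient functions of f, built roughly like this (the keyword name is
# a guess based on the 0.4-era API, and the new method's signature is this PR's):
#   g!    = ForwardDiff.gradient(f; mutates=true)               # dict-backed ForwardDiffCache
#   g_PR! = ForwardDiff.gradient(f, 10, Float64; mutates=true)  # specialized GradientCache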

@time for i = 1:10^5 g_PR!(J, x) end
#  0.164378 seconds (400.00 k allocations: 36.621 MB, 5.18% gc time)

@time for i = 1:10^5 g!(J, x) end
#  0.348962 seconds (600.00 k allocations: 41.199 MB, 1.57% gc time)

What is left to do:

  • Deal with the type instability in the return value of the returned AD function. I believe this is because the cache keyword argument is no longer concretely typed.
  • Reduce the code duplication from creating a very similar gradient function
  • Implement similar methods for hessians, tensors, and jacobians (do we have to specify the output length as well?).
  • Proper benchmarking
  • Documentation

Note, as I said, this is mostly a proof of concept to get the ball rolling. If you (@jrevels) have something else in mind that you want to go with, feel free to do that.

@@ -59,11 +83,51 @@ function gradient{A}(f, ::Type{A}=Void;
end
end

function gradient{A, T}(f, input_size::Int, input_type::Type{T},
KristofferC (Collaborator, Author)

This whole 30-line block is basically only there to use a @generated function to compute the type of the GradientCache (to avoid type instability) and create the cache...
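A minimal sketch of that pattern, with hypothetical names rather than the PR's actual code: a generated function only sees the argument types, so it can work out the concrete cache type at compile time and splice it into the expression it returns, keeping the caller's return type inferable. The size is a runtime value and is simply forwarded to the constructor.

immutable MyCache{T}
    workvec::Vector{T}
    partials::Vector{T}
end

@generated function make_cache{T}(::Type{T}, n::Int)
    C = MyCache{T}   # concrete cache type computed from the static type parameter
    return :( $C(Array($T, n), Array($T, n)) )
end

make_cache(Float64, 10) then returns a fully concrete MyCache{Float64}, so code that receives the cache does not hit the type instability described above.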

@jrevels
Member

jrevels commented Feb 2, 2016

You're right that the cache code has a lot of problems, and desperately needs a refactor. Before you get too far down the line, though, you should be aware that I'm in the middle of a hugely breaking refactor targeting Julia v0.5 that I hope will land either this week or next week. Unfortunately, it will very likely result in the code here becoming obsolete - the API, ForwardDiffNumber types, and cache layer are all going to change dramatically. I'm going to open a WIP PR soon that should explain things in more detail.

That being said, please continue with this PR if you're interested in getting better performance using ForwardDiff in Julia v0.4 - we can update the versions separately.

@KristofferC
Collaborator Author

If you are in the middle of some big changes, I will let this rest. I already have a local branch that fixes the cache performance for my use case.

It would be cool to see a WIP PR to play with.

@KristofferC KristofferC closed this Feb 2, 2016
@KristofferC KristofferC mentioned this pull request Feb 9, 2016