
WIP: Refactor ForwardDiff to utilize NDuals and replace API #27

Merged: 1 commit into master, Aug 14, 2015

Conversation

@jrevels (Member) commented Jul 16, 2015

This PR does a lot of stuff...likely way more than should fit in one PR, but I got a bit carried away with my refactor.

Changes

  • Removes the old, non-implemented matrix_fad, powerseries_fad, and dual_fad folders, and moves the contents of typed_fad to src.
  • Now that all the other methods (matrix_fad, powerseries_fad) are gone, the API can be simplified, so I took a crack at it. I'll be working on documentation soon, but the exported functions list and fad_api.jl contain the relevant code necessary to understand my attempt.
  • The fad_api contains a port and extension of the work from AutoDiffJSoC.jl, with a bunch of generalizations and optimizations. Notably, it makes more use of my "generated function closure caching trick (TM)" to speed up execution and reuse memory when repeatedly using the same NDuals type to take the gradient/Hessian/etc. (a minimal sketch of the trick follows this list). This technique has the potential to dangerously behave like a memory leak in a specific edge case (read the code comments), but I can't imagine a user ever reaching that case in a real-world situation. It also supports a caching layer for reuse of work arrays (documented in the README).
  • Renames the FAD* types to *Number (e.g. FADHessian --> HessianNumber).
  • Refactors FADHessian and FADTensor to utilize NDuals.jl, thereby allowing the removal of GraDual and dependence on DualNumbers.jl. Ports in the work from NDuals.jl as a new type, GradientNumber.
  • Bumps up the required Julia version from v0.3 to v0.4, which is required for GradientNum and downstream types.
  • Major refactor of FADHessian.jl → HessianNum.jl and FADTensor.jl → TensorNum.jl:
    • Improved readability by lengthening some single-character variable names
    • Automatically generate definitions for univariate mathematical functions using Calculus.jl. This reduces the potential for bugs, greatly expands supported functionality, and greatly reduces code duplication.
    • Unified the code/behavior shared between the GradientNumber, HessianNumber, and TensorNumber types under a new supertype, ForwardDiffNumber <: Number.
    • Replaced all the `.`-based field accesses that plagued the old implementation with consistent calls to proper accessor methods.
    • Removed temporary array usage where possible, cut down on repetitive array accesses, and rewrote loops to be more concise.
  • On top of flipping everything else on its head, I switched to using 4 spaces for indents instead of 2. This is just my preference - I think it reads better and it fits Julia’s contributing guidelines for Base (which we obviously don’t have to follow, but it’s nice to fit in).
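
A minimal sketch of the caching trick mentioned above, assuming a type-level Dim{N} key as used by the PR's grad_workvec (Julia 0.4 syntax; an illustration, not the PR's exact code):

```julia
# A @generated function's body runs once per concrete type signature, so any
# object allocated in that body gets baked into the generated code and reused.
immutable Dim{N} end  # type-level integer serving as the cache key

@generated function grad_workvec{N,T}(::Type{Dim{N}}, ::Type{T})
    workvec = zeros(T, N)  # allocated once per (N, T) pair, at generation time
    return :($workvec)     # every subsequent call returns this same array
end

# Reuse: both calls below return the identical array, with no fresh allocation.
grad_workvec(Dim{2}, Float64) === grad_workvec(Dim{2}, Float64)  # true

# The leak edge case: every distinct (N, T) pair compiles a new method that
# holds its array forever, so sweeping over many sizes accumulates memory.
```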

Still in progress

  • Tests for new API
  • Tests for F<:ForwardDiffNumber types
    • GradientNumber
      • Utility/accessor function tests
      • Math tests
    • HessianNumber
      • Utility/accessor function tests
      • Math tests
    • TensorNumber
      • Utility/accessor function tests
      • Math tests
  • Deprecate old API
  • Documentation to correspond to new API
  • Add Coveralls.io support

Review thread on REQUIRE:

```diff
-julia 0.3-
-DualNumbers
+julia 0.4-
+NDuals
```

A Contributor commented:
I think it makes sense just to move over the NDuals code.

@mlubin (Contributor) commented Jul 18, 2015

Cool stuff! Could you update some of the tests so that we can get a feel for what the new API looks like?

@mlubin (Contributor) commented Jul 18, 2015

Also this won't pass on travis with a dependency on NDuals.

@jrevels (Member, Author) commented Jul 20, 2015

> I think it makes sense just to move over the NDuals code.

> Cool stuff! Could you update some of the tests so that we can get a feel for what the new API looks like?

Both of these sound like good next steps to me!

Review thread on the HessianNum code:

```julia
-(h::HessianNum) = HessianNum(-grad(h), -hess(h))

const h_univar_funcs = Tuple{Symbol, Expr}[
    (:sqrt, :((-grad(h,i)*grad(h,j)+2*value(h)*hess(h,i)) / (4*(value(h)^(1.5))))),
```
A Contributor commented:
Could you document the format for this expression table? It's not really clear to me what's going on
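
The format isn't spelled out in the thread; from the snippet, each entry appears to pair a univariate function's symbol with an expression for one second-order component, written in terms of the value/grad/hess accessors. Below is a self-contained illustration of the Calculus.jl-driven generation pattern described in the PR summary, using a toy first-order dual type (an editorial sketch, not ForwardDiff's actual code):

```julia
using Calculus

# Toy dual number carrying a value and one derivative component (illustration
# only; ForwardDiff's GradientNumber/HessianNumber types are more general).
immutable Dual <: Number
    val::Float64
    der::Float64
end

# For each univariate function, symbolically derive df/dx with Calculus.jl and
# splice it into a method definition via @eval, instead of hand-writing rules.
for fsym in (:sin, :cos, :exp, :sqrt)
    dfdx = Calculus.differentiate(:($fsym(x)), :x)  # e.g. :(cos(x)) for :sin
    @eval function Base.$(fsym)(d::Dual)
        x = d.val
        Dual($(fsym)(x), $(dfdx) * d.der)  # chain rule: f'(x) * dx
    end
end

sin(Dual(0.0, 1.0))  # Dual(0.0, 1.0), since sin(0) == 0 and cos(0) * 1 == 1
```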

@jrevels (Member, Author) commented Aug 5, 2015

It seems that the tests are erroring on Travis due to a noisy Calculus.jl deprecation warning. The cause of this warning was recently fixed on Calculus.jl's master branch (in this commit). Locally, I'm running off of Calculus.jl's master branch, and can confirm all the tests pass on my machine.

What's the right path forward? Perhaps there's a flag to suppress the deprecation warnings that we could turn on just for testing?

@mlubin (Contributor) commented Aug 5, 2015

I'll just tag Calculus.

@felipenoris (Contributor) commented:
This is awesome work. Do you need any help on coveralls support?

@jrevels (Member, Author) commented Aug 5, 2015

@felipenoris Thanks! I just added a line to the Travis file to submit the coverage report to Coveralls, but Coveralls doesn't seem to be updating the repo page here (I can't see other branches besides master, which is also strange).

@felipenoris (Contributor) commented:
It seems that you got an error from Coveralls.process_folder():

```
Coverage.process_file: Detecting coverage for src/ForwardDiff.jl
ERROR: ParseError("invalid \"import\" statement")
```

I'm not sure that this is the cause, but you can test it by running this on your own machine:

```
julia -e 'cd(Pkg.dir("ForwardDiff")); Pkg.add("Coverage"); using Coverage; Coveralls.process_folder()'
```

I would be glad to test this for you when I get home tonight.

@tkelman (Contributor) commented Aug 6, 2015

That looks like it might be JuliaLang/JuliaParser.jl#19, the same thing that's holding up base coverage numbers. The approaches there are either to find alternate syntax that JuliaParser is happy with as a workaround, help fix JuliaParser, or bug Jake until he does it.

@felipenoris (Contributor) commented:
@tkelman is right. It's the same error as JuliaCI/Coverage.jl#77.

@jrevels (Member, Author) commented Aug 6, 2015

@tkelman @felipenoris Thanks for pointing out the source of the issue! I'm going to see whether I can cook up a fix myself today for the import-related part of JuliaLang/JuliaParser.jl#19. In the meantime I'll finagle the syntax used here so that we can get test coverage results.

@felipenoris (Contributor) commented:
Looks like you're close to a merge. Awesome!
I noticed that the function gradient conflicts with Base.gradient. Maybe you should consider renaming it to something like grad.

@felipenoris (Contributor) commented:
I'm sorry. I see that this was discussed elsewhere.

@mlubin (Contributor) commented Aug 7, 2015

Yeah, I'm in favor of not exporting gradient (to avoid the conflict), so that you can call it with ForwardDiff.gradient.

@jrevels (Member, Author) commented Aug 7, 2015

@mlubin I think I'm with you there. Users can always alias it to another name if they wish, but just popping the function into the global namespace could easily cause issues for whoever accidentally calls the Calculus.jl version without realizing it.

How should we go about dealing with the other functions (derivative/hessian/etc.)? Do you think I should remove them from the export list as well for the sake of consistency? It kind of follows that if we can't "assert ownership" over a given function (or more specifically, the function signature) then we should avoid performing that same assertion with the other functions.

@mlubin (Contributor) commented Aug 7, 2015

Extending a function in Base is asserting global ownership of a particular signature, but I don't think it's an issue to export methods which don't conflict (currently) with Base.

@jrevels (Member, Author) commented Aug 7, 2015

Sounds good.

Review thread on the generated-function cache:

```julia
end
end

@generated function grad_workvec{N,T}(::Type{Dim{N}}, ::Type{T})
```
A Contributor commented:

Are you planning on restructuring this? The memory leak is a real issue
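
The merged commit below does add "a more robust caching layer to fix leaky generated functions". One plausible shape for such a layer, sketched here with assumed names rather than the PR's actual code, is an explicit dictionary whose entries stay visible and reclaimable:

```julia
# Editorial sketch with assumed names: keep work vectors in an explicit Dict
# instead of baking them into @generated function bodies, so cache growth is
# bounded by what's stored and can be reclaimed, rather than leaking per-method.
immutable Dim{N} end  # as in the quoted grad_workvec signature

const WORKVEC_CACHE = Dict{Tuple{Int,DataType},Vector}()

function cached_workvec{N,T}(::Type{Dim{N}}, ::Type{T})
    get!(WORKVEC_CACHE, (N, T)) do
        zeros(T, N)        # allocated on the first request for this (N, T)
    end::Vector{T}         # assertion recovers a concrete type for callers
end

clear_workvecs!() = empty!(WORKVEC_CACHE)  # explicit escape hatch for users
```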

@jrevels (Member, Author) commented Aug 9, 2015

@Scidom @mlubin

> Is there a way to let users override this?

> Are you planning on restructuring this? The memory leak is a real issue

I created an api-refactor branch off of the nduals-refactor branch in order to avoid intermittent breakage on this branch while I deal with these issues. What's currently there is a refactor of the gradient API to implement the solutions we came up with, while the other methods have been replaced with stubs to be filled out in the meantime.

From here, we could either:

a) Merge nduals-refactor to master ---> Open a PR for api-refactor ---> Merge api-refactor to master once it's ready ---> tag and release a new version of ForwardDiff.jl

or

b) Merge api-refactor to nduals-refactor once it's ready ---> Merge nduals-refactor to master ---> tag and release a new version of ForwardDiff.jl

@mlubin (Contributor) commented Aug 9, 2015

You could open up a pull request to merge api-refactor into nduals-refactor (not master), that would make it easier to comment on the specific changes.

@KristofferC (Collaborator) commented:

I apologize in advance for asking questions in a WIP PR, but I figured this was the best place, since it seems likely to soon become the new ForwardDiff. My question is whether you have thought about implementing an API that would also return the function value, not only the gradient part. I guess I could create the dual numbers myself, call the function, and extract the gradient and value parts. However, I think it is common for a user to want both the value and the gradient, so it would be nice to have an API for that.

It could possibly be done with an extra Val{Bool} argument to the function so that the implementation stays type stable.
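
A minimal sketch of this suggestion (hypothetical names and signatures, not this PR's API): dispatching on the Val type parameter gives each variant a concrete return type.

```julia
using ForwardDiff

# Hypothetical helper, not part of the package: select the return shape by
# dispatch on Val{true}/Val{false} so each method is individually type stable.
# A real implementation would extract the value from the ForwardDiffNumber
# produced during differentiation instead of calling f a second time.
value_and_gradient{T}(f, x::Vector{T}, ::Type{Val{true}}) =
    (f(x), ForwardDiff.gradient(f, x))

value_and_gradient{T}(f, x::Vector{T}, ::Type{Val{false}}) =
    ForwardDiff.gradient(f, x)

# usage:
#   g    = value_and_gradient(f, x, Val{false})
#   v, g = value_and_gradient(f, x, Val{true})
```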

@jrevels (Member, Author) commented Aug 13, 2015

> I apologize in advance for asking questions in a WIP PR, but I figured this was the best place, since it seems likely to soon become the new ForwardDiff.

Opinions on where the API should go are definitely welcome!

> My question is whether you have thought about implementing an API that would also return the function value, not only the gradient part.

@mlubin What do you think of this?

The downside is that I would really like to keep the API as simple as possible, and the more arguments we have to support, the more complicated things can become...

The other argument against this is that, in most cases, the cost of calling the target function on its own will probably be amortized by the cost of evaluating the higher-order derivatives anyway, so it might not hurt much, performance-wise, to call it twice.

OTOH, if the target function is really costly to evaluate, I suppose it's frustrating to see that work get thrown away rather than reused.

If we do decide to support this, it'd only make sense if we support the retrieval of all lower-order derivative results (e.g. if you take the Hessian, you're technically computing the gradient along the way). Maybe have a keyword argument to return the resultant F<:ForwardDiffNumber instead of automatically filling the results into an Array, and allow folks to call hessian/gradient/etc. on it in order to grab the data they need? It wouldn't support the chunk-based calculations, obviously.

Either way, I don't think working to support this should be a blocker for merging this PR. We could merge this PR and extend the API later, without breakage, to support this.
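
As a usage sketch of that idea, with the keyword name and accessor-based retrieval being assumptions rather than anything merged:

```julia
# Hypothetical usage, not the actual API: ask for the raw ForwardDiffNumber
# instead of results pre-loaded into an Array, then pull out each order.
h = ForwardDiff.hessian(f, x, raw=true)  # assumed keyword; h::HessianNumber
v = value(h)  # the function value, already computed along the way
g = grad(h)   # first-order components, computed en route to the Hessian
H = hess(h)   # second-order components
```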

@KristofferC (Collaborator) commented:

I agree that having an API as simple as the one in this PR is a great strength, and significant thought should go into any additions (of, for example, function arguments) so as not to clutter the API.

The application I am playing around with ForwardDiff for is structural optimization, where the function calls are quite expensive (solving a large sparse linear system). I guess it sounds a bit silly, but when I know that the function values are there in the dual numbers, so close that I can almost smell them, it really feels like I want to have a method to just go and grab them.

I believe it is quite common to want both the function value and the derivative, so even if the derivative part takes the majority of the time, being able to get both in just one function call could be argued to be good API practice (if we figure out some clean way to do it).

I will think a bit and see if I can come up with any good, clean ways this could be done.

@jrevels (Member, Author) commented Aug 13, 2015

> so close that I can almost smell them

lol, I know what you mean.

I do think this would work with a keyword argument in place to specify "don't load my results into an array before giving them to me, instead just give me the resulting ForwardDiffNumber so I can extract what I need from it via a simple interface."

Also, if you want to figure out what an implementation might look like, note that the most up-to-date API implementation currently lives here, which we plan on merging into this PR once it's been approved, then finally merging this PR back to master.

@mlubin (Contributor) commented Aug 14, 2015

There's definitely an argument to be made for returning the lower derivatives or function values if available, but that's out of scope for this PR. We should continue the discussion in an issue after this PR merges and things stabilize a bit.

Commit: Allow chunk-sizing options, and a more robust caching layer to fix leaky generated functions

@jrevels merged commit 4a4285d into master on Aug 14, 2015
@mlubin (Contributor) commented Aug 14, 2015

💯

@felipenoris (Contributor) commented:
Awesome! Thank you!

@KristofferC (Collaborator) commented:

Great to see this merged.

@tkelman (Contributor) commented Aug 14, 2015

Please don't forget to bump the minor version (or major if this were post-1.0) for the first metadata tag that includes this rewrite.

@mlubin (Contributor) commented Aug 14, 2015

Definitely
