manual gradient checks for RNN - implicit and explicit gradients #2215
Conversation
Did you discover anything in your test creation which might give us a lead on where the incorrect grads are coming from?
Sorry, no new insights at the moment. For explicit mode, the relevant code is Flux.jl/src/layers/recurrent.jl, line 179 at 1ca15f3.
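For reference, the `Recur` wrapper around that line looks roughly like this in the Flux 0.13 era (a paraphrased sketch, not necessarily the exact code at 1ca15f3); the in-place update of `m.state` is what explicit-mode AD has to track:

```julia
# Paraphrased sketch of Flux's Recur (Flux 0.13 era); may not match
# src/layers/recurrent.jl at commit 1ca15f3 line-for-line.
mutable struct Recur{T,S}
  cell::T
  state::S
end

function (m::Recur)(x)
  m.state, y = m.cell(m.state, x)  # the cell returns (new_state, output)
  return y
end
```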
Thanks!
Would it be easy to add explicit errors to the wrong ones? E.g. by overloading …
Unfortunately it's the explicit mode path which is the broken one.
Sure, but we can dispatch on that too? Haven't tried & not sure whether there's a point at which this could be attached.
We can, but we have to find that point first. And if we do, it's likely we'll be able to fix the bug then without having to put up a "this is broken" sign.
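Purely as an illustration of the kind of attach point being discussed (hypothetical, not part of this PR, and it would only catch a top-level `Recur`, which is exactly the open "where to attach" question), one could imagine warning when explicit-mode optimiser state is set up for a `Recur`:

```julia
using Flux, Optimisers

# Hypothetical illustration only: warn loudly when Optimisers.jl
# (explicit mode) is pointed at a bare Flux 0.13-era Recur. A Recur
# nested inside a Chain would not hit this method, which is the
# unresolved attach-point problem raised above.
function Optimisers.setup(rule::Optimisers.AbstractRule, m::Flux.Recur)
  @warn "Explicit-mode gradients for Recur are known to be wrong; see Flux.jl#2185" maxlog=1
  invoke(Optimisers.setup, Tuple{Optimisers.AbstractRule,Any}, rule, m)
end
```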
This adds gradient tests for RNN, in relation to #2185.
Note that for implicit gradient mode, gradients successfully pass all tests on all Julia versions. Implicit-mode gradients only fail on Julia >= 1.7 when run in the REPL (that is, when `gradient` isn't called from within a function).
For explicit-mode gradients (the new Optimisers.jl path), all gradients fail on Julia >= 1.7.
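Roughly, the two modes under test look like this (an illustrative sketch, not the PR's actual test code; the layer sizes and the finite-difference helper are invented here):

```julia
using Flux

m = RNN(3 => 5)          # Recur(RNNCell(...)) in Flux 0.13
x = rand(Float32, 3, 4)  # (features, batch)
loss(m, x) = sum(m(x))

# Implicit mode: Zygote + Flux.params, the classic Flux.Optimise path.
Flux.reset!(m)
ps = Flux.params(m)
g_implicit = gradient(() -> loss(m, x), ps)

# Explicit mode: structural gradients, as consumed by Optimisers.jl.
Flux.reset!(m)
g_explicit = gradient(m -> loss(m, x), m)[1]

# Central finite differences as a ground-truth reference for one
# parameter array; the state is reset before every forward pass.
function fd_grad(f, θ; ϵ = 1f-3)
  g = similar(θ)
  for i in eachindex(θ)
    old = θ[i]
    θ[i] = old + ϵ; fp = f()
    θ[i] = old - ϵ; fm = f()
    θ[i] = old
    g[i] = (fp - fm) / (2ϵ)
  end
  return g
end

fd_Wi = fd_grad(() -> (Flux.reset!(m); loss(m, x)), m.cell.Wi)
g_implicit[m.cell.Wi] ≈ fd_Wi  # passes (when run inside a function, per above)
g_explicit.cell.Wi ≈ fd_Wi     # fails on Julia >= 1.7
```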
On Julia v1.6, all gradients other than `state0` are correct. The correct `state0` gradient is actually assigned to Recur's `state` rather than to the cell's `state0`.
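That misplacement shows up directly in the structural gradient (continuing the sketch above; field names per Flux 0.13):

```julia
Flux.reset!(m)
g = gradient(m -> sum(m(x)), m)[1]

g.cell.state0  # where the state0 gradient should land
g.state        # where the correct value actually ends up on Julia v1.6
```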
PR Checklist