manual gradient checks for RNN - implicit and explicit gradients #2215
Conversation
Did you discover anything in your test creation which might give us a lead on where the incorrect grads are coming from?
Sorry, no new insights at the moment. For explicit mode, the relevant code is Flux.jl/src/layers/recurrent.jl, line 179 at 1ca15f3.
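For reference, the `Recur` wrapper around that line looks roughly like this in the Flux 0.13 era (a paraphrased sketch, not necessarily the exact code at 1ca15f3); the in-place update of `m.state` is what explicit-mode AD has to track:

```julia
# Paraphrased sketch of Flux's Recur (Flux 0.13 era); may not match
# src/layers/recurrent.jl at commit 1ca15f3 line-for-line.
mutable struct Recur{T,S}
  cell::T
  state::S
end

function (m::Recur)(x)
  m.state, y = m.cell(m.state, x)  # the cell returns (new_state, output)
  return y
end
```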
Thanks!
Would it be easy to add explicit errors to the wrong ones? E.g. by overloading …
Unfortunately it's the explicit mode path which is the broken one.
Sure, but we can dispatch on that too? Haven't tried & not sure whether there's a point at which this could be attached.
We can, but we have to find that point first. And if we do, it's likely we'll be able to fix the bug then without having to put up a "this is broken" sign.
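Purely as an illustration of the kind of attach point being discussed (hypothetical, not part of this PR, and it would only catch a top-level `Recur`, which is exactly the open "where to attach" question), one could imagine warning when explicit-mode optimiser state is set up for a `Recur`:

```julia
using Flux, Optimisers

# Hypothetical illustration only: warn loudly when Optimisers.jl
# (explicit mode) is pointed at a bare Flux 0.13-era Recur. A Recur
# nested inside a Chain would not hit this method, which is the
# unresolved attach-point problem raised above.
function Optimisers.setup(rule::Optimisers.AbstractRule, m::Flux.Recur)
  @warn "Explicit-mode gradients for Recur are known to be wrong; see Flux.jl#2185" maxlog=1
  invoke(Optimisers.setup, Tuple{Optimisers.AbstractRule,Any}, rule, m)
end
```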
This adds gradient tests for RNN, in relation to #2185.
Note that for implicit gradient mode, gradients successfully pass all tests on all Julia versions. Implicit-mode gradients only fail on Julia >= 1.7 when run in the REPL (that is, when `gradient` isn't called from within a function).
For explicit-mode gradients (the new Optimisers.jl path), all gradients fail on Julia >= 1.7.
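Roughly, the two modes under test look like this (an illustrative sketch, not the PR's actual test code; the layer sizes and the finite-difference helper are invented here):

```julia
using Flux

m = RNN(3 => 5)          # Recur(RNNCell(...)) in Flux 0.13
x = rand(Float32, 3, 4)  # (features, batch)
loss(m, x) = sum(m(x))

# Implicit mode: Zygote + Flux.params, the classic Flux.Optimise path.
Flux.reset!(m)
ps = Flux.params(m)
g_implicit = gradient(() -> loss(m, x), ps)

# Explicit mode: structural gradients, as consumed by Optimisers.jl.
Flux.reset!(m)
g_explicit = gradient(m -> loss(m, x), m)[1]

# Central finite differences as a ground-truth reference for one
# parameter array; the state is reset before every forward pass.
function fd_grad(f, θ; ϵ = 1f-3)
  g = similar(θ)
  for i in eachindex(θ)
    old = θ[i]
    θ[i] = old + ϵ; fp = f()
    θ[i] = old - ϵ; fm = f()
    θ[i] = old
    g[i] = (fp - fm) / (2ϵ)
  end
  return g
end

fd_Wi = fd_grad(() -> (Flux.reset!(m); loss(m, x)), m.cell.Wi)
g_implicit[m.cell.Wi] ≈ fd_Wi  # passes (when run inside a function, per above)
g_explicit.cell.Wi ≈ fd_Wi     # fails on Julia >= 1.7
```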
On Julia v1.6, all gradients other than `state0` are correct. The correct `state0` gradient is actually assigned to Recur's `state` rather than to the cell's `state0`.
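That misplacement shows up directly in the structural gradient (continuing the sketch above; field names per Flux 0.13):

```julia
Flux.reset!(m)
g = gradient(m -> sum(m(x)), m)[1]

g.cell.state0  # where the state0 gradient should land
g.state        # where the correct value actually ends up on Julia v1.6
```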
PR Checklist