I cannot reproduce the behaviour shown in the Model Zoo tutorials for recurrent nets.
Specifically, I cannot reset hidden states on the whole chain. I would expect calling Flux.reset!(rnn), where rnn is the whole chain, to reset all recurrent layers in it (e.g., the example here).
Steps to Reproduce
using Flux
using Flux: chunk
rnn = Chain(RNN(1,1,identity),Dense(1, 1, identity))
data=ones(Float32,1,4)
g = gradient(Flux.params(rnn)) do
    # Flux.reset!(rnn) # does not work
    Flux.reset!(rnn[1]) # does work
    sum(rnn(data)) # mock calculation to return a scalar
end
Expected Results
I expect to be able to call Flux.reset!(rnn) to reset hidden states.
Observed Results
Calling Flux.reset!(rnn) inside the gradient block leads to an error:
ERROR: BoundsError: attempt to access Tuple{} at index [0]
The issue can be solved by calling reset! only on the recurrent layers.
The error does not appear when calling reset! outside of pullback/AD context.
I suspect this line might be insufficient in the pullback context, but I'm not sure how to fix it. Any ideas?
Dupe of FluxML/Zygote.jl#1297. See #2057 for some workarounds, though 99% of the time the way to avoid this is just to call reset! outside of gradient/pullback.
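For reference, here is a minimal sketch of the "call reset! outside of gradient" approach applied to the reproduction above. It reuses the same model and data from the report and assumes the Flux 0.13 implicit-parameter API (Flux.params); it is only an illustration of the suggestion, not one of the workarounds from #2057.

```julia
using Flux

# Same toy model and data as in the reproduction steps.
rnn = Chain(RNN(1, 1, identity), Dense(1, 1, identity))
data = ones(Float32, 1, 4)

# Reset the hidden state of the whole chain *before* entering the AD context,
# so that reset! itself is never differentiated.
Flux.reset!(rnn)

g = gradient(Flux.params(rnn)) do
    sum(rnn(data)) # mock calculation to return a scalar
end
```

In a training loop this would mean calling Flux.reset!(rnn) once per sequence or batch, just before the gradient call.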
Package Version
0.13.8
Julia Version
1.8
OS / Environment
MacOS (arm64)