neural_ode_sciml example fails when Dense layer replaced by GRU #432
Comments
Doesn't a …
I think @avik-pal and @DhairyaLGandhi have mentioned something about …
If it's of help, learning-long-term-irregular-ts shows (starting on line 566) code for the GRU-ODE written in Python.
Note that method is only going to be compatible with …
@avik-pal and @DhairyaLGandhi, note that the solve function runs properly and produces the anticipated output. The DimensionMismatch error occurs later when gradients are taken. Also, the same error occurs when using adaptive=false: …
The exact source of the error you encounter seems to be the sensitivity algorithm. A quick fix would be:
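Purely as a hedged illustration of what passing an explicit sensitivity algorithm to solve can look like (the package and algorithm named below are assumptions on my part, not necessarily the quick fix meant above):

# Illustration only: `sensealg` selects the sensitivity algorithm used when the
# solve is differentiated. DiffEqSensitivity and ReverseDiffAdjoint are assumptions here.
import OrdinaryDiffEq, DiffEqSensitivity

f(u, p, t) = p .* u
prob = OrdinaryDiffEq.ODEProblem(f, Float32[1.0], (0.0f0, 1.0f0), Float32[-0.5])
sol = OrdinaryDiffEq.solve(prob, OrdinaryDiffEq.Tsit5();
                           sensealg = DiffEqSensitivity.ReverseDiffAdjoint())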
We don't close over a number of arguments in destructure:

function destructure(m; cache = IdDict())
    xs = Zygote.Buffer([])
    fmap(m) do x
        if x isa AbstractArray
            push!(xs, x)
        else
            cache[x] = x
        end
        return x
    end
    return vcat(vec.(copy(xs))...), p -> _restructure(m, p, cache = cache)
end

function _restructure(m, xs; cache = IdDict())
    i = 0
    fmap(m) do x
        x isa AbstractArray || return cache[x]
        x = reshape(xs[i .+ (1:length(x))], size(x))
        i += length(x)
        return x
    end
end

This is untested currently. @avik-pal, would something like this solve the specific issue you're talking about?
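A quick round-trip check of the patch above might look like this (my own untested sketch, assuming the two functions above are defined as written):

# Hypothetical smoke test for the destructure/_restructure patch above; not from
# the thread. Flatten a model containing a stateful GRU layer and rebuild it.
import Flux, Zygote
using Flux: fmap            # the patch above calls a bare `fmap`

m = Flux.Chain(Flux.Dense(3, 4, tanh), Flux.GRU(4, 2))
p, re = destructure(m)      # p is one flat vector holding every array parameter
m2 = re(p)                  # rebuilt model; non-array leaves come back via `cache`
x = rand(Float32, 3)
m(x) ≈ m2(x)                # first forward passes of the two models should agree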
Thanks. I can verify that the following works with Flux.GRU. I used …
The fixed restructure/destructure works:

import DiffEqFlux
import OrdinaryDiffEq
import Flux
import Optim
import Plots
import Zygote
u0 = Float32[2.0; 0.0]
datasize = 30
tspan = (0.0f0, 1.5f0)
tsteps = range(tspan[1], tspan[2], length = datasize)
function trueODEfunc(du, u, p, t)
true_A = [-0.1 2.0; -2.0 -0.1]
du .= ((u.^3)'true_A)'
end
prob_trueode = OrdinaryDiffEq.ODEProblem(trueODEfunc, u0, tspan)
ode_data = Array(OrdinaryDiffEq.solve(prob_trueode, OrdinaryDiffEq.Tsit5(), saveat = tsteps))
dudt2 = Flux.Chain(
x -> x.^3,
Flux.Dense(2, 50, tanh),
#Flux.Dense(50, 2)
Flux.GRU(50, 2)
)
function destructure(m; cache = IdDict())
xs = Zygote.Buffer([])
Flux.fmap(m) do x
if x isa AbstractArray
push!(xs, x)
else
cache[x] = x
end
return x
end
return vcat(vec.(copy(xs))...), p -> _restructure(m, p, cache = cache)
end
function _restructure(m, xs; cache = IdDict())
i = 0
Flux.fmap(m) do x
x isa AbstractArray || return cache[x]
x = reshape(xs[i.+(1:length(x))], size(x))
i += length(x)
return x
end
end
p, re = destructure(dudt2)
neural_ode_f(u, p, t) = re(p)(u)
prob = OrdinaryDiffEq.ODEProblem(neural_ode_f, u0, tspan, p)
function predict_neuralode(p)
tmp_prob = OrdinaryDiffEq.remake(prob,p=p)
res = Array(OrdinaryDiffEq.solve(tmp_prob, OrdinaryDiffEq.Tsit5(), saveat=tsteps, dt=0.01, adaptive=false))
return res
end
function loss_neuralode(p)
pred = predict_neuralode(p) # (2,30)
loss = sum(abs2, ode_data .- pred) # scalar
return loss, pred
end
callback = function (p, l, pred; doplot = true)
display(l)
# plot current prediction against data
plt = Plots.scatter(tsteps, ode_data[1,:], label = "data")
Plots.scatter!(plt, tsteps, pred[1,:], label = "prediction")
if doplot
display(Plots.plot(plt))
end
return false
end
result_neuralode = DiffEqFlux.sciml_train(
loss_neuralode,
p,
Flux.ADAM(0.05),
cb = callback,
maxiters = 3000
)
result_neuralode2 = DiffEqFlux.sciml_train(
loss_neuralode,
result_neuralode.minimizer,
Flux.ADAM(0.05),
cb = callback,
maxiters = 1000,
)

The method isn't very good, but it does what you asked for.
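One caveat (an untested aside, not something claimed above): Flux's recurrent layers carry hidden state between calls, so it may be worth clearing that state before each predicted trajectory; whether this interacts cleanly with Zygote's gradients here is not verified.

# Untested sketch, reusing u0, tspan, tsteps, and `re` from the script above:
# reset the GRU hidden state so every trajectory starts from the same state.
function predict_neuralode_reset(p)
    m = re(p)
    Flux.reset!(m)          # reset the Recur state to its initial value
    f(u, _p, t) = m(u)
    tmp_prob = OrdinaryDiffEq.ODEProblem(f, u0, tspan, p)
    Array(OrdinaryDiffEq.solve(tmp_prob, OrdinaryDiffEq.Tsit5(),
                               saveat = tsteps, dt = 0.01, adaptive = false))
end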
Excellent @ChrisRackauckas. I see that … The method is a means to an end (eventually, GRU-ODE), but it does work reasonably well as is if … The custom GRU is as follows. In this problem, I use …
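(The custom GRU itself is not shown above.) Purely as a generic illustration, a hand-rolled GRU step with the standard update equations could look like the following; every name and dimension here is hypothetical:

# Generic, hypothetical GRU step (standard update equations); not the custom GRU
# referred to above.
sigm(v) = 1 ./ (1 .+ exp.(-v))           # elementwise logistic sigmoid

function gru_step(x, h, Wr, Wz, Wc)
    r      = sigm(Wr * vcat(x, h))       # reset gate
    z      = sigm(Wz * vcat(x, h))       # update gate
    h_cand = tanh.(Wc * vcat(x, r .* h)) # candidate hidden state
    return (1 .- z) .* h_cand .+ z .* h  # blended new hidden state
end

# Example with 2 inputs and 2 hidden units
Wr, Wz, Wc = (rand(Float32, 2, 4) for _ in 1:3)
h = zeros(Float32, 2)
h = gru_step(rand(Float32, 2), h, Wr, Wz, Wc)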
Cool yeah. The other thing to try is …
It would be good to turn this into a tutorial when all is said and done. @DhairyaLGandhi, could you add that restructure/destructure patch to Flux and then tag a release? @John-Boik, would you be willing to contribute a tutorial?
Or @mkg33 might be able to help out here.
Sure, I would be happy to help if I can.
Of course, I'll add it to my tasks.
Has this "fix" been released to FluxML yet?
I'm working on similar models, which also use restructure/destructure. The fix to restructure/destructure has been released, and both functions are working fine as far as I know.
I think @DhairyaLGandhi hasn't merged the fix yet: FluxML/Flux.jl#1353. It's still a bad model, though.
This was fixed by FluxML/Flux.jl#1901, and one can now use Lux, which makes the state explicit. Cheers!
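For reference, a minimal sketch of that explicit-state style (assuming a recent Lux.jl; call signatures may differ between versions):

# Minimal sketch of Lux's explicit state handling (assumption: recent Lux.jl).
# Parameters `ps` and state `st` live outside the layer and are passed explicitly.
import Lux, Random

rng  = Random.default_rng()
cell = Lux.GRUCell(2 => 2)
ps, st = Lux.setup(rng, cell)

x = rand(Float32, 2, 1)                   # (features, batch)
(y, carry), st = cell(x, ps, st)          # first call: hidden state is returned, not stored
(y, carry), st = cell((x, carry), ps, st) # later calls: pass the carry back in yourself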
As a first step leading up to GRU-ODE or ODE-LSTM implementations, I'd like to switch out the Dense layer in the neural_ode_sciml example with a GRU layer. However, doing so raises the error LoadError: DimensionMismatch("array could not be broadcast to match destination"). I don't understand where the problem is occurring, exactly, or how to fix it. Any ideas?

Code is as follows, with the main differences from the original example being:
- using statements have been changed to import statements (for clarity)
- FastChain has been changed to Chain
- include("./TestDiffEq3b.jl")
- the Dense layer has been changed to a GRU layer

This issue is loosely related to Training of UDEs with recurrent networks #391 and Flux.destructure doesn't preserve RNN state #1329. See also ODE-LSTM layer #422.
The code is as follows, with the Dense layer commented out and replaced by the GRU layer:
The error message is: