
Error in the UODEs training when the network enlarges #689

Closed
junsebas97 opened this issue Mar 3, 2022 · 3 comments

@junsebas97

Hi, I'm currently modeling a dynamical system with a universal ordinary differential equation (UODE). Both the system and the UODE receive an external input f(t) and respond to it. To train the UODE, I have experimental data for only one case. However, for two other cases I know some properties of the output, so I implement them as constraints. Here is an MWE, where the constraints are a positive u1 at five seconds when the external input is zero, and a u1 equal to zero at 100 seconds for a triangular pulse.

## libraries and functions:
using DiffEqFlux, DifferentialEquations, Interpolations

function callback(θ, loss)
   println("loss: $loss")
   return (loss < 1e-3)
end



function my_solver(f, u0, θ, ti)
    tspan = (ti[1], ti[end])   # integration time

    # incorporate the external input into the model
    aux_model(du, u, θ, t) = model(du, u, θ, t, f)

    prob = ODEProblem(aux_model, u0, tspan, θ)
    sol  = solve(prob, RK4(), saveat = ti, sensealg = ForwardSensitivity())

    return Array(sol)'
end



function loss(θ, example)
    # assess the model response for case 1 (experimental data)
    (t1, f1, u01) = example
    u1 = my_solver(f1, u01, θ, t1)

    # assess the model response for case 2
    t2    = 0:0.1:100
    f2(t) = 0.0
    u02   = [1.0, 0.0]
    u2    = my_solver(f2, u02, θ, t2);

    # assess the model response for case 3
    t3  = 0.0:0.1:100
    f3  = LinearInterpolation([0., 30, 60, 100], [0, 20, 0, 0])
    u03 = [0.0, 0.0]
    u3  = my_solver(f3, u03, θ, t3);
    
    # extract the needed quantities of each response
    y1 = u1[:, 1]
    y2 = u2[:, 1]
    y3 = u3[:, 1]

    # compute the loss
    return sum(abs2, 0 .- y1) + (1 - sign(y2[50])) + (abs(y3[end]))
end


## data: define the input of experimental data
t       = collect(0.0:0.1:30)
f(t)    = sin(t)
u0      = [0.0, 0.0]
example = [t, f, u0]

## model:
ANN = FastChain(FastDense(2, 20, tanh), FastDense(20, 1))
function model(du, u, θ, t, f)
    du[1]  = u[2]
    du[2]  = f(t) - ANN(u, θ)[1]
end

## training:
θ_model = initial_params(ANN)
loss(θ) = loss(θ, example)       # incorporate the experimental data in the loss
DiffEqFlux.sciml_train(loss, θ_model, ADAM(0.01), maxiters = 100, cb = callback)

When I run the above code with ten neurons in the hidden layer, it works perfectly. However, with 20 neurons in the hidden layer I get ERROR: LoadError: MethodError: no method matching getindex(::ChainRulesCore.NotImplemented, ::Int64); the full stack trace is:

ERROR: LoadError: MethodError: no method matching getindex(::ChainRulesCore.NotImplemented, ::Int64)
Stacktrace:
 [1] (::ChainRules.var"#1209#1211"{ChainRulesCore.NotImplemented, Tuple{ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}}, ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}}}})(n::Int64)
    @ ChainRules ~/.julia/packages/ChainRules/QbPMo/src/rulesets/Base/array.jl:34
  [2] ntuple
    @ ./ntuple.jl:19 [inlined]
  [3] (::ChainRules.var"#vect_pullback#1210"{2, Tuple{ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}}, ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}}}})(ȳ::ChainRulesCore.NotImplemented)
    @ ChainRules ~/.julia/packages/ChainRules/QbPMo/src/rulesets/Base/array.jl:34
  [4] (::Zygote.ZBack{ChainRules.var"#vect_pullback#1210"{2, Tuple{ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}}, ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}}}}})(dy::ChainRulesCore.NotImplemented)
    @ Zygote ~/.julia/packages/Zygote/bJn8I/src/compiler/chainrules.jl:204
  [5] Pullback
    @ ~/Desktop/short_gilb/main_1.jl:39 [inlined]
 [6] (::typeof(∂(loss)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/bJn8I/src/compiler/interface2.jl:0
  [7] Pullback
    @ ~/Desktop/short_gilb/main_1.jl:67 [inlined]
 [8] (::typeof(∂(loss)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/bJn8I/src/compiler/interface2.jl:0
  [9] Pullback
    @ ~/.julia/packages/DiffEqFlux/jpIWG/src/train.jl:84 [inlined]
 [10] #214
    @ ~/.julia/packages/Zygote/bJn8I/src/lib/lib.jl:203 [inlined]
 [11] #1739#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [12] Pullback
    @ ~/.julia/packages/SciMLBase/7GnZA/src/problems/basic_problems.jl:107 [inlined]
 [13] #214
    @ ~/.julia/packages/Zygote/bJn8I/src/lib/lib.jl:203 [inlined]
 [14] #1739#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [15] Pullback
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:6 [inlined]
 [16] (::typeof(∂(#260)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/bJn8I/src/compiler/interface2.jl:0
 [17] #214
    @ ~/.julia/packages/Zygote/bJn8I/src/lib/lib.jl:203 [inlined]
 [18] #1739#back
    @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
 [19] Pullback
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:8 [inlined]
 [20] (::typeof(∂(#263)))(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/bJn8I/src/compiler/interface2.jl:0
 [21] (::Zygote.var"#57#58"{typeof(∂(#263))})(Δ::Float64)
    @ Zygote ~/.julia/packages/Zygote/bJn8I/src/compiler/interface.jl:41
 [22] gradient(f::Function, args::Vector{Float32})
    @ Zygote ~/.julia/packages/Zygote/bJn8I/src/compiler/interface.jl:76
 [23] (::GalacticOptim.var"#261#271"{GalacticOptim.var"#260#270"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#82#87"{typeof(loss)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}})(::Vector{Float32}, ::Vector{Float32})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/DHxE0/src/function/zygote.jl:8
 [24] macro expansion
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/solve/flux.jl:41 [inlined]
 [25] macro expansion
    @ ~/.julia/packages/GalacticOptim/DHxE0/src/utils.jl:35 [inlined]
 [26] __solve(prob::OptimizationProblem{false, OptimizationFunction{false, GalacticOptim.AutoZygote, OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#82#87"{typeof(loss)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, GalacticOptim.var"#261#271"{GalacticOptim.var"#260#270"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#82#87"{typeof(loss)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#264#274"{GalacticOptim.var"#260#270"{OptimizationFunction{true, GalacticOptim.AutoZygote, DiffEqFlux.var"#82#87"{typeof(loss)}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Nothing}}, GalacticOptim.var"#269#279", Nothing, Nothing, Nothing}, Vector{Float32}, SciMLBase.NullParameters, Nothing, Nothing, Nothing, Nothing, Base.Pairs{Symbol, typeof(callback), Tuple{Symbol}, NamedTuple{(:cb,), Tuple{typeof(callback)}}}}, opt::ADAM, data::Base.Iterators.Cycle{Tuple{GalacticOptim.NullData}}; maxiters::Int64, cb::Function, progress::Bool, save_best::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/DHxE0/src/solve/flux.jl:39
 [27] #solve#480
    @ ~/.julia/packages/SciMLBase/7GnZA/src/solve.jl:3 [inlined]
 [28] sciml_train(::typeof(loss), ::Vector{Float32}, ::ADAM, ::Nothing; lower_bounds::Nothing, upper_bounds::Nothing, maxiters::Int64, kwargs::Base.Pairs{Symbol, typeof(callback), Tuple{Symbol}, NamedTuple{(:cb,), Tuple{typeof(callback)}}})
    @ DiffEqFlux ~/.julia/packages/DiffEqFlux/jpIWG/src/train.jl:89
 [29] top-level scope
    @ ~/Desktop/short_gilb/main_1.jl:68
in expression starting at /home/junsebas97/Desktop/short_gilb/main_1.jl:68

Does anybody know why this happens when the number of neurons is increased? Is there any solution?
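
For clarity, the only difference between the run that works and the one that errors is, as far as I can tell, the width of the hidden layer in the FastChain definition:

ANN = FastChain(FastDense(2, 10, tanh), FastDense(10, 1))   # 10 hidden neurons: trains without error
ANN = FastChain(FastDense(2, 20, tanh), FastDense(20, 1))   # 20 hidden neurons: fails with the MethodError above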

@junsebas97
Author

anyone? please help!

@lungd

lungd commented Apr 1, 2022

> When I run the above code with ten neurons in the hidden layer, it works perfectly.

When I copy and paste your code I get a

┌ Warning: Instability detected. Aborting
└ @ SciMLBase ~/.julia/packages/SciMLBase/GW7GW/src/integrator_interface.jl:351

It does not crash and the loss decreases, but this could be relevant.
With 20 neurons I get the instability warning, and then it errors with the message you've posted.

It throws the error at the line u03 = [0.0, 0.0].

Interestingly, I still got the same error after I changed every array and range to Float32 by adding f0 suffixes or wrapping with Float32.([]) (initial_params(model) returns Float32 parameters by default), but it then worked with zeros(Float32, 2).
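
In other words, of the variants I tried for that line, only the zeros call avoided the error (they are also marked in the full example below):

u03 = Float32.([0.0, 0.0])   # still errors
u03 = [0f0, 0f0]             # still errors
u03 = zeros(Float32, 2)      # works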

So here is a working example of your code

using DiffEqFlux, OrdinaryDiffEq, Interpolations


function callback(θ, loss)
   println("loss: $loss")
   return (loss < 1e-3)
end



function my_solver(f, u0, θ, ti)
    tspan = (ti[1], ti[end])   # integration time

    # incorporate the external input into the model
    aux_model(du, u, θ, t) = model(du, u, θ, t, f)

    prob = ODEProblem(aux_model, u0, tspan, θ)
    sol  = solve(prob, RK4(), saveat = ti, sensealg = ForwardSensitivity())

    return Array(sol)'
end



function loss(θ, example)
    # assess the model response for case 1 (experimental data)
    (t1, f1, u01) = example
    u1 = my_solver(f1, u01, θ, t1)

    # assess the model response for case 2
    t2    = 0f0:0.1f0:100
    f2(t) = 0.0f0
    u02   = Float32.([1.0, 0.0])
    u2    = my_solver(f2, u02, θ, t2);

    # assess the model response for case 3
    t3  = 0.0f0:0.1f0:100
    f3  = Interpolations.LinearInterpolation(Float32.([0.0, 30.0, 60.0, 100.0]), Float32.([0.0, 20.0, 0.0, 0.0]))
    # u03 = Float32.([0.0, 0.0])         ### ERROR
    # u03 = [0f0, 0f0]                   ### ERROR
    u03 = zeros(Float32, 2)              ### check
    u3  = my_solver(f3, u03, θ, t3);

    # extract the needed quantities of each response
    y1 = u1[:, 1]
    y2 = u2[:, 1]
    y3 = u3[:, 1]

    # compute the loss
    return sum(abs2, 0 .- y1) + (1 - sign(y2[50])) + (abs(y3[end]))
end


## data: define the input of experimental data
t       = collect(0.0f0:0.1f0:30)
f(t)    = sin(t)
u0      = Float32.([0.0, 0.0])
example = [t, f, u0]

## model:
ANN = FastChain(FastDense(2, 20, tanh), FastDense(20, 1))
function model(du, u, θ, t, f)
    du[1]  = u[2]
    du[2]  = f(t) - ANN(u, θ)[1]
end


## training:
θ_model = initial_params(ANN)
loss(θ) = loss(θ, example)       # incorporate the experimental data in the loss
DiffEqFlux.sciml_train(loss, θ_model, ADAM(0.01), maxiters = 100, cb = callback)

julia> versioninfo()
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: AMD Ryzen Threadripper 2970WX 24-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, znver1)

st
  [aae7a2af] DiffEqFlux v1.45.3
  [a98d9a8b] Interpolations v0.13.5
  [1dea7af3] OrdinaryDiffEq v6.7.1

@ChrisRackauckas
Member

Has a working example.
