diff --git a/docs/src/differentiable_programming.md b/docs/src/differentiable_programming.md index 4f082d3d4e..c2aff0b229 100644 --- a/docs/src/differentiable_programming.md +++ b/docs/src/differentiable_programming.md @@ -223,6 +223,145 @@ Here, we used some knowledge about the internal memory layout of Trixi, an array with the conserved variables as fastest-varying index in memory. +### Differentiating through a complete simulation + +It is also possible to differentiate through a complete simulation. As an example, let's differentiate +the total energy of a simulation using the linear scalar advection equation with respect to the +wave number (frequency) of the initial data. + +```jldoctest advection_differentiate_simulation +julia> using Trixi, OrdinaryDiffEq, ForwardDiff, Plots + +julia> function energy_at_final_time(k) # k is the wave number of the initial condition + equations = LinearScalarAdvectionEquation2D(1.0, -0.3) + mesh = TreeMesh((-1.0, -1.0), (1.0, 1.0), initial_refinement_level=3, n_cells_max=10^4) + solver = DGSEM(3, flux_lax_friedrichs) + initial_condition = (x, t, equation) -> begin + x_trans = Trixi.x_trans_periodic_2d(x - equation.advectionvelocity * t) + return SVector(sinpi(k * sum(x_trans))) + end + semi = SemidiscretizationHyperbolic(mesh, equations, initial_condition, solver, + uEltype=typeof(k)) + ode = semidiscretize(semi, (0.0, 1.0)) + sol = solve(ode, BS3(), save_everystep=false) + Trixi.integrate(energy_total, sol.u[end], semi) + end +energy_at_final_time (generic function with 1 method) + +julia> k_values = range(0.9, 1.1, length=101) +0.9:0.002:1.1 + +julia> plot(k_values, energy_at_final_time.(k_values), label="Energy"); +``` + +You should see a plot of a curve that resembles a parabola with local maximum around `k = 1.0`. +Why's that? Well, the domain is fixed but the wave number changes. Thus, if the wave number is +not chosen as an integer, the initial condition will not be a smooth periodic function in the +given domain. Hence, the dissipative surface flux (`flux_lax_friedrichs` in this example) +will introduce more dissipation. In particular, it will introduce more dissipation for "less smooth" +initial data, corresponding to wave numbers `k` further away from integers. + +We can compute the discrete derivative of the energy at the final time with respect to the wave +number `k` as follows. +```jldoctest advection_differentiate_simulation +julia> round(ForwardDiff.derivative(energy_at_final_time, 1.0), sigdigits=2) +1.4e-5 +``` + +This is rather small and we can treat it as zero in comparison to the value of this derivative at +other wave numbers `k`. + +```jldoctest advection_differentiate_simulation +julia> dk_values = ForwardDiff.derivative.((energy_at_final_time,), k_values); + +julia> plot(k_values, dk_values, label="Derivative"); +``` + +If you remember basic calculus, a sufficient condition for a local maximum is that the first derivative +vanishes and the second derivative is negative. We can also check this discretely. + +```jldoctest advection_differentiate_simulation +julia> round(ForwardDiff.derivative( + k -> Trixi.ForwardDiff.derivative(energy_at_final_time, k), + 1.0), sigdigits=2) +-0.9 +``` + +Having seen this application, let's break down what happens step by step. +```julia +julia> function energy_at_final_time(k) # k is the wave number of the initial condition + equations = LinearScalarAdvectionEquation2D(1.0, -0.3) + mesh = TreeMesh((-1.0, -1.0), (1.0, 1.0), initial_refinement_level=3, n_cells_max=10^4) + solver = DGSEM(3, flux_lax_friedrichs) + initial_condition = (x, t, equation) -> begin + x_trans = Trixi.x_trans_periodic_2d(x - equation.advectionvelocity * t) + return SVector(sinpi(k * sum(x_trans))) + end + semi = SemidiscretizationHyperbolic(mesh, equations, initial_condition, solver, + uEltype=typeof(k)) + ode = semidiscretize(semi, (0.0, 1.0)) + sol = solve(ode, BS3(), save_everystep=false) + Trixi.integrate(energy_total, sol.u[end], semi) + end + +julia> round(ForwardDiff.derivative(energy_at_final_time, 1.0), sigdigits=2) +1.4e-5 +``` +When calling `ForwardDiff.derivative(energy_at_final_time, 1.0)`, ForwardDiff.jl +will basically use the chain rule and known derivatives of existing basic functions +to calculate the derivative of the energy at the final time with respect to the +wave number `k` at `k0 = 1.0`. To do this, ForwardDiff.jl uses dual numbers, which +basically store the result and its derivative w.r.t. a specified parameter at the +same time. Thus, we need to make sure that we can treat these `ForwardDiff.Dual` +numbers everywhere during the computation. Fortunately, generic Julia code usually +supports these operations. The most basic problem for a developer is to ensure +that all types are generic enough, in particular the ones of internal caches. + +The first step in this example creates some basic ingredients of our simulation. +```julia +equations = LinearScalarAdvectionEquation2D(1.0, -0.3) +mesh = TreeMesh((-1.0, -1.0), (1.0, 1.0), initial_refinement_level=3, n_cells_max=10^4) +solver = DGSEM(3, flux_lax_friedrichs) +``` +These do not have internal caches storing intermediate values of the numerical +solution, so we do not need to adapt them. In fact, we could also define them +outside of `energy_at_final_time` (but would need to take care of globals or +wrap everything in another function). + +Next, we define the initial condition +```julia +initial_condition = (x, t, equation) -> begin + x_trans = Trixi.x_trans_periodic_2d(x - equation.advectionvelocity * t) + return SVector(sinpi(k * sum(x_trans))) +end +``` +as a closure capturing the wave number `k` passed to `energy_at_final_time`. +If you call `energy_at_final_time(1.0)`, `k` will be a `Float64`. Thus, the +return values of `initial_condition` will be `SVector`s of `Float64`s. When +calculating the `ForwardDiff.derivative`, `k` will be a `ForwardDiff.Dual` number. +Hence, the `initial_condition` will return `SVector`s of `ForwardDiff.Dual` +numbers. + +The semidiscretization `semi` uses some internal caches to avoid repeated allocations +and speed up the computations, e.g. for numerical fluxes at interfaces. Thus, we +need to tell Trixi to allow `ForwardDiff.Dual` numbers in these caches. That's what +the keyword argument `uEltype=typeof(k)` in +```julia +semi = SemidiscretizationHyperbolic(mesh, equations, initial_condition, solver, + uEltype=typeof(k)) +``` +does. This is basically the only part where you need to modify your standard Trixi +code to enable automatic differentiation. From there on, the remaining steps +```julia +ode = semidiscretize(semi, (0.0, 1.0)) +sol = solve(ode, BS3(), save_everystep=false) +Trixi.integrate(energy_total, sol.u[end], semi) +``` +do not need any modifications since they are sufficiently generic (and enough effort +has been spend to allow general types inside thee calls). + + + ## Finite difference approximations Trixi provides the convenience function [`jacobian_fd`](@ref) to approximate the Jacobian diff --git a/src/semidiscretization/semidiscretization.jl b/src/semidiscretization/semidiscretization.jl index 8e68bf0eb7..09d9b12720 100644 --- a/src/semidiscretization/semidiscretization.jl +++ b/src/semidiscretization/semidiscretization.jl @@ -216,15 +216,26 @@ function jacobian_ad_forward(semi::AbstractSemidiscretization; t0=zero(real(semi)), u0_ode=compute_coefficients(t0, semi)) - J = ForwardDiff.jacobian((du_ode, u_ode) -> begin - new_semi = remake(semi, uEltype=eltype(u_ode)) + du_ode = similar(u0_ode) + config = ForwardDiff.JacobianConfig(nothing, du_ode, u0_ode) + + # Use a function barrier since the generation of the `config` we use above + # is not type-stable + _jacobian_ad_forward(semi, t0, u0_ode, du_ode, config) +end + +function _jacobian_ad_forward(semi, t0, u0_ode, du_ode, config) + + new_semi = remake(semi, uEltype=eltype(config)) + J = ForwardDiff.jacobian(du_ode, u0_ode, config) do du_ode, u_ode Trixi.rhs!(du_ode, u_ode, new_semi, t0) - end, similar(u0_ode), u0_ode) + end return J end + # Sometimes, it can be useful to save some (scalar) variables associated with each element, # e.g. AMR indicators or shock indicators. Since these usually have to be re-computed # directly before IO and do not necessarily need to be stored in memory before, diff --git a/src/solvers/dg/containers_1d.jl b/src/solvers/dg/containers_1d.jl index 64f75f3273..dd7de77cfd 100644 --- a/src/solvers/dg/containers_1d.jl +++ b/src/solvers/dg/containers_1d.jl @@ -13,6 +13,7 @@ end nvariables(::ElementContainer1D{RealT, uEltype, NVARS, POLYDEG}) where {RealT, uEltype, NVARS, POLYDEG} = NVARS polydeg(::ElementContainer1D{RealT, uEltype, NVARS, POLYDEG}) where {RealT, uEltype, NVARS, POLYDEG} = POLYDEG +Base.eltype(::ElementContainer1D{RealT, uEltype}) where {RealT, uEltype} = uEltype # Only one-dimensional `Array`s are `resize!`able in Julia. # Hence, we use `Vector`s as internal storage and `resize!` @@ -123,6 +124,7 @@ end nvariables(::InterfaceContainer1D{uEltype, NVARS, POLYDEG}) where {uEltype, NVARS, POLYDEG} = NVARS polydeg(::InterfaceContainer1D{uEltype, NVARS, POLYDEG}) where {uEltype, NVARS, POLYDEG} = POLYDEG +Base.eltype(::InterfaceContainer1D{uEltype}) where {uEltype} = uEltype # See explanation of Base.resize! for the element container function Base.resize!(interfaces::InterfaceContainer1D, capacity) @@ -277,6 +279,7 @@ end nvariables(::BoundaryContainer1D{RealT, uEltype, NVARS, POLYDEG}) where {RealT, uEltype, NVARS, POLYDEG} = NVARS polydeg(::BoundaryContainer1D{RealT, uEltype, NVARS, POLYDEG}) where {RealT, uEltype, NVARS, POLYDEG} = POLYDEG +Base.eltype(::BoundaryContainer1D{RealT, uEltype}) where {RealT, uEltype} = uEltype # See explanation of Base.resize! for the element container function Base.resize!(boundaries::BoundaryContainer1D, capacity) diff --git a/src/solvers/dg/containers_2d.jl b/src/solvers/dg/containers_2d.jl index db7d509c59..88f0f48470 100644 --- a/src/solvers/dg/containers_2d.jl +++ b/src/solvers/dg/containers_2d.jl @@ -13,6 +13,7 @@ end nvariables(::ElementContainer2D{RealT, uEltype, NVARS, POLYDEG}) where {RealT, uEltype, NVARS, POLYDEG} = NVARS polydeg(::ElementContainer2D{RealT, uEltype, NVARS, POLYDEG}) where {RealT, uEltype, NVARS, POLYDEG} = POLYDEG +Base.eltype(::ElementContainer2D{RealT, uEltype}) where {RealT, uEltype} = uEltype # Only one-dimensional `Array`s are `resize!`able in Julia. # Hence, we use `Vector`s as internal storage and `resize!` @@ -128,6 +129,7 @@ end nvariables(::InterfaceContainer2D{uEltype, NVARS, POLYDEG}) where {uEltype, NVARS, POLYDEG} = NVARS polydeg(::InterfaceContainer2D{uEltype, NVARS, POLYDEG}) where {uEltype, NVARS, POLYDEG} = POLYDEG +Base.eltype(::InterfaceContainer2D{uEltype}) where {uEltype} = uEltype # See explanation of Base.resize! for the element container function Base.resize!(interfaces::InterfaceContainer2D, capacity) @@ -294,6 +296,7 @@ end nvariables(::BoundaryContainer2D{RealT, uEltype, NVARS, POLYDEG}) where {RealT, uEltype, NVARS, POLYDEG} = NVARS polydeg(::BoundaryContainer2D{RealT, uEltype, NVARS, POLYDEG}) where {RealT, uEltype, NVARS, POLYDEG} = POLYDEG +Base.eltype(::BoundaryContainer2D{RealT, uEltype}) where {RealT, uEltype} = uEltype # See explanation of Base.resize! for the element container function Base.resize!(boundaries::BoundaryContainer2D, capacity) @@ -493,6 +496,7 @@ end nvariables(::L2MortarContainer2D{uEltype, NVARS, POLYDEG}) where {uEltype, NVARS, POLYDEG} = NVARS polydeg(::L2MortarContainer2D{uEltype, NVARS, POLYDEG}) where {uEltype, NVARS, POLYDEG} = POLYDEG +Base.eltype(::L2MortarContainer2D{uEltype}) where {uEltype} = uEltype # See explanation of Base.resize! for the element container function Base.resize!(mortars::L2MortarContainer2D, capacity) @@ -693,6 +697,7 @@ end nvariables(::MPIInterfaceContainer2D{uEltype, NVARS, POLYDEG}) where {uEltype, NVARS, POLYDEG} = NVARS polydeg(::MPIInterfaceContainer2D{uEltype, NVARS, POLYDEG}) where {uEltype, NVARS, POLYDEG} = POLYDEG +Base.eltype(::MPIInterfaceContainer2D{uEltype}) where {uEltype} = uEltype # See explanation of Base.resize! for the element container function Base.resize!(mpi_interfaces::MPIInterfaceContainer2D, capacity) diff --git a/src/solvers/dg/containers_3d.jl b/src/solvers/dg/containers_3d.jl index 6ea22626ea..0fbb81bc2c 100644 --- a/src/solvers/dg/containers_3d.jl +++ b/src/solvers/dg/containers_3d.jl @@ -13,6 +13,7 @@ end nvariables(::ElementContainer3D{RealT, uEltype, NVARS, POLYDEG}) where {RealT, uEltype, NVARS, POLYDEG} = NVARS polydeg(::ElementContainer3D{RealT, uEltype, NVARS, POLYDEG}) where {RealT, uEltype, NVARS, POLYDEG} = POLYDEG +Base.eltype(::ElementContainer3D{RealT, uEltype}) where {RealT, uEltype} = uEltype # Only one-dimensional `Array`s are `resize!`able in Julia. # Hence, we use `Vector`s as internal storage and `resize!` @@ -130,6 +131,7 @@ end nvariables(::InterfaceContainer3D{uEltype, NVARS, POLYDEG}) where {uEltype, NVARS, POLYDEG} = NVARS polydeg(::InterfaceContainer3D{uEltype, NVARS, POLYDEG}) where {uEltype, NVARS, POLYDEG} = POLYDEG +Base.eltype(::InterfaceContainer3D{uEltype}) where {uEltype} = uEltype # See explanation of Base.resize! for the element container function Base.resize!(interfaces::InterfaceContainer3D, capacity) @@ -292,6 +294,7 @@ end nvariables(::BoundaryContainer3D{RealT, uEltype, NVARS, POLYDEG}) where {RealT, uEltype, NVARS, POLYDEG} = NVARS polydeg(::BoundaryContainer3D{RealT, uEltype, NVARS, POLYDEG}) where {RealT, uEltype, NVARS, POLYDEG} = POLYDEG +Base.eltype(::BoundaryContainer3D{RealT, uEltype}) where {RealT, uEltype} = uEltype # See explanation of Base.resize! for the element container function Base.resize!(boundaries::BoundaryContainer3D, capacity) @@ -511,6 +514,7 @@ end nvariables(::L2MortarContainer3D{uEltype, NVARS, POLYDEG}) where {uEltype, NVARS, POLYDEG} = NVARS polydeg(::L2MortarContainer3D{uEltype, NVARS, POLYDEG}) where {uEltype, NVARS, POLYDEG} = POLYDEG +Base.eltype(::L2MortarContainer3D{uEltype}) where {uEltype} = uEltype # See explanation of Base.resize! for the element container function Base.resize!(mortars::L2MortarContainer3D, capacity) diff --git a/src/solvers/dg/dg.jl b/src/solvers/dg/dg.jl index d87c079772..8c9cb2502e 100644 --- a/src/solvers/dg/dg.jl +++ b/src/solvers/dg/dg.jl @@ -249,7 +249,7 @@ end function allocate_coefficients(mesh::TreeMesh, equations, dg::DG, cache) # We must allocate a `Vector` in order to be able to `resize!` it (AMR). # cf. wrap_array - zeros(real(dg), nvariables(equations) * nnodes(dg)^ndims(mesh) * nelements(dg, cache)) + zeros(eltype(cache.elements), nvariables(equations) * nnodes(dg)^ndims(mesh) * nelements(dg, cache)) end diff --git a/src/solvers/dg/dg_2d.jl b/src/solvers/dg/dg_2d.jl index 6d13be7436..af8be80795 100644 --- a/src/solvers/dg/dg_2d.jl +++ b/src/solvers/dg/dg_2d.jl @@ -99,10 +99,12 @@ end function create_cache(mesh::TreeMesh{2}, equations, mortar_l2::LobattoLegendreMortarL2, uEltype) # TODO: Taal compare performance of different types MA2d = MArray{Tuple{nvariables(equations), nnodes(mortar_l2)}, uEltype} - # A2d = Array{uEltype, 2} + fstar_upper_threaded = MA2d[MA2d(undef) for _ in 1:Threads.nthreads()] + fstar_lower_threaded = MA2d[MA2d(undef) for _ in 1:Threads.nthreads()] - fstar_upper_threaded = [MA2d(undef) for _ in 1:Threads.nthreads()] - fstar_lower_threaded = [MA2d(undef) for _ in 1:Threads.nthreads()] + # A2d = Array{uEltype, 2} + # fstar_upper_threaded = [A2d(undef, nvariables(equations), nnodes(mortar_l2)) for _ in 1:Threads.nthreads()] + # fstar_lower_threaded = [A2d(undef, nvariables(equations), nnodes(mortar_l2)) for _ in 1:Threads.nthreads()] (; fstar_upper_threaded, fstar_lower_threaded) end