ModelWrappers.jl is a utility package that makes it easier to work with Model parameters stated as (nested) NamedTuples. It handles
- flattening/unflattening model parameter fields of arbitrary dimensions.
- constraining/unconstraining parameter (if a corresponding constraint is provided).
- using Automatic Differentation for Model parameter given as a
NamedTuple
for a user specified function, i.e., a log-posterior distribution.
ModelWrappers.jl allows you to flatten
a (nested) NamedTuple to a vector, and also returns an unflatten
function to convert a vector back to a NamedTuple. By default, discrete parameter are not flattened and the default flatten type is Float64
. One can construct flatten/unflatten via a ReConstructor
.
using ModelWrappers
myparameter = (a = Float32(1.), b = 2, c = [3., 4.], d = [5, 6])
reconstruct = ReConstructor(myparameter)
vals_vec = flatten(reconstruct, myparameter) #Vector{Float64} with 3 elements (1., 3., 4.)
vals = unflatten(reconstruct, vals_vec) #(a = 1.0f0, b = 2, c = [3.0, 4.0], d = [5, 6])
You can adjust these settings by using the FlattenDefault
struct. For instance, the following settings will map myparameter
to a Float16
vector and also flatten the Integer values.
flattendefault = FlattenDefault(; output = Float16, flattentype = FlattenAll())
reconstruct = ReConstructor(flattendefault, myparameter)
vals_vec = flatten(reconstruct, myparameter) #Vector{Float16} with 6 elements (1., 2., 3., 4., 5., 6.)
vals = unflatten(reconstruct, vals_vec) #(a = 1.0f0, b = 2, c = [3.0, 4.0], d = [5, 6])
Flatten/Unflatten can also be used for Automatic Differentiation. The functions flattenAD
and unflattenAD
return output based on the input type. Check the differences to the first two cases in this example:
myparameter = (a = Float32(1.), b = 2, c = [3., 4.], d = [5, 6])
flattendefault = FlattenDefault(; output = Float32, flattentype = FlattenAll())
reconstruct = ReConstructor(flattendefault, myparameter)
vals_vec = flattenAD(reconstruct, myparameter) #Vector{Float64} with 6 elements (1.00 2.00 3.00 4.00 5.00 6.00)
vals = unflattenAD(reconstruct, vals_vec) #(a = 1.0, b = 2.0, c = [3.0, 4.0], d = [5.0, 6.0])
A ReConstructor
will assign buffers for flatten
and unflatten
, so most operations can be performed without allocations. Unflatten can usually be performed free of most allocations, even if arrays are involved:
using BenchmarkTools
myparameter2 = (a = Float32(1.), b = 2, c = [3., 4.], d = [5, 6], e = randn(1000), f = rand(1:2, 1000), g = randn(1000, 2))
reconstruct = ReConstructor(myparameter2)
vals_vec = flatten(reconstruct, myparameter2)
vals_vec #Vector{Float64} with 3003 element
@btime unflatten($reconstruct, $vals_vec) # 419.095 ns (0 allocations: 0 bytes)
@btime flatten($reconstruct, $myparameter2) # 3.475 μs (8 allocations: 39.83 KiB)
Note that it is possible to nest NamedTuples, and use arbitrary Array-of-Arrays structures for your parameter, but this will often come with a performance penalty:
myparameter3 = (a = myparameter, b = (c = (d = myparameter2, ), ), e = [rand(10), rand(15), rand(20)])
reconstruct = ReConstructor(myparameter3)
vals_vec = flatten(reconstruct, myparameter3)
vals_vec #Vector{Float64} with 3051 element
@btime unflatten($reconstruct, $vals_vec) # 1.220 μs (32 allocations: 3.19 KiB)
@btime flatten($reconstruct, $myparameter3) # 7.275 μs (19 allocations: 88.17 KiB)
Consider now the following problem: you have a model that consists of various (unknown) parameters and you want to estimate these parameter with a custom algorithm. Many common algorithms not only require you to take a vector as function argument, but also require you to know in which space your parameter operate.
If a corresponding prior distribution is provided, ModelWrappers.jl allows you to efficiently constrain and unconstrain your parameter tuple. To do so, one can initiate a Param
struct, which is a temporary constructor that checks if the package can handle the (value, constraint) combination. The initial NamedTuple
can then be wrapped in a ModelWrapper
struct.
using Distributions
myparameter4 = (μ = Param(Normal(), 0.0,), σ = Param(Gamma(), 1.0, ))
mymodel = ModelWrapper(myparameter4)
Note that providing a distribution from 'Distributions.jl' in Param
will just assign a bijector from 'Bijectors.jl' to the parameter. Other valid constraint options for a Param
struct at the moment include
- a bijector from Bijectors.jl,
- all distributions that work with Bijectors.jl,
- a
Fixed
struct, which keepsval
fixed and excludes it from flatten/unflatten, - an
Unconstrained
struct, which flattensval
without taking into account any constraint, - and a
Constrained
struct, which flattensval
without taking into account any constraint, but will take into account the constraints when constraining values.
using Bijectors
myparameter_constraints = (
μ = Param(Normal(), 0.0,),
σ = Param(Bijection(bijector(Gamma(2,2))), 1.0,),
buffer1 = Param(Fixed(), zeros(Int64, 2,3,4), ),
buffer2 = Param(Unconstrained(), [zeros(10), zeros(20)], ),
buffer3 = Param(Constrained(1., 5.), 3., ),
)
model_constraints = ModelWrapper(myparameter_constraints)
flatten(model_constraints) #Vector{Float64}
A ModelWrapper
struct is mutable, and contains the values of your NamedTuple
field. Values can be flattened or unconstrained, and may be updated by new values/samples. Also, when a ModelWrapper
struct is created, an unflatten function for strict and variable type conversion is stored. To show this, we will a create ModelWrapper
struct, flatten its values, and update the struct with new values:
using Distributions, Random
_rng = MersenneTwister(2)
myparameter4 = (μ = Param(Normal(), 0.0, ), σ = Param(Gamma(), 1.0, ))
mymodel = ModelWrapper(myparameter4)
#Flatten/Unconstrain Model parameter
vals_vec = flatten(mymodel) #Vector{Float64} with 2 elements
unconstrain(mymodel) #(μ = 0.0, σ = 0.0)
unconstrain_flatten(mymodel) #Vector{Float64} with 2 elements
#Unflatten/Constrain proposed parameter from unconstrained space
θ_proposed = randn(_rng, length(vals_vec)) #Vector{Float64} with 2 elements
ModelWrappers.unflatten(mymodel, θ_proposed) #(μ = 0.7396206598864331, σ = -0.7445071021408705)
unflatten_constrain(mymodel, θ_proposed) #(μ = 0.7396206598864331, σ = 0.4749683531374296)
#Replacing current model parameter with proposed parameter
mymodel.val #(μ = 0.0, σ = 1.0)
unflatten_constrain!(mymodel, θ_proposed)
mymodel.val #(μ = 0.7396206598864331, σ = 0.4749683531374296)
ModelWrappers.jl
supports the usage of various Automatic Differentiation backends by providing an immutable Objective
struct that contains your ModelWrapper
, data, and all parameter that you want to get derivative information from. Objective
is a functor, and you can manually add a target function wrt your original parameter NamedTuple
that should be included in the AD call, i.e., a log-posterior density.
Let us work with the model from before. We first sample data, create the objective and then define a function that we want to use AD for:
using UnPack
#Create Model and data
myparameter4 = (μ = Param(Normal(), 0.0, ), σ = Param(Gamma(), 1.0, ))
mymodel = ModelWrapper(myparameter4)
data = randn(1000)
#Create objective for both μ and σ and define a target function for it
myobjective = Objective(mymodel, data, (:μ, :σ))
function (objective::Objective{<:ModelWrapper{BaseModel}})(θ::NamedTuple)
@unpack data = objective
lprior = Distributions.logpdf(Distributions.Normal(),θ.μ) + Distributions.logpdf(Distributions.Exponential(), θ.σ)
llik = sum(Distributions.logpdf( Distributions.Normal(θ.μ, θ.σ), data[iter] ) for iter in eachindex(data))
return lprior + llik
end
myobjective
can take a vector from an unconstrained space as input, constrains and converts the argument to a NamedTuple
, checks if all conversions are finite, and adds all eventual Jacobian adjustments from the transformations before your target function is evaluated. This can usually be done efficiently:
#Sample new parameter, and evaluate target function wrt to Vector (not NamedTuple)
θ_proposed = randn(_rng, length(vals_vec))
myobjective(θ_proposed)
#Functor call wrt NamedTuple parameter
@btime $myobjective($mymodel.val) #6.420 μs (0 allocations: 0 bytes)
#Functor call wrt proposed Parameter Vector
@btime $myobjective($θ_proposed) #6.480 μs (0 allocations: 0 bytes)
Objective
can also be called from various AD frameworks:
using ForwardDiff, ReverseDiff, Zygote
grad_fwd = ForwardDiff.gradient(myobjective, θ_proposed)
grad_rvd = ReverseDiff.gradient(myobjective, θ_proposed)
grad_zyg = Zygote.gradient(myobjective, θ_proposed)
all(grad_fwd .≈ grad_rvd .≈ grad_zyg[1]) #true
This package is still highly experimental - suggestions and comments are always welcome! New constraints should be reasonable simple to add, check out src/Core/constrain/constraints/constrained.jl
as an example with guidance in the comments.