Need for MvNormal with sparse precision matrix #89

Open
jrfaulkner opened this issue Jun 24, 2020 · 18 comments · Fixed by JuliaDiff/ChainRules.jl#730

Comments

@jrfaulkner

After much digging and testing, it appears you do not currently support sparse matrix constructions for AD. Is this correct? In particular, I need an MvNormal distribution that is specified with a sparse precision matrix and works with the autodiff in Turing. This is common for large spatial problems using Gaussian Markov random fields. I came to Turing for this because Stan does not support sparse matrix operations. This would be a really valuable addition to your package if you can make it work.

@devmotion
Member

No, MvNormalCanon (which you are referring to) is not supported yet. Would be a valuable addition.

@jrfaulkner
Author

One that allows a sparse precision matrix is the key.

@devmotion
Member

Sure, I understand, I was just referring to the fact that MvNormalCanon is the distribution type in Distributions for multivariate normal distributions that are parameterized by precision matrices.
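
For example, this parameterization is already straightforward with a dense precision matrix (a minimal sketch):

using Distributions, LinearAlgebra

J = [2.0 -1.0; -1.0 2.0]   # precision (inverse covariance) matrix
d = MvNormalCanon(J)       # zero-mean multivariate normal parameterized by J
cov(d) ≈ inv(J)            # true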

@jrfaulkner
Author

Yes, thanks for pointing that out. Is there any reason that sparse operations would cause problems for autodiff?

@ElOceanografo

This is also something I would find really valuable. From trying out a toy model, am I correct that Distributions uses PDMats, which, when constructing a PDSparseMat, calls a cholesky routine from CHOLMOD that chokes on the Dual numbers?

using Turing
using SparseArrays
using PDMats

@model function MvNormalCanonTest(x)
    n = length(x)
    ρ ~ Uniform(0, 0.49)
    Q = spdiagm(-1 => ρ * ones(n-1), 0 => ones(n), 1 => ρ * ones(n-1))
    x ~ MvNormalCanon(PDSparseMat(Q))
end

x = randn(10)
m = MvNormalCanonTest(x)
sample(m, NUTS(), 100)

TypeError: in Sparse, in Tv, expected Tv<:Union{Complex{Float64}, Float64}, got Type{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1}}
SuiteSparse.CHOLMOD.Sparse(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1},Int64}, ::Int64) at cholmod.jl:933
SuiteSparse.CHOLMOD.Sparse(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1},Int64}) at cholmod.jl:939
cholesky(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1},Int64}; kws::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at cholmod.jl:1458
cholesky at cholmod.jl:1458 [inlined]
PDSparseMat(::SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f#7"{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}},DynamicPPL.Sampler{NUTS{Turing.Core.ForwardDiffAD{40},(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}},Float64},Float64,1},Int64}) at pdsparsemat.jl:22
macro expansion at untitled-d3c70e9b18556a75b02f77b5bb73e87c:245 [inlined]
##evaluator#299(::Random._GLOBAL_RNG, ::DynamicPPL.Model{var"###evaluator#299",(:x,),Tuple{Array{Float64,1}},(),DynamicPPL.ModelGen{var"###generator#300",(:x,),(),Tuple{}}}, ::DynamicPPL.ThreadSafeVarInfo{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{ForwardDiff.Dual{ForwardDiff.Tag{Turing.Core.var"#f...

@jkbest2

jkbest2 commented Sep 16, 2020

This PR is probably relevant: JuliaDiff/ChainRules.jl#246. If I understand the AD machinery being used here (I don't), once sparse array support is added, rules for sparse Cholesky factorizations can be added that will enable differentiating through them. Unfortunately CHOLMOD doesn't know what Dual numbers are, so ForwardDiff AD fails.
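
A minimal reproduction of that failure outside of Turing looks roughly like this (the exact error text may differ across Julia versions):

using ForwardDiff, SparseArrays, LinearAlgebra

# logdet of a sparse AR(1)-style precision matrix via the CHOLMOD cholesky
f(ρ) = logdet(cholesky(spdiagm(-1 => ρ * ones(2), 0 => ones(3), 1 => ρ * ones(2))))

f(0.3)                          # fine with Float64
ForwardDiff.derivative(f, 0.3)  # TypeError: CHOLMOD's Sparse only accepts Float64/ComplexF64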

@ElOceanografo

Reading the stacktrace for my error above more carefully, I see the error is actually coming from the Sparse constructor, not the Cholesky decomposition, so hopefully JuliaDiff/ChainRules.jl#246 will solve the problem.

@ElOceanografo

I've done more investigating and think that it might be easier to implement a custom TuringMvNormalCanon type in this package, at least as long as Distributions uses PDMats. The issue is that even with a custom adjoint defined for the Cholesky decomposition, the rrule still has to calculate the factorization--which, as long as it is done via SuiteSparse.CHOLMOD, means it won't be able to handle dual numbers.

However, I recently found LDLFactorizations.jl, a pure-Julia port of the LDL decomposition from SuiteSparse--which can be used to calculate the Cholesky factors, even for non-float matrices. Using this approach, I've hacked together a custom MvNormalCanon implementation that works in my example Turing model above. @devmotion , does this sound like a worthwhile PR?
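
Roughly, the idea is a hand-rolled logpdf that only needs the log-determinant from the factorization. A sketch (not the actual implementation; the ldl call and the F.D accessor are assumptions based on my reading of the LDLFactorizations docs and may need adjusting):

using LDLFactorizations, SparseArrays, LinearAlgebra, ForwardDiff

# logpdf of a zero-mean MvNormal with sparse precision Q; AD-friendly because
# LDLFactorizations is pure Julia and accepts Dual element types
function logpdf_canon(Q, x)
    F = ldl(Q)                     # LDLᵀ factorization (assumed API)
    logdetQ = sum(log, diag(F.D))  # logdet(Q) from the D factor, valid for positive definite Q
    return 0.5 * (logdetQ - dot(x, Q, x) - length(x) * log(2π))
end

xobs = randn(10)

function loglik(θ)
    ρ = 0.5 * tanh(θ[1])
    n = length(xobs)
    Q = spdiagm(-1 => ρ * ones(n - 1), 0 => ones(n), 1 => ρ * ones(n - 1))
    return logpdf_canon(Q, xobs)
end

ForwardDiff.gradient(loglik, [1.0])  # differentiates without hitting CHOLMOD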

@devmotion
Member

The issue is that even with a custom adjoint defined for the Cholesky decomposition, the rrule still has to calculate the factorization--which, as long as it is done via SuiteSparse.CHOLMOD, means it won't be able to handle dual numbers.

I don't understand why there is a problem with the custom adjoint here - what is missing or broken in the definition of the adjoint in ChainRules (https://github.com/JuliaDiff/ChainRules.jl/blob/f63f385e7a8477c5112f0b2cf24790ca1abf6320/src/rulesets/LinearAlgebra/factorization.jl#L73)?

However, I recently found LDLFactorizations.jl, a pure-Julia port of the LDL decomposition from SuiteSparse--which can be used to calculate the Cholesky factors, even for non-float matrices.

Interesting, I didn't know this package. I wonder if it is problematic that it uses the LGPL license.

@ElOceanografo

In my example above, the precision matrix Q is defined in terms of the unknown parameter ρ, so when applying any autodiff routine to the model, Q ends up being a SparseMatrixCSC with elements of some dual number type. The error is coming from the PDSparseMat constructor trying to precalculate cholesky(Q) using CHOLMOD, not the actual calculation of the likelihood or its derivative.

The rrule seems to be defined correctly, but it still calls CHOLMOD if the matrix X is sparse (line 74 in the file you linked). My understanding of how rrules work is pretty superficial, so I may be wrong, but I think it would run into the same problem there, right?

Regarding the LGPL...I'm not a lawyer, but I believe it's okay to use LGPL libraries in non-GPL code, as long as the non-GPL code isn't a derivative work. SuiteSparse itself is under the LGPL, so if using LDLFactorizations causes license issues, they're at least not new ones...

@devmotion
Member

Ah OK, I think I understand now. I'm a bit surprised that ForwardDiff doesn't support cholesky of sparse matrices but I've never checked.

Did you try to use Zygote instead of ForwardDiff? It uses the adjoints in ChainRules and does not operate with dual numbers.
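
For the Turing model above, that would look roughly like this (setadbackend was the way to switch AD backends in Turing versions of that era):

using Turing, Zygote

Turing.setadbackend(:zygote)  # switch from the default ForwardDiff backend
sample(m, NUTS(), 100)        # m as defined in the earlier example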

@ElOceanografo

Zygote doesn't work either; it fails with a Mutating arrays is not supported error. Some of the linear algebra functions build up their results in place.

Traceback ```julia Mutating arrays is not supported error(::String) at error.jl:33 (::Zygote.var"#364#365")(::Nothing) at array.jl:58 (::Zygote.var"#2245#back#366"{Zygote.var"#364#365"})(::Nothing) at adjoint.jl:49 diag at cholmod.jl:1792 [inlined] (::typeof(∂(diag)))(::Array{Float64,1}) at interface2.jl:0 logdet at cholmod.jl:1801 [inlined] (::typeof(∂(logdet)))(::Float64) at interface2.jl:0 logdet at pdsparsemat.jl:55 [inlined] (::typeof(∂(logdet)))(::Float64) at interface2.jl:0 logdetcov at mvnormalcanon.jl:165 [inlined] (::typeof(∂(logdetcov)))(::Float64) at interface2.jl:0 mvnormal_c0 at mvnormal.jl:99 [inlined] (::typeof(∂(mvnormal_c0)))(::Int64) at interface2.jl:0 _logpdf at mvnormal.jl:127 [inlined] (::typeof(∂(_logpdf)))(::Int64) at interface2.jl:0 JuliaLang/julia#119 at multivariates.jl:262 [inlined] (::typeof(∂(λ)))(::Int64) at interface2.jl:0 JuliaLang/julia#1071 at broadcast.jl:140 [inlined] JuliaLang/julia#3 at generator.jl:36 [inlined] iterate at generator.jl:47 [inlined] collect(::Base.Generator{Base.Iterators.Zip{Tuple{Array{typeof(∂(λ)),1},FillArrays.Fill{Int64,1,Tuple{Base.OneTo{Int64}}}}},Base.var"#3#4"{Zygote.var"#1071#1078"}}) at array.jl:686 map at abstractarray.jl:2248 [inlined] (::Zygote.var"#1070#1077"{Tuple{UnitRange{Int64}},Val{2},Array{typeof(∂(λ)),1}})(::FillArrays.Fill{Int64,1,Tuple{Base.OneTo{Int64}}}) at broadcast.jl:140 JuliaLang/julia#3862#back at adjoint.jl:49 [inlined] (::Zygote.var"#150#151"{Zygote.var"#3862#back#1081"{Zygote.var"#1070#1077"{Tuple{UnitRange{Int64}},Val{2},Array{typeof(∂(λ)),1}}},Tuple{Tuple{Nothing,Nothing,Nothing},Tuple{}}})(::FillArrays.Fill{Int64,1,Tuple{Base.OneTo{Int64}}}) at lib.jl:191 JuliaLang/julia#1693#back at adjoint.jl:49 [inlined] broadcasted at broadcast.jl:1257 [inlined] (::typeof(∂(broadcasted)))(::FillArrays.Fill{Int64,1,Tuple{Base.OneTo{Int64}}}) at interface2.jl:0 JuliaLang/julia#556 at array.jl:243 [inlined] (::typeof(∂(#556)))(::Int64) at interface2.jl:0 JuliaLang/julia#41 at interface.jl:45 [inlined] JuliaLang/julia#557 at array.jl:244 [inlined] sum at reducedim.jl:720 [inlined] (::typeof(∂(sum)))(::Int64) at interface2.jl:0 loglikelihood at multivariates.jl:262 [inlined] (::typeof(∂(loglikelihood)))(::Int64) at interface2.jl:0 observe at context_implementations.jl:152 [inlined] observe at hmc.jl:557 [inlined] _tilde at context_implementations.jl:109 [inlined] tilde at context_implementations.jl:67 [inlined] (::typeof(∂(tilde)))(::Int64) at interface2.jl:0 tilde_observe at context_implementations.jl:89 [inlined] (::typeof(∂(tilde_observe)))(::Nothing) at interface2.jl:0 JuliaLang/julia#7 at untitled-80c47956c154a0947dfbe4a5c1a877ba:92 [inlined] (::typeof(∂(#7)))(::Nothing) at interface2.jl:0 macro expansion at model.jl:0 [inlined] _evaluate at model.jl:160 [inlined] (::typeof(∂(_evaluate)))(::Nothing) at interface2.jl:0 evaluate_threadsafe at model.jl:150 [inlined] (::typeof(∂(evaluate_threadsafe)))(::Nothing) at interface2.jl:0 Model at model.jl:94 [inlined] (::typeof(∂(λ)))(::Nothing) at interface2.jl:0 JuliaLang/julia#150 at lib.jl:191 [inlined] JuliaLang/julia#1693#back at adjoint.jl:49 [inlined] Model at model.jl:98 [inlined] f at ad.jl:165 [inlined] (::typeof(∂(λ)))(::Int64) at interface2.jl:0 (::Zygote.var"#41#42"{typeof(∂(λ))})(::Int64) at interface.jl:45 gradient_logp(::Turing.Core.ZygoteAD, ::Array{Float64,1}, 
::DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}, ::DynamicPPL.Model{var"#7#8",(:x,),(),(),Tuple{Array{Float64,1}},Tuple{}}, ::DynamicPPL.Sampler{NUTS{Turing.Core.ZygoteAD,(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}}, ::DynamicPPL.DefaultContext) at ad.jl:171 gradient_logp at ad.jl:83 [inlined] gradient_logp at ad.jl:83 [inlined] ∂logπ∂θ at hmc.jl:474 [inlined] ∂H∂θ at hamiltonian.jl:31 [inlined] phasepoint at hamiltonian.jl:69 [inlined] find_good_stepsize(::Random._GLOBAL_RNG, ::AdvancedHMC.Hamiltonian{AdvancedHMC.DiagEuclideanMetric{Float64,Array{Float64,1}},Turing.Inference.var"#logπ#55"{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Sampler{NUTS{Turing.Core.ZygoteAD,(),AdvancedHMC.DiagEuclideanMetric},Turing.Inference.SamplerState{DynamicPPL.VarInfo{NamedTuple{(:ρ,),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:ρ,Tuple{}},Int64},Array{Uniform{Float64},1},Array{DynamicPPL.VarName{:ρ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}},DynamicPPL.Model{var"#7#8",(:x,),(),(),Tuple{Array{Float64,1}},Tuple{}}},Turing.Inference.var"#∂logπ∂θ#54"{DynamicPPL.VarI... ```

@devmotion
Member

Sorry, it's not clear to me what exactly you tested here. Is it the Turing model above? That might still require some additional work, but I was just referring to the statements about AD of cholesky above - I assumed that this should work with sparse matrices since Zygote should use the custom adjoint defined in ChainRules.

@ElOceanografo

Yes, I just ran the Turing model above after setting the AD backend to Zygote... sorry for not being clearer. However, I also get the same error with a hand-coded likelihood function:

using Distributions, Zygote, SparseArrays, PDMats

const xobs = randn(10)

function loglik(θ)
    ρ = 0.5tanh(θ[1])
    n = length(xobs)
    Q = spdiagm(-1 => ρ * ones(n-1), 0 => ones(n), 1 => ρ * ones(n-1))
    d = MvNormalCanon(PDSparseMat(Q))
    return logpdf(d, xobs)
end

Zygote.gradient(loglik, [1.0])
# ERROR: Mutating arrays is not supported

@jkbest2

jkbest2 commented Nov 16, 2020

For what it's worth, the pullback and pushforward of cholesky with a sparse argument appear to work:

julia> n = 100
julia> ρ = 0.5
julia> Q = spdiagm(-1 => ρ * ones(n-1), 0 => ones(n), 1 => ρ * ones(n-1))

julia> pullback(cholesky, Q)
(SuiteSparse.CHOLMOD.Factor{Float64}
type:    LLt
method:  simplicial
maxnnz:  19
nnz:     19
success: true
, Zygote.var"#41#42"{Zygote.ZBack{ChainRules.var"#cholesky_pullback#1615"{SuiteSparse.CHOLMOD.Factor{Float64}}}}(Zygote.ZBack{ChainRules.var"#cholesky_pullback#1615"{SuiteSparse.CHOLMOD.Factor{Float64}}}(ChainRules.var"#cholesky_pullback#1615"{SuiteSparse.CHOLMOD.Factor{Float64}}(SuiteSparse.CHOLMOD.Factor{Float64}
type:    LLt
method:  simplicial
maxnnz:  19
nnz:     19
success: true
))))

julia> pushforward(cholesky, Q)
#8 (generic function with 1 method)

But trying to get the gradient of logdet with respect to a sparse matrix fails:

julia> Zygote.gradient(logdet, Q)
ERROR: MethodError: no method matching logabsdet(::SuiteSparse.UMFPACK.UmfpackLU{Float64,Int64})
Closest candidates are:
  logabsdet(::SymTridiagonal; shift) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/tridiag.jl:441
  logabsdet(::UnitUpperTriangular{T,S} where S<:AbstractArray{T,2}) where T at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/triangular.jl:2652
  logabsdet(::UnitLowerTriangular{T,S} where S<:AbstractArray{T,2}) where T at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/triangular.jl:2653
  ...
Stacktrace:
 [1] logabsdet(::SparseMatrixCSC{Float64,Int64}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/generic.jl:1573
 [2] adjoint at /home/jkbest/.julia/packages/Zygote/c0awc/src/lib/array.jl:378 [inlined]
 [3] _pullback at /home/jkbest/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:47 [inlined]
 [4] logdet at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/generic.jl:1596 [inlined]
 [5] _pullback(::Zygote.Context, ::typeof(logdet), ::SparseMatrixCSC{Float64,Int64}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface2.jl:0
 [6] _pullback(::Function, ::SparseMatrixCSC{Float64,Int64}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:38
 [7] pullback(::Function, ::SparseMatrixCSC{Float64,Int64}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:44
 [8] gradient(::Function, ::SparseMatrixCSC{Float64,Int64}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:53
 [9] top-level scope at REPL[22]:1

And if you use a function to create an AR(1) precision matrix and try to take the gradient of logdet with respect to ρ, you get the same error about mutating arrays.

julia> spchol(ρ; n = 100) = cholesky(spdiagm(-1 => ρ * ones(n-1), 0 => ones(n), 1 => ρ * ones(n-1)))
julia> Zygote.gradient(ρ -> logdet(spchol(ρ)), 0.5)
ERROR: Mutating arrays is not supported
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] (::Zygote.var"#364#365")(::Nothing) at /home/jkbest/.julia/packages/Zygote/c0awc/src/lib/array.jl:58
 [3] (::Zygote.var"#2245#back#366"{Zygote.var"#364#365"})(::Nothing) at /home/jkbest/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:49
 [4] diag at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/SuiteSparse/src/cholmod.jl:1792 [inlined]
 [5] (::typeof((diag)))(::Array{Float64,1}) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface2.jl:0
 [6] logdet at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/SuiteSparse/src/cholmod.jl:1801 [inlined]
 [7] (::typeof((logdet)))(::Float64) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface2.jl:0
 [8] #4 at ./REPL[20]:1 [inlined]
 [9] (::typeof((#4)))(::Float64) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface2.jl:0
 [10] (::Zygote.var"#41#42"{typeof((#4))})(::Float64) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:45
 [11] gradient(::Function, ::Float64) at /home/jkbest/.julia/packages/Zygote/c0awc/src/compiler/interface.jl:54
 [12] top-level scope at REPL[20]:1

I'm not sure if any of this is helpful, but maybe a Zygote issue would be more productive?

@devmotion
Member

For what it's worth, the pullback and pushforward of cholesky with a sparse argument appear to work:

That's exactly what I assumed and why I thought that for Zygote the problem mentioned above for ForwardDiff

The error is coming from the PDSparseMat constructor trying to precalculate cholesky(Q) using CHOLMOD, not the actual calculation of the likelihood or its derivative.

should not exist.

But trying to get the gradient of logdet with respect to a sparse matrix fails:

It seems logabsdet is not defined for sparse matrices (or their LU decompositions)?!

@ElOceanografo

It seems logabsdet is not defined for sparse matrices (or their LU decompositions)?!

Weird that we'd be the first people to notice this, but that seems to be the case. There are a few closed issues for other missing logabsdet methods, but not for SparseMatrixCSC. I've opened an issue about it: JuliaLang/LinearAlgebra.jl#788
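
As an interim workaround for plain (non-AD) evaluation, the log-determinant of a symmetric positive-definite sparse matrix is still available through its Cholesky factorization, e.g.:

using SparseArrays, LinearAlgebra

n = 100; ρ = 0.4
Q = spdiagm(-1 => ρ * ones(n - 1), 0 => ones(n), 1 => ρ * ones(n - 1))
logdet(cholesky(Q))  # CHOLMOD defines logdet for its Factor type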

Anyway, in the meantime...what do you think is the best way to move forward on this issue?

@ElOceanografo

Some progress on this: logabsdet will now work for sparse matrices (on Julia master, should be released in v1.7). Once the AD issues get sorted out, this issue can probably be closed.
