Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add vectorize method for LKJCholesky #485

Merged
merged 30 commits into from
Jun 23, 2023
Merged

Add vectorize method for LKJCholesky #485

merged 30 commits into from
Jun 23, 2023

Conversation

harisorgn
Copy link
Member

@harisorgn harisorgn commented Jun 16, 2023

I opted for the simplest way to vectorize LKJCholesky samples, that is to keep the entire matrix with the extra zeroes, not just the triangular one that is sampled.

EDIT: Do you think it's worth keeping just the relevant triangular part?

test/utils.jl Outdated Show resolved Hide resolved
@github-actions
Copy link
Contributor

github-actions bot commented Jun 16, 2023

Pull Request Test Coverage Report for Build 5356891002

  • 5 of 8 (62.5%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.04%) to 76.408%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/utils.jl 5 8 62.5%
Totals Coverage Status
Change from base Build 5335170467: -0.04%
Covered Lines: 1927
Relevant Lines: 2522

💛 - Coveralls

@codecov
Copy link

codecov bot commented Jun 16, 2023

Codecov Report

Patch coverage: 62.50% and project coverage change: -0.05 ⚠️

Comparison is base (5f74696) 76.45% compared to head (c5b9ad9) 76.40%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #485      +/-   ##
==========================================
- Coverage   76.45%   76.40%   -0.05%     
==========================================
  Files          21       21              
  Lines        2514     2522       +8     
==========================================
+ Hits         1922     1927       +5     
- Misses        592      595       +3     
Impacted Files Coverage Δ
src/DynamicPPL.jl 100.00% <ø> (ø)
src/abstract_varinfo.jl 89.13% <ø> (ø)
src/utils.jl 78.40% <62.50%> (-0.63%) ⬇️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@harisorgn harisorgn requested a review from torfjelde June 16, 2023 13:32
@harisorgn
Copy link
Member Author

Oh wait, reconstruct will also need some new methods. Looking into it.

@yebai
Copy link
Member

yebai commented Jun 16, 2023

Thanks, @harisorgn.
CC @sethaxen who might have thoughts on this

@devmotion
Copy link
Member

Does this require to update the Bijectors compat entry?

@torfjelde
Copy link
Member

torfjelde commented Jun 16, 2023

Oh wait, reconstruct will also need some new methods. Looking into it.

That should already be covered by the implementation for MatrixDistribution:)

EDIT: Aaah nah, this is a Cholesky; my bad. You're right!

EDIT 2: But in this case we just need

reconstruct(::Distribution{CholeskyVariate}, val::Cholesky) = copy(val)

and you should be good:)

@torfjelde
Copy link
Member

Does this require to update the Bijectors compat entry?

You mean because the value that will hit reconstruct won't the type as implemented here? I believe it would simply error before, i.e. even if you fix this part but then use lower versions of Bijectors.jl, you'll just run into an error where Bijectors.jl complains about not knowing how to handle Cholesky.

So technically the change in this PR isn't dependent on fixing Bijectors.jl, but it's not particularly useful without the most recent version of Bijectors.jl 🤷

Personally don't have any strong opinions about this.

Copy link
Member

@torfjelde torfjelde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

Happy to merge once the following has been done:

  1. Added the `reconstruct impl mentioned above.
  2. Bump patch version so we can easily release.
  3. Maybe bump compat entry of Bijectors.jl to 0.12.7.

@sethaxen
Copy link
Member

If I understand this correctly, by including also the zero elements, this will cause the induced distribution to be improper. It really is quite important that in unconstrained space all parameters bijectively map to the target space (here Cholesky factors of correlation).

@torfjelde
Copy link
Member

torfjelde commented Jun 16, 2023

It shouldn't:) This is just a default reconstruct which is not hit when we're linking:

"""
maybe_invlink_and_reconstruct(vi::AbstractVarInfo, vn::VarName, dist, val)
Return reconstructed `val`, possibly invlinked if `istrans(vi, vn)` is `true`.
"""
function maybe_invlink_and_reconstruct(vi::AbstractVarInfo, vn::VarName, dist, val)
return if istrans(vi, vn)
invlink_and_reconstruct(vi, vn, dist, val)
else
reconstruct(dist, val)
end
end

@torfjelde
Copy link
Member

@harisorgn Actually, it would be nice with a test for a model using this!

@sethaxen
Copy link
Member

It shouldn't:) This is just a default reconstruct which is not hit when we're linking:

Ah ok, so this would e.g. be used when converting Cholesky to something that MCMCChains can use? I wonder if it makes more sense then to use vec(Matrix(r)) instead of vec(r.UL). While it's true the unique entries are contained in the triangle, and for diagnostics/statistics it might make more sense to analyze that triangle, when a user uses a CholeskyVariate, the parameter is a Cholesky, which behaves like Matrix(r) and is probably used by the user similarly to Matrix(r); could be confusing if when they go to analyze their draws they get something triangular.

@torfjelde
Copy link
Member

So this isn't necessarily related to how it's represented in the chain. This is mainly just about converting from the (usually) linearized representation in varinfo into the corresponding type that the distribution expects, which in this case is a Cholesky.

As for representing in a Chains, this will be a different thing (and probably something that goes in Turing rather than DPPL)

@harisorgn
Copy link
Member Author

harisorgn commented Jun 19, 2023

But in this case we just need

reconstruct(::Distribution{CholeskyVariate}, val::Cholesky) = copy(val)

But in general reconstruct gets called with val in the unconstrained space too, right? But we don't need to account for that case here because of the generic :

reconstruct(d::Distribution, val::AbstractVector) = reconstruct(size(d), val)

Just to check my understanding.

There is an issue even after adding this method. It seems that when ForwardDiff.Duals are passed through this simple model

using Turing 

@model function lkj_demo()
   R ~ LKJ(2,1) 
end
sample(lkj_demo(), HMC(0.05,10), 10)

it looks like differentiation wrt a univariate distribution with a single value and a single partial. Same thing when LKJCholesky is used instead of LKJ (but doesn't happen with other matrix-variates like InverseWishart). Does this have to do with the fact that these two distributions are not supported in DistributionsAD? Not well versed in Turing x AD issues 😅

EDIT: To be clear, because of the shape of the ForwardDiff.Dual passing through (length of 1), an error is thrown on the model above when the reshape of the generic reconstruct method (linked above) is hit.

@harisorgn
Copy link
Member Author

@harisorgn Actually, it would be nice with a test for a model using this!

Should this test be in Turing.jl instead, as an integration test? The issues I found in this PR were because I was trying to sample from a model using LKJ/LKJCholesky

@torfjelde
Copy link
Member

Ah yes, I completely forgot! My previous suggestion of just implementing reconstruct is not the way to go.

We need to implement

reconstruct(f::Inverse{VecCorrBijector}, dist::LKJ, val::AbstractVector) = val

Basically, now all of our linking and invlinking methods also includes a call to reconstruct(f, dist, val), e.g.

"""
invlink_and_reconstruct(dist, val)
invlink_and_reconstruct(vi::AbstractVarInfo, vn::VarName, dist, val)
Return invlinked and reconstructed `val`.
See also: [`reconstruct_and_link`](@ref), [`reconstruct`](@ref).
"""
invlink_and_reconstruct(f, dist, val) = f(reconstruct(f, dist, val))
function invlink_and_reconstruct(dist, val)
return invlink_and_reconstruct(invlink_transform(dist), dist, val)
end
function invlink_and_reconstruct(::AbstractVarInfo, ::VarName, dist, val)
return invlink_and_reconstruct(dist, val)
end

and so for the transformations which completely changes the type, we need to implement the corresponding reconstruct to work with the link and invlink.

Should this test be in Turing.jl instead, as an integration test?

Yeah, that's probably a good idea.

Copy link
Member

@torfjelde torfjelde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just reversing my previous apprve

@torfjelde
Copy link
Member

torfjelde commented Jun 19, 2023

E.g. the following is working nicely for me locally::

reconstruct(f::Inverse{Bijectors.VecCorrBijector}, dist::LKJ, val::AbstractVector) = copy(val)

EDIT: Note that sampling with NUTS, it still eventually breaks but this seems to be due to numerical instability.

EDIT 2: So LKJCholesky indeed works, even with NUTS:) It fails at the point of bundle_samples in Turing, but that is expected.

@harisorgn
Copy link
Member Author

harisorgn commented Jun 19, 2023

Note that sampling with NUTS, it still eventually breaks but this seems to be due to numerical instability.

There are some issues with positive definiteness when I use it in a more complicated model (EDIT: this is me trying to write a tutorial for LKJ/LKJCholesky).

So LKJCholesky indeed works, even with NUTS:) It fails at the point of bundle_samples in Turing, but that is expected.

Yes, same. The bundling issue I guess needs to be addressed elsewhere.

test/lkj.jl Outdated Show resolved Hide resolved
test/lkj.jl Outdated Show resolved Hide resolved
test/lkj.jl Outdated Show resolved Hide resolved
test/lkj.jl Outdated Show resolved Hide resolved
@harisorgn
Copy link
Member Author

harisorgn commented Jun 23, 2023

The tanh of the inverse link gets saturated outside [-1, 1].

This might have been part of the problem, but not the whole thing, as LKJ is using the same inverse transform internally and its test was much more accurate.

Now LKJCholesky tests match the accuracy of LKJ ones by building the correlation matrix from the sampled Cholesky factors and then taking the sample mean. Don't have a formal proof but this converges to the identity matrix (which is the target mean for both Cholesky factors and correlation matrices) much faster.

harisorgn and others added 4 commits June 23, 2023 10:12
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@harisorgn
Copy link
Member Author

@torfjelde good to go? : D

@yebai yebai enabled auto-merge June 23, 2023 12:23
@yebai yebai added this pull request to the merge queue Jun 23, 2023
@torfjelde torfjelde removed this pull request from the merge queue due to a manual request Jun 23, 2023
Copy link
Member

@torfjelde torfjelde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harisorgn Again, really awesome work:) It's going to be great to finally have this.

It looks good to me! Just a few minor suggestions, but these should be simple "commit suggestion", so I'll approve:)

Feel free to hit the merge button once you've gone of these few last comments 👍

test/lkj.jl Outdated Show resolved Hide resolved
test/lkj.jl Outdated Show resolved Hide resolved
test/lkj.jl Outdated Show resolved Hide resolved
test/lkj.jl Outdated Show resolved Hide resolved
test/lkj.jl Outdated Show resolved Hide resolved
harisorgn and others added 5 commits June 23, 2023 14:15
Co-authored-by: Tor Erlend Fjelde <tor.erlend95@gmail.com>
Co-authored-by: Tor Erlend Fjelde <tor.erlend95@gmail.com>
Co-authored-by: Tor Erlend Fjelde <tor.erlend95@gmail.com>
test/lkj.jl Outdated Show resolved Hide resolved
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@harisorgn
Copy link
Member Author

Feel free to hit the merge button once you've gone of these few last comments 👍

I'm not authorised to merge. @torfjelde @yebai feel free to do it

@yebai
Copy link
Member

yebai commented Jun 23, 2023

@harisorgn, you're a member of the TuringLang org, so you should have all the developer privileges. The setup of DynamicPPL prevents us from directly merging PRs; instead, you can click the "merge when ready" button. When CI tests pass, it will get automatically merged.

@torfjelde torfjelde enabled auto-merge June 23, 2023 15:43
@torfjelde
Copy link
Member

The excitement is tangible 👀

@torfjelde torfjelde added this pull request to the merge queue Jun 23, 2023
Merged via the queue into master with commit e6dd4ef Jun 23, 2023
@torfjelde torfjelde deleted the ho/vec_lkjcholesky branch June 23, 2023 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants