
Crosscat hyperprior grid on variance parameter is broader than it needs to be #90

Open
alxempirical opened this issue Mar 29, 2016 · 9 comments


@alxempirical
Contributor

Crosscat adopts a uniform hyperprior over the parameters of the Normal-Gamma prior distribution on the Normals (see footnote 1, p. 8). For the "variance" parameter, this is done by constructing a grid from roughly 0 to sum((x-x̄)^2) in [construct_continuous_specific_hyper_grid](https://github.com/probcomp/crosscat/blob/6dadb9b33f7111449d5daf5683a1eac6365431a4/cpp_code/src/utils.cpp#L434). The largest variance that makes sense for a sample {x}, though, is max((x-x̄)^2). Since this is a grid of only 31 elements, we're potentially losing a fair bit of precision here, and we may be able to tighten up convergence by tightening this bound.
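For concreteness, here's a rough sketch of the tightened grid. This is hypothetical code, not the actual `construct_continuous_specific_hyper_grid`; the function name is made up, and the 0.01 lower-bound factor just mirrors the current behavior (and is itself questioned below).

```cpp
// Hypothetical sketch: a 31-point log-spaced variance hyperparameter grid
// bounded above by max((x - x_bar)^2) instead of sum((x - x_bar)^2).
#include <algorithm>
#include <cmath>
#include <vector>

std::vector<double> variance_hyper_grid(const std::vector<double>& x,
                                        int n_grid = 31) {
    // Assumes x is non-empty and has at least two distinct values.
    double mean = 0.0;
    for (double xi : x) mean += xi;
    mean /= x.size();

    double max_sq_dev = 0.0;
    for (double xi : x)
        max_sq_dev = std::max(max_sq_dev, (xi - mean) * (xi - mean));

    // Log-spaced grid; the 0.01 factor mirrors the current lower bound.
    double lo = std::log(0.01 * max_sq_dev);
    double hi = std::log(max_sq_dev);
    std::vector<double> grid(n_grid);
    for (int i = 0; i < n_grid; ++i)
        grid[i] = std::exp(lo + (hi - lo) * i / (n_grid - 1));
    return grid;
}
```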

@riastradh-probcomp
Contributor

Sounds reasonable to me. What about the lower bound? The .01*sum((x-x̄)^2) we currently use seems pretty arbitrary to me -- surely there could be clusters with much smaller variance than that, very far away from one another.

@riastradh-probcomp
Contributor

You could specify 0 as a lower bound for log_linspace, and you would get a grid starting at the smallest positive normal floating-point number.
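To illustrate the clamping behavior, a minimal sketch, assuming a `log_linspace(lo, hi, n)`-style helper (the real `log_linspace` in crosscat may differ in detail):

```cpp
// Sketch: a zero lower bound is clamped to the smallest positive normal
// double before taking logs, so the grid starts around 2.2e-308.
#include <cmath>
#include <limits>
#include <vector>

std::vector<double> log_linspace_sketch(double lo, double hi, int n) {
    if (lo <= 0.0)
        lo = std::numeric_limits<double>::min();  // smallest positive normal double
    double log_lo = std::log(lo);
    double log_hi = std::log(hi);
    std::vector<double> out(n);  // assumes n >= 2
    for (int i = 0; i < n; ++i)
        out[i] = std::exp(log_lo + (log_hi - log_lo) * i / (n - 1));
    return out;
}
```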

@riastradh-probcomp
Contributor

I suppose the smallest distance between any pair would be a better starting point, but that takes O(n^2) time to compute.

@alxempirical
Contributor Author

I guess you can do it in O(n*log(n)) by sorting, since the closest pair will be adjacent in the sorted list.
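Something like this (a sketch; `min_pairwise_gap` is a made-up name):

```cpp
// Sketch of the O(n log n) one-dimensional closest-pair idea: sort,
// then scan adjacent elements for the smallest gap.
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

double min_pairwise_gap(std::vector<double> x) {  // by value: we sort a copy
    std::sort(x.begin(), x.end());
    double best = std::numeric_limits<double>::infinity();
    for (std::size_t i = 1; i < x.size(); ++i)
        best = std::min(best, x[i] - x[i - 1]);
    return best;  // infinity if fewer than two points
}
```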

@riastradh-probcomp
Contributor

Right -- I was inexplicably thinking of >1-dimensional spaces. That would probably be a reasonable thing to do, then.

@alxempirical
Contributor Author

I think you can probably even do closest points in high-dimensional spaces with a KD-tree.

@alxempirical
Contributor Author

Oh, there's a Wikipedia page about this exact problem.

@riastradh-probcomp
Contributor

Golly, my memory of computational geometry has rotted.

However, 0 may nevertheless be a reasonable place to start anyway -- for isolated outlying clusters we don't have a reasonable lower bound on their variance.

@alxempirical
Contributor Author

Yes, the current lower bound seems likely to cause a problem.
