Support for Dirichlet distribution #485
Looks equivalent to the GSL implementation, except that the GSL has special handling for small values. See GSL source.
src/distributions/dirichlet.rs
Outdated
/// The dirichelet distribution `Dirichlet(alpha)`.
///
/// The Dirichlet distribution } is a family of continuous multivariate probability distributions parameterized by
Stray }
Copying a fancy description from Wikipedia doesn't really explain much, especially since the links are missing. Not that I have a better idea.
I like the Mathematica explanation a bit more than Wikipedia's.
src/distributions/dirichlet.rs
Outdated
for i in 0..alpha.len() {
    assert!(
        alpha[i] > 0.0,
        "Dirichlet::new called with `alpha` <= 0.0"
Don't think you need the quotes. Maybe just write `alpha[i] <= 0`, and `alpha.len() < 2` above.
src/distributions/dirichlet.rs
Outdated
/// Construct a new `Dirichlet` with the given alpha parameter
/// `alpha`. Panics if `alpha.len() < 2`.
#[inline]
pub fn new(alpha: &[f64]) -> Dirichlet {
I suggest you accept any vec-like thing here: `new<V: Into<Vec<f64>>>(alpha: V)` and then `let alpha = alpha.into();`. Drop the `to_vec()` later.
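A minimal sketch of that suggestion, assuming the struct stores `alpha` as a `Vec<f64>` as in this PR. The assertion messages follow the wording proposed earlier in this review; none of this is the merged code.

```rust
/// Stand-in for the struct under review; only the field used here is shown.
#[derive(Clone, Debug)]
pub struct Dirichlet {
    alpha: Vec<f64>,
}

impl Dirichlet {
    /// Accepts anything convertible into `Vec<f64>`,
    /// for example a `Vec<f64>` or a `&[f64]`.
    pub fn new<V: Into<Vec<f64>>>(alpha: V) -> Dirichlet {
        // Convert once; no `to_vec()` or second `into()` is needed afterwards.
        let alpha = alpha.into();
        assert!(alpha.len() > 1, "Dirichlet::new called with alpha.len() < 2");
        for &a in &alpha {
            assert!(a > 0.0, "Dirichlet::new called with alpha[i] <= 0");
        }
        Dirichlet { alpha }
    }
}
```

Passing an owned `Vec<f64>` moves it in without copying, while passing a slice clones the elements, so callers pay for a copy only when they keep their own data.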
src/distributions/dirichlet.rs
Outdated
/// Construct a new `Dirichlet` with the given shape parameter and size
/// `alpha`. Panics if `alpha <= 0.0`.
/// `size` . Panic if `size < 2`
This won't render well. If you want a list, leave a blank line, then prefix each item with `-` (it's Markdown). Otherwise just rewrite as two sentences.
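For illustration, the doc comment could be laid out as a Markdown list like this. The body is only a sketch: filling every entry with the same `alpha` is an assumption about what `new_with_param` is meant to do, and the parameter descriptions are likewise illustrative.

```rust
/// Stand-in for the struct under review.
#[derive(Clone, Debug)]
pub struct Dirichlet {
    alpha: Vec<f64>,
}

impl Dirichlet {
    /// Construct a new `Dirichlet` with the given shape parameter and size.
    ///
    /// - `alpha`: shape parameter used for every component.
    ///   Panics if `alpha <= 0.0`.
    /// - `size`: number of components. Panics if `size < 2`.
    pub fn new_with_param(alpha: f64, size: usize) -> Dirichlet {
        assert!(alpha > 0.0, "Dirichlet::new_with_param called with alpha <= 0");
        assert!(size > 1, "Dirichlet::new_with_param called with size < 2");
        // Assumption: a symmetric distribution, one copy of `alpha` per component.
        Dirichlet { alpha: vec![alpha; size] }
    }
}
```

The blank `///` line before the first `-` is what lets rustdoc render the items as a list instead of running them into the preceding sentence.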
src/distributions/dirichlet.rs
Outdated
#[inline]
pub fn new_with_param(alpha: f64, size: usize) -> Dirichlet {
    assert!(alpha > 0.0, "Dirichlet::new called with `alpha` <= 0.0");
    assert!(size > 1, "Dirichlet::new called with `size` <= 1");
This is `new_with_param`, not `new`. Again you can drop the extra quotes on parameter names.
src/distributions/mod.rs
Outdated
@@ -95,7 +95,7 @@
//! - [`ChiSquared`] distribution
//! - [`StudentT`] distribution
//! - [`FisherF`] distribution
//!
//! - Dirichlet distribution
The [`Dirichlet`] link is missing.
It's been added now, but `Dirichlet` seems to be mentioned twice in a row.
src/distributions/dirichlet.rs
Outdated
Dirichlet {
    alpha: alpha.to_vec(),
}
Dirichlet { alpha: a.into() }
You don't need `into()` again here; it's already a `Vec`.
src/distributions/dirichlet.rs
Outdated
    assert!(a[i] > 0.0);
}

Dirichlet { alpha: a.into() }
This bit. Not sure why my last comment went somewhere else. `a` is already your target type, so you don't need `.into()` again.
@dhardy addressed review comment.
Thanks @rohitjoshi! Looks good enough to me. Does anyone else want to comment before merging?
/// The dirichelet distribution `Dirichlet(alpha)`.
///
/// The Dirichlet distribution is a family of continuous multivariate probability distributions parameterized by
This line is a bit long. I think we usually wrap comments at 80 characters.
/// The dirichelet distribution `Dirichlet(alpha)`.
///
/// The Dirichlet distribution is a family of continuous multivariate probability distributions parameterized by
/// a vector alpha of positive reals. https://en.wikipedia.org/wiki/Dirichlet_distribution
The naked link will look weird in the docs. I think you can remove it; there is no precedent in Rand for linking to Wikipedia.
    }
}

impl Distribution<Vec<f64>> for Dirichlet {
I'm not sure our current distribution trait is well suited for multivariate distributions. It would be nice to sample without allocating, but this requires a different method. Something like `fn sample_multi(&self, &mut Rng, &mut [f64])`.
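A rough sketch of what such a buffer-filling method might look like. `sample_multi` is the hypothetical name from the comment above, not an existing Rand API. To keep the example free of dependencies, the Gamma draw is abstracted as a caller-supplied closure (an assumption made here); a real implementation would instead draw Gamma(alpha[i], 1) variates from the RNG.

```rust
/// Stand-in for the struct under review.
pub struct Dirichlet {
    alpha: Vec<f64>,
}

impl Dirichlet {
    /// Fill `out` with one sample, without allocating.
    ///
    /// `gamma_draw(shape)` is a hypothetical hook that should return a
    /// Gamma(shape, 1) variate. Panics if `out.len() != self.alpha.len()`.
    pub fn sample_multi<F: FnMut(f64) -> f64>(&self, mut gamma_draw: F, out: &mut [f64]) {
        assert_eq!(out.len(), self.alpha.len(), "output buffer has wrong length");
        let mut sum = 0.0;
        for (o, &a) in out.iter_mut().zip(self.alpha.iter()) {
            *o = gamma_draw(a);
            sum += *o;
        }
        // Normalising independent Gamma variates yields a Dirichlet sample.
        for o in out.iter_mut() {
            *o /= sum;
        }
    }
}
```

The caller owns the buffer, so repeated sampling can reuse one slice instead of allocating a fresh `Vec<f64>` per draw.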
This is why you opened #496. I agree. On the other hand, I'm not too fussed about having to make breaking changes to this distribution later (it's still better for users than not having it, and we're not close to 1.0).
#[derive(Clone, Debug)]
pub struct Dirichlet {
    /// Concentration parameters (alpha)
    alpha: Vec<f64>,
Thinking about it, we could probably use `alpha: [f64]` here. It makes the type "unsized" (i.e. users have to write `Box<Dirichlet>`) but is more flexible (potentially more optimal).
On the other hand it may not be worth it since it makes the type less ergonomic to use for what is probably not a lot of gain.
Another option would be `Dirichlet<N: usize> { alpha: [f64; N] }`, except I don't think Rust supports that yet (though it would also allow `sample(..) -> [f64; N]`, thus side-stepping @vks's concerns).
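Const generics have been stable since Rust 1.51, so the fixed-size variant can now be expressed. A minimal sketch, with `DirichletN` as a hypothetical name; `mean` is included only to show a method returning `[f64; N]` by value, and no sampling is implemented here.

```rust
/// Hypothetical fixed-size variant; the length is part of the type.
#[derive(Clone, Debug)]
pub struct DirichletN<const N: usize> {
    alpha: [f64; N],
}

impl<const N: usize> DirichletN<N> {
    pub fn new(alpha: [f64; N]) -> Self {
        assert!(N > 1, "DirichletN::new called with N < 2");
        for &a in alpha.iter() {
            assert!(a > 0.0, "DirichletN::new called with alpha[i] <= 0");
        }
        DirichletN { alpha }
    }

    /// Mean of the distribution, `alpha[i] / sum(alpha)`, returned as a
    /// fixed-size array with no heap allocation.
    pub fn mean(&self) -> [f64; N] {
        let sum: f64 = self.alpha.iter().sum();
        let mut out = self.alpha;
        for x in out.iter_mut() {
            *x /= sum;
        }
        out
    }
}
```

A `sample` method could return `[f64; N]` in the same way, at the cost of requiring the dimension to be known at compile time.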
Do our distributions even work when you write `Box<Dirichlet>`?
`Rng::sample` won't of course, but `Distribution::sample` will. Either way it's not really a great choice (less convenient for users).
I think it is just time to merge this PR?
@@ -81,7 +81,6 @@
//! - Related to real-valued quantities that grow linearly
//!   (e.g. errors, offsets):
//!   - [`Normal`] distribution, and [`StandardNormal`] as a primitive
//!   - [`Cauchy`] distribution
@rohitjoshi did you intend to move the reference to Cauchy in this documentation? Because you haven't added it back.
Maybe @pitdicker. It still needs a couple of doc tweaks, but of course any of us can do that too.
Added support for the Dirichlet distribution.
Fixes #400