Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace the Sample and IndependentSample traits with Distribution. #27

Conversation

GrahamDennis
Copy link
Contributor

The Sample trait was always being implemented by calling through to IndependentSample.ind_sample, so no replacement for the state-mutating trait (formerly known as Sample) has been provided.

The only situation I can imagine needing a state-mutating trait like the old Sample is if you are simulating drawing objects from a bag without replacement.

This is a breaking change for users of this library.

@rust-highfive
Copy link

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @huonw (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see CONTRIBUTING.md for more information.

@huonw
Copy link
Contributor

huonw commented Mar 15, 2015

Thanks for the pull request!

I'm definitely unhappy with the design of distributions at the moment, but I'm unsure about this. There are a few other examples of stateful distributions, e.g. just off the top of my head: a random walk, a (discretised) Wiener process, or a "Tetris" RNG (although, this is essentially just sampling without replacement with a small twist).

I'll sleep on it.

@GrahamDennis
Copy link
Contributor Author

I figure we have two things here: "random distributions", which is what was called IndependentSample, and "random processes", which was called Sample. For "random processes" we have both infinite random processes like random walks, and finite random processes like drawing items from a bag without replacement.

All of these generate a stream of values, and could be replaced with the Iterator trait if all implementing structs had an owned Rng. Now that I think about it, I like that idea as it gives you access to all of the Iterator machinery for operating on streams of random values. The downside of this approach is a reduction in flexibility as the Rng must be specified when the random stream is constructed.

An alternative approach would be to implement a method for random distributions / processes that consumes an Rng and returns an iterator.

How about:

trait RandomProcess<Support> {
    type ProcessIterator : Iterator<Item=Support>;
    fn iter<R: Rng>(&self, rng: R) -> ProcessIterator;
}

or perhaps we should be adding a new method on Rng like gen_iter:

trait Rng {
    fn gen_distribution_iter<'a, D, T>(&'a mut self, distribution: &'a D) -> Iterator<Item=T>
        where D: RandomDistribution<T> { ... }
    fn gen_process_iter<'a, P, T>(&'a mut self, process: &'a mut P) -> Iterator<Item=T>
        where P: RandomProcess<T> { ... }
}

We could then provide a blanket impl for RandomProcess<T> for all RandomDistribution<T>, and also a blanket impl for RandomDistribution<T> for all Rand.

@GrahamDennis GrahamDennis force-pushed the feature/independent-sample-becomes-sample branch 2 times, most recently from 106ad0e to d1b3a18 Compare March 21, 2015 07:12
@GrahamDennis
Copy link
Contributor Author

I've just pushed a bunch of commits for feedback. I need to update the documentation, but let me know if you're OK with this approach.

I've replaced Sample and IndependentSample with Distribution, which is equivalent to IndependentSample, and describes a probability distribution.

I've added sample and sample_iter methods to the Rng trait so that you can sample a single value or get a stream of values from a Distribution.

I can't implement Rng.gen_iter in terms of sample_iter without making sample_iter take an owned Distribution<T>, so I left Rng.gen_iter with a separate implementation.

@GrahamDennis GrahamDennis force-pushed the feature/independent-sample-becomes-sample branch from d1b3a18 to 38259ee Compare March 22, 2015 06:49
@GrahamDennis GrahamDennis changed the title Replace the Sample trait with IndependentSample. Replace the Sample and IndependentSample traits with Distribution. Mar 22, 2015
@GrahamDennis
Copy link
Contributor Author

I've just a change to make Distribution have an associated type instead of being generic over what is really an output trait.

type Output = T;

fn sample<R: Rng>(&self, _: &mut R) -> T {
let ConstantDistribution(ref value) = *self;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be self.0.clone(): the fields of tuple structs (and tuples) can be accessed positionally.

@huonw
Copy link
Contributor

huonw commented Mar 22, 2015

I like the changes (and agree wrt to the distinction between distributions and processes).

There's one slight sticking point, though. We're planning to pull parts of rand back into std. Part of the current design is to make Rand incorporate some aspects of Distribution, you can see initial thoughts along these lines in https://github.com/rust-lang/rfcs/pull/722/files#diff-4b581bede75dd1010402640c959e8c9fR164 , and later versions https://gist.github.com/aturon/d53613d1c0cce1297bf1 and http://huonw.github.io/rand-sketch/rand-sketch/ (offered without explanation for now because I don't want to leave you hanging any longer, I'll come back with more details/context in a bit). The only real change I'd like to see to this patch along those lines is removing the changes to Rng to keep distributions as separate as possible.

I'm very interested to hear whatever thoughts you have though (even if you can't make complete sense of the links above, since it was written as part of an ongoing discussion).

@huonw
Copy link
Contributor

huonw commented Mar 22, 2015

I'll come back with more details/context in a bit

Unfortunately, I'm not going to be able to do this tonight: still working on a more detail sketch of the proposal in @aturon's gist.

@huonw
Copy link
Contributor

huonw commented Mar 22, 2015

(Oh, I forgot to link to https://gist.github.com/huonw/98252a92111c27d7e3e2 also.)

@GrahamDennis
Copy link
Contributor Author

Thanks for the feedback, I have to head off to work, I'll try to take a look at this this evening.

@GrahamDennis GrahamDennis force-pushed the feature/independent-sample-becomes-sample branch from baea54a to c73ab02 Compare March 23, 2015 10:59
@GrahamDennis
Copy link
Contributor Author

I've pushed an updated which removes the changes to Rng. This just removes Sample and IndependentSample and replaces them with Distribution.

I still haven't had a chance to read through the design for rand in std. I'd like to, and I'll try tomorrow night.

@GrahamDennis
Copy link
Contributor Author

I'm not sure the best place to comment on your proposals, but how about this:

/// Type that can be used to create a random instance of `Output`.
///
/// Since no state is recorded, each sample is (statistically)
/// independent of all others, assuming the `Rng` used has this
/// property.
pub trait Distribution {
    type Output;
    /// Generate a random value of `Output`, using `rng` as the
    /// source of randomness.
    fn sample<R: Rng>(&self, rng: &mut R) -> <Self as Distribution>::Output;
}

/// This, together with `NewRandom` below is the replacement trait for `Rand`.  Instead of having `Rand::rand(&mut rng) -> T`, you now call
/// `T::default_distribution().sample(&mut rng)`.  This can be simplified to `NewRandom::rand(&mut rng) -> T`.
pub trait DefaultDistribution {
    type OutputDistribution: Distribution<Output=Self>;

    fn default_distribution() -> <Self as DefaultDistribution>::OutputDistribution;
}

pub trait NewRandom {
    fn rand<R: Rng>(rng: &mut R) -> Self;
}

impl <T> NewRandom for T where T: DefaultDistribution
{
    fn rand<R: Rng>(rng: &mut R) -> Self {
        T::default_distribution().sample(rng)
    }
}

/// An example implementation of DefaultDistribution that uses the existing `Rand` trait via the `RandDistribution` struct.
impl <T: Rand> DefaultDistribution for T {
    type OutputDistribution = RandDistribution<T>;

    fn default_distribution() -> RandDistribution<T> {
        RandDistribution::new()
    }
}

This allows you to precompute constants for generating values in a range, by just having the existing Range struct (although I'd probably rename it to UniformRangeDistribution). And you can then have default methods on the Rng trait like:

trait Rng {
    fn gen<D: Distribution>(&mut self, distribution: &D) -> <D as Distribution>::Output;
    fn gen_iter<D: Distribution>(&mut self, distribution: &D) -> GenIter<'a, Self, D>;
}

If you want to be able to call Rng.gen(1..4) and Rng.gen_iter(1..4), then maybe we can extend the above with a IntoDistribution trait which can convert things like the Range, RangeFrom and RangeTo structs into actual Distribution objects.

The advantage of explicitly generating Distribution objects is that we can do whatever precomputation we need to to make random number generation more efficient.

@GrahamDennis
Copy link
Contributor Author

(Ping @huonw)

@GrahamDennis GrahamDennis force-pushed the feature/independent-sample-becomes-sample branch from c73ab02 to 6a6be06 Compare March 23, 2015 13:44
@huonw
Copy link
Contributor

huonw commented Mar 23, 2015

@GrahamDennis interestingly, I think that is quite similar to @aturon's stream proposal (https://gist.github.com/aturon/d53613d1c0cce1297bf1). I also think some of the differences are valuable improvements (particularly using an associated type for Output), and the others are definitely worth considering.

Thanks for taking the time to take a look!

(It's late here, so I'll take a (hopefully) final look at this particular patch tomorrow.)

@GrahamDennis GrahamDennis force-pushed the feature/independent-sample-becomes-sample branch from 6a6be06 to b892802 Compare March 27, 2015 21:39
@GrahamDennis
Copy link
Contributor Author

Ping @huonw! I've rebased my work on the latest master.

I'm also keen to be involved in the redesign of Rand / Random and friends. Please let me know what I should be reading / contributing to.

You can see a basic implementation of what I'm thinking of here: GrahamDennis/rand@master...GrahamDennis:feature/random-experiment It lets you do rng.sample(..), rng.sample_iter(1..58), or rng.sample(Normal::new(0.0, 1.0)). In particular rng.sample_iter(1..58) creates a distribution object once and reuses it for every subsequent sample from the iterator.

@GrahamDennis GrahamDennis force-pushed the feature/independent-sample-becomes-sample branch from b892802 to 7336ee7 Compare March 29, 2015 21:37
Also removed the `Sample` trait.

`IndependentSample` is really a probability distribution of values, and hence renamed to `Distribution`, but with an associated `Output` type instead of being generic over a type.  The `Sample` trait is a state-mutating version of `IndependentSample` and is really a random process.  A random process can be infinite (e.g. a random walk) or finite (e.g. drawing from a bag without replacement).  So really, a random process is best represented by an `Iterator`.  Additionally all previous implementations of `Sample` simply called through to `IndependentSample.ind_sample`, so `Sample` has simply been removed.

The related trait `SampleRange` has been renamed to `RangeDistribution`.
@GrahamDennis GrahamDennis force-pushed the feature/independent-sample-becomes-sample branch from 7336ee7 to e6e78bd Compare April 11, 2015 09:39
@bluss bluss mentioned this pull request Aug 16, 2015
@dhardy
Copy link
Member

dhardy commented Jul 24, 2017

@huonw @alexcrichton what's the story on this? Random numbers are quite an important library feature, and this issue has gone two years without comment. I would be happy to help on this, but it looks like @GrahamDennis has already proposed several good changes.

Specifically, I agree with Graham that random walks and the tetris sampler are not distributions, thus there is no real need for a mutating sampler. (A random walk at its simplest is adding an independent sample to a position variable, with variations such as Lévy flight; the tetris sampler is really just randomly sorting a list, popping elements, then repeating. I don't see why such things deserve an implementation here.)

I'm not sure whether pulling RNGs into std is a good idea; there would still need to be a crate implementing less common generators and distributions.

@dhardy
Copy link
Member

dhardy commented Feb 20, 2018

Closing in favour of #256.

@dhardy dhardy closed this Feb 20, 2018
dhardy added a commit that referenced this pull request Feb 20, 2018
This is heavily inspired by #27 by @GrahamDennis but simpler trait and
maintains backwards compatibility with deprecations.
pitdicker pushed a commit that referenced this pull request Apr 4, 2018
This is heavily inspired by #27 by @GrahamDennis but simpler trait and
maintains backwards compatibility with deprecations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-API Breakage: API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants