
Full update of weighted index by assigning weights #1194

Closed

Conversation

@SuperFluffy (Author)

I need to update my weighted indices inside a hot loop. Instead of reconstructing the entire index from scratch, this commit allows updating the inner cumulative weights in place using a slice of weights via `WeightedIndex::assign_weights`. `WeightedIndex::assign_weights_unchecked` is also provided for those cases where the user promises that all weights are valid and their sum exceeds zero.

Open questions

How should we handle a partial update? If assignment fails during `WeightedIndex::assign_weights`, the index can be left in a partially updated, undefined state. The method's doc comment notes this but does not go further than that. I see the following ways to handle it:

  1. Leave things as they are. Users of `assign_weights` read the documentation and will keep this caveat in mind.
  2. Roll back the changes using e.g. `SubAssign`. This would probably require keeping the old cumulative weights around, which implies an extra allocation in the function body, which goes against the point of this new feature.
  3. Add a field `has_errored: bool` to `WeightedIndex`, initialized to `false`. If an error is encountered during assignment, set it to `true`. If `WeightedIndex` is used for sampling with `has_errored == true`, panic (see the sketch after this list). I don't recall where I have seen this, but I believe this poisoning pattern is even used somewhere in the standard library.
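
A minimal sketch of option 3, assuming a simplified stand-in type (the names below are hypothetical, not rand's actual internals):

struct PoisonableIndex {
    cumulative_weights: Vec<f64>,
    has_errored: bool,
}

impl PoisonableIndex {
    fn assign_weights(&mut self, weights: &[f64]) -> Result<(), &'static str> {
        let mut total = 0.0;
        for (w, c) in weights.iter().zip(self.cumulative_weights.iter_mut()) {
            if !(*w >= 0.0) {
                self.has_errored = true; // poison: later sampling will panic
                return Err("invalid weight");
            }
            total += *w;
            *c = total;
        }
        self.has_errored = false;
        Ok(())
    }

    fn sample_index(&self) -> usize {
        assert!(!self.has_errored, "index poisoned by a failed weight update");
        0 // binary search over cumulative_weights elided
    }
}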

@dhardy (Member) left a comment:

Thanks for the PR!

There are a few comments below. Likely the new constructor should be updated slightly too.

Could we have some benchmarks, please, comparing (1) replacing with a new instance, (2) `assign_weights`, and (3) `assign_weights_unchecked`?

@@ -130,6 +130,72 @@ impl<X: SampleUniform + PartialOrd> WeightedIndex<X> {
})
}

/// Updates all weights by recalculating the index, without changing the number of weights.
///
/// **NOTE:** if `weights` contains invalid elements (for example, `f64::NAN` in the case of
Member:

NaN will fail the `w >= &zero` check. What wouldn't be caught is +inf (or a sum of weights which overflows to +inf). Possibly we should add a check for this (`total_weight.is_finite()`).

Despite this comment, the cases which are caught are identical to those of the `new` constructor. Possibly both need updating.
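
For floats, the suggested check could be as small as the following sketch (assuming `f64` weights and reusing the existing `WeightedError::InvalidWeight` variant; this is an illustration, not the actual patch):

use rand::distributions::WeightedError;

fn check_total(weights: &[f64]) -> Result<f64, WeightedError> {
    let total: f64 = weights.iter().sum();
    if !total.is_finite() {
        // rejects +inf weights, sums overflowing to +inf, and NaN
        return Err(WeightedError::InvalidWeight);
    }
    Ok(total)
}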

@SuperFluffy (Author), Oct 19, 2021:

Yeah, I noticed that as well. There is also `is_normal()`, but subnormal values are probably not of concern.

The bigger problem, however, is that at the moment `WeightedIndex` is valid for all `X: SampleUniform`, which includes integers, while `is_finite` only applies to floats.

Member:

Good point... I don't think we have any way of dealing with this.

The debug asserts in `impl UniformSampler for UniformFloat<$ty>` will catch this, but it doesn't seem ideal.

/// partially updated index is undefined. It is the user's responsibility to not sample from
/// the index upon encountering an error. The index may be used again after assigning a new set
/// of weights that do not result in an error.
pub fn assign_weights(&mut self, weights: &[X]) -> Result<(), WeightedError>
Member:

`new` takes an iterator while this method takes a slice, which is inconsistent. There's no real reason we can't use an iterator here (this should be benchmarked, but I suspect performance will be very similar).

@vks do you think we should use an iterator for consistency?

But if we do, we have an additional choice: require `ExactSizeIterator` or just test that we finish with the right length? I think I favour using `ExactSizeIterator`, but I haven't thought a lot about it (it's also the more restrictive choice: we could potentially switch away from it later if required).
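
A sketch of what the `ExactSizeIterator` variant could look like, simplified to `f64` and a plain `Vec` (the real method is generic over `X: SampleUniform`; the names here are illustrative):

fn assign_from_iter<I>(cumulative: &mut Vec<f64>, weights: I) -> Result<(), &'static str>
where
    I: IntoIterator,
    I::IntoIter: ExactSizeIterator<Item = f64>,
{
    let iter = weights.into_iter();
    if iter.len() != cumulative.len() {
        return Err("length mismatch"); // known up front, before any mutation
    }
    let mut total = 0.0;
    for (w, c) in iter.zip(cumulative.iter_mut()) {
        total += w;
        *c = total;
    }
    Ok(())
}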

Author:

Yes, that's a good idea, as it's strictly more general. Since I am zipping internally anyway, this should not change anything with regard to slices and vectors.

Author:

Implemented the change, but I'm leaving the conversation open because of the question.

Collaborator:

I think using an iterator makes sense, unless the slice optimizes better.

@SuperFluffy (Author):

I have implemented the changes, including removing the `_unchecked` version, because IMO it is no longer necessary now that the validity check has been moved outside the loop.

I could not act on the `is_finite()` suggestion, even though I agree with it, as it requires specialization or an extra trait. It would simply always evaluate to `true` for integers and delegate to `{f32,f64}::is_finite` for floats. Should I do that?

Benchmarks show that assignment via an exact-size iterator gives a nice little speedup, roughly 3x to 5x:

test weighted_index_assignment       ... bench:          27 ns/iter (+/- 0)
test weighted_index_assignment_large ... bench:         395 ns/iter (+/- 10)
test weighted_index_creation         ... bench:          97 ns/iter (+/- 0)
test weighted_index_creation_large   ... bench:       2,079 ns/iter (+/- 18)
test weighted_index_modification     ... bench:          26 ns/iter (+/- 0)

@dhardy (Member) left a comment:

Excellent.

I'll let @vks take a look before merging.

@dhardy (Member) commented Oct 19, 2021:

I could not act on the `is_finite()` suggestion, even though I agree with it, as it requires specialization or an extra trait. It would simply always evaluate to `true` for integers and delegate to `{f32,f64}::is_finite` for floats. Should I do that?

A custom trait would be the right choice, but overall I think it's better to leave this as it is: a public trait adds another complication to the API for little gain, while a private (sealed) trait restricts usage to `std` types.

Note that for integer types there's already a check in debug builds: overflow when the sum gets too large. Again, this is not ideal, but whether it is worth checking for overflow is questionable.

@SuperFluffy (Author) commented Oct 19, 2021:

I think we haven't addressed what to do with the index if assignment fails mid-update.

Right now it's going to be filled with garbage if it encounters NaN, so one probably shouldn't sample from it. :-D

@vks (Collaborator) commented Oct 19, 2021:

A partial update could also be handled by setting the length of the weights to zero; this should make all other calls panic.

@dhardy (Member) commented Oct 19, 2021:

If any of the weights is NaN or +inf, then `total_weight` will be NaN or +inf, and then the sampler, `X::Sampler::new(zero, total_weight.clone())`, will have a NaN/inf range, resulting in an assertion failure in `Sampler::new`.

Not the most elegant handling, but still sufficient in my opinion. Perhaps the docs should mention that a NaN/inf weight will result in a panic.

@SuperFluffy (Author) commented Oct 19, 2021:

If any of the weights is NaN or +inf, then `total_weight` will be NaN or +inf, and then the sampler, `X::Sampler::new(zero, total_weight.clone())`, will have a NaN/inf range, resulting in an assertion failure in `Sampler::new`.

Not the most elegant handling, but still sufficient in my opinion. Perhaps the docs should mention that a NaN/inf weight will result in a panic.

That's not actually the case. :-( We are returning early, so `X::Sampler::new` is never hit.

The sampler, as well as the total weight stored in the weighted index itself, are not updated until the very end of the function.

@dhardy (Member) commented Oct 19, 2021:

Ugh; you're right. We could simply not return early (using a result variable). That way, on error, the weights get updated, the sampler doesn't, and the method panics.

A panic is still not ideal in this context, but it's what `Sampler::new` does. Most other distributions don't: see #581 / #770. I had hoped to be past making significant breaking changes like this by now, but it seems worth considering.

In the meantime, perhaps the right thing to do here is to use a sealed trait (a `pub` trait in a private module) to enable the checks we need. In the future we may be able to drop the trait, which won't be a breaking change. Caveat: using the fixed bounds-detection in `new` is a breaking change (it restricts the compatible types), so that change should be left until later.
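
A sketch of the sealed-trait idea (all names here are hypothetical, not an agreed API; the point is that the trait is publicly nameable but only implementable inside the crate):

mod private {
    pub trait Sealed {}
}

pub trait FiniteWeight: private::Sealed {
    fn is_finite_weight(&self) -> bool;
}

macro_rules! impl_int {
    ($($t:ty),*) => {$(
        impl private::Sealed for $t {}
        impl FiniteWeight for $t {
            fn is_finite_weight(&self) -> bool { true } // integers are always finite
        }
    )*}
}
impl_int!(u8, u16, u32, u64, usize);

macro_rules! impl_float {
    ($($t:ty),*) => {$(
        impl private::Sealed for $t {}
        impl FiniteWeight for $t {
            fn is_finite_weight(&self) -> bool { self.is_finite() } // rejects NaN and ±inf
        }
    )*}
}
impl_float!(f32, f64);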

@dhardy (Member) commented Oct 19, 2021:

If #1195 is implemented we can avoid the need for an extra trait bound. For now I suggest getting this PR merged without depending on that, however.

@SuperFluffy (Author):

I need further clarification. The issues we are discussing are somewhat orthogonal, and I think the sealed-trait part warrants its own PR.

  1. @vks suggested clearing the cumulative weights (setting their length to zero) if a problem is encountered mid-update. This will cause the index to panic if it's used thereafter. I am leaning towards this solution.
  2. The sealed trait @dhardy is suggesting is for the `f64::is_finite` issue (please correct me if I am wrong). But that change touches not only the new `assign_weights` method but also the general initialization of `WeightedIndex`. As such, it should be done in a separate PR.

@dhardy Did I understand 2. correctly, or did you want me to write a completely different trait?

@dhardy (Member) commented Oct 21, 2021:

@SuperFluffy you are concerned with the state of `WeightedIndex` after returning an error? I was merely going to document this, but sure, clearing the weights to cause a panic in this case is an extra precaution.

@SuperFluffy (Author) commented Oct 21, 2021:

It turns out that, the way `Distribution` is implemented for `WeightedIndex`, nothing actually happens if `cumulative_weights.len() == 0`: the binary search will simply return 0 in the error position.

I am now:

  1. explicitly clearing the cumulative weights before assignment happens, and clearing them again if an error is encountered during the loop;
  2. asserting in `Distribution::sample` that `cumulative_weights.len() > 0`.

`assign_weights` was renamed to `assign_new_weights`, and it now allows setting an arbitrary number of new weights, because they are pushed into the index anyway. A sketch of the resulting behaviour follows.
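
A minimal sketch of that behaviour, using a simplified `f64`-only stand-in type (field and method names mirror the discussion, not the exact patch):

struct Index {
    cumulative_weights: Vec<f64>,
}

impl Index {
    fn assign_new_weights(&mut self, weights: &[f64]) -> Result<(), &'static str> {
        self.cumulative_weights.clear(); // invalidate up front
        let mut total = 0.0;
        for &w in weights {
            if !(w >= 0.0) {
                self.cumulative_weights.clear(); // leave the index empty on error
                return Err("invalid weight");
            }
            total += w;
            self.cumulative_weights.push(total);
        }
        Ok(())
    }

    fn sample_index(&self) -> usize {
        // turns the silent "binary search returns 0" case into a loud failure
        assert!(!self.cumulative_weights.is_empty());
        0 // binary search over cumulative_weights elided
    }
}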

If you are happy with these changes I can squash the commits.

@SuperFluffy (Author) commented Oct 21, 2021:

The newest changes were not good for perf:

test weighted_index_assignment       ... bench:          50 ns/iter (+/- 1)
test weighted_index_assignment_large ... bench:       2,048 ns/iter (+/- 26)
test weighted_index_creation         ... bench:          99 ns/iter (+/- 0)
test weighted_index_creation_large   ... bench:       2,069 ns/iter (+/- 9)
test weighted_index_modification     ... bench:          26 ns/iter (+/- 0)

@SuperFluffy (Author) commented Oct 21, 2021:

I'm again enforcing that the new weights have the same length as the old ones, so the weights iterator and the cumulative weights can be zipped in lockstep:

test weighted_index_assignment       ... bench:          26 ns/iter (+/- 0)
test weighted_index_assignment_large ... bench:         394 ns/iter (+/- 15)
test weighted_index_creation         ... bench:          99 ns/iter (+/- 8)
test weighted_index_creation_large   ... bench:       2,094 ns/iter (+/- 66)
test weighted_index_modification     ... bench:          26 ns/iter (+/- 0)

@dhardy (Member) left a comment:

There are four options:

  1. Use `std::panic::catch_unwind`. This is "not recommended for a general try/catch mechanism" and comes with various warnings, but may be okay.
  2. Change `Sampler::new` to return `Result` on error — but this is beyond the scope of this PR: Error handling of distributions::Uniform::new #1195.
  3. Use another trait bound to let us directly check the weight is finite. This fixes the specific case of `f32`/`f64` but not necessarily user-extensions to the `Uniform` distribution.
  4. Simply state that if the method fails, results of sampling the distribution are undefined (within certain bounds).

return Err(WeightedError::AllWeightsZero);
};

self.weight_distribution = X::Sampler::new(zero, total_weight.clone());
Member:

There's still a problem: this panics if `total_weight` is +inf, and we don't catch panics.

Author:

Right, but `WeightedIndex::new` suffers from the same issue here. So if we want to address this, both `assign_new_weights` and `new` should be changed in a new PR, I think.

@SuperFluffy (Author):

There are four options:

1. Use `std::panic::catch_unwind`. This is "not recommended for a general try/catch mechanism" and comes with various warnings, but may be okay.

2. Change `Sampler::new` to return `Result` on error — but this is beyond the scope of this PR: [Error handling of distributions::Uniform::new #1195](https://github.com/rust-random/rand/issues/1195).

3. Use another trait bound to let us directly check the weight is finite. This fixes the specific case of `f32`/`f64` but not necessarily user-extensions to the `Uniform` distribution.

4. Simply state that if the method fails, results of sampling the distribution are undefined (within certain bounds).

Alright, I went with 4., mentioning that the results of sampling the distribution are undefined.

Regarding your other comment about `total_weight` being +inf: I think this should be part of another PR, fixing it for both `new` and `assign_new_weights`.

@dhardy (Member) left a comment:

Okay, I think I'm happy with this now, but I'll let @vks take another look before merging.

@SuperFluffy (Author) commented Oct 21, 2021:

@vks Addressed all your points and squashed.

@vks (Collaborator) left a comment:

Thanks! The new code is unfortunately not compatible with Rust 1.36:

error[E0277]: `[f64; 4]` is not an iterator
   --> src/distributions/weighted_index.rs:475:29
    |
475 |             let mut distr = WeightedIndex::new([1.0f64, 2.0, 3.0, 0.0]).unwrap();
    |                             ^^^^^^^^^^^^^^^^^^ borrow the array with `&` or call `.iter()` on it to iterate over it
    |
    = help: the trait `core::iter::Iterator` is not implemented for `[f64; 4]`
    = note: arrays are not iterators, but slices like the following are: `&[1, 2, 3]`
    = note: required because of the requirements on the impl of `core::iter::IntoIterator` for `[f64; 4]`

error[E0277]: the trait bound `[f64; 3]: core::iter::ExactSizeIterator` is not satisfied
   --> src/distributions/weighted_index.rs:476:29
    |
476 |             let res = distr.assign_new_weights([1.0f64, 2.0, 3.0]);
    |                             ^^^^^^^^^^^^^^^^^^ the trait `core::iter::ExactSizeIterator` is not implemented for `[f64; 3]`

error[E0277]: `[f64; 4]` is not an iterator
   --> src/distributions/weighted_index.rs:480:29
    |
480 |             let mut distr = WeightedIndex::new([1.0f64, 2.0, 3.0, 0.0]).unwrap();
    |                             ^^^^^^^^^^^^^^^^^^ borrow the array with `&` or call `.iter()` on it to iterate over it
    |
    = help: the trait `core::iter::Iterator` is not implemented for `[f64; 4]`
    = note: arrays are not iterators, but slices like the following are: `&[1, 2, 3]`
    = note: required because of the requirements on the impl of `core::iter::IntoIterator` for `[f64; 4]`
note: required by `distributions::weighted_index::WeightedIndex::<X>::new`
   --> src/distributions/weighted_index.rs:97:5
    |
97  | /     pub fn new<I>(weights: I) -> Result<WeightedIndex<X>, WeightedError>
98  | |     where
99  | |         I: IntoIterator,
100 | |         I::Item: SampleBorrow<X>,
...   |
131 | |         })
132 | |     }
    | |_____^

error[E0599]: no associated item named `NAN` found for type `f64` in the current scope
   --> src/distributions/weighted_index.rs:481:67
    |
481 |             let res = distr.assign_new_weights([1.0f64, 2.0, f64::NAN, 0.0]);
    |                                                                   ^^^ associated item not found in `f64`
    |
    = help: items from traits can only be used if the trait is in scope
    = note: the following trait is implemented but not in scope, perhaps add a `use` for it:
            `use core::num::dec2flt::rawfp::RawFloat;`

error[E0277]: `[u32; 4]` is not an iterator
   --> src/distributions/weighted_index.rs:485:29
    |
485 |             let mut distr = WeightedIndex::new([1u32, 2, 3, 0]).unwrap();
    |                             ^^^^^^^^^^^^^^^^^^ borrow the array with `&` or call `.iter()` on it to iterate over it
    |
    = help: the trait `core::iter::Iterator` is not implemented for `[u32; 4]`
    = note: arrays are not iterators, but slices like the following are: `&[1, 2, 3]`
    = note: required because of the requirements on the impl of `core::iter::IntoIterator` for `[u32; 4]`

error[E0277]: the trait bound `[u32; 4]: core::iter::ExactSizeIterator` is not satisfied
   --> src/distributions/weighted_index.rs:486:29
    |
486 |             let res = distr.assign_new_weights([0u32, 0, 0, 0]);
    |                             ^^^^^^^^^^^^^^^^^^ the trait `core::iter::ExactSizeIterator` is not implemented for `[u32; 4]`

error: aborting due to 6 previous errors
  • `f64::NAN` can be replaced with `core::f64::NAN`.
  • Iterating over slices instead of arrays should fix the other errors (both fixes are sketched below).
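
For illustration, a tiny standalone program that compiles on Rust 1.36 once those two changes are applied (a sketch, not the actual test code):

fn main() {
    let nan = ::core::f64::NAN;            // `f64::NAN` as an associated constant is newer than 1.36
    let weights = &[1.0f64, 2.0, nan][..]; // iterate a slice: `[f64; 3]` is not an iterator on 1.36
    for w in weights {
        let _ = w;
    }
}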

@vks (Collaborator) commented Oct 21, 2021:

@dhardy Do you think we can start to merge breaking changes for rand 0.9?

@dhardy (Member) commented Oct 21, 2021:

@vks I guess that depends on whether there are any significant non-breaking changes in master or expected to be merged soon. I don't know but can check tomorrow. If not, then I think we can start merging.

BREAKING CHANGE: This commit adds a variant to `WeightedError`.
@SuperFluffy (Author):

@vks Replaced `f64::NAN` by `::core::f64::NAN`. Also changed the arrays to slices with `&[<values>][..]`. I actually did this in benches/weighted.rs as well, since none of the benches (mine and the old ones) were compatible with 1.36. I guess that's because benches are checked with nightly, but not with a "1.36 nightly".

@vks (Collaborator) commented Oct 22, 2021:

Great, thanks!

For the benchmarks it's fine to use newer features, because they require nightly anyway. For the API it's more important to track the MSRV, because breaking it may break crates depending on rand.

@kazcw (Contributor) commented Oct 22, 2021:

There is a more general API that would solve this with less new code and avoid rand having to choose a stance on a new failure case. Rather than providing optimized methods for specific use cases, why not give the user the tools to perform any operation of this sort efficiently?

The documentation already commits to `WeightedIndex` being implemented with a `Vec<X>`, so we could expose that with some easily implemented functions:

WeightedIndex<X>::into_cumulative_weights(self) -> Vec<X>;
WeightedIndex<X>::from_cumulative_weights(weights: Vec<X>) -> Result<Self, WeightedError>;
WeightedIndex<X>::from_cumulative_weights_unchecked(weights: Vec<X>) -> Self;

The `Vec`s would have `total_weight` as the final element; this could be a conversion done by pushing/popping in the new methods, or (preferably, I think) the existing methods could be modified to store `total_weight` in that position. (`total_weight` doesn't actually need to be stored at all, but I think the API simplicity of keeping it in the `Vec`, rather than accepting it as a separate parameter, outweighs the overhead of a single unneeded element: it is logically the final element of the series.)
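
A toy illustration of that layout (hypothetical helper, `f64` only): the cumulative series carries the total as its final element, so no separate field is required.

fn cumulative(weights: &[f64]) -> Vec<f64> {
    let mut total = 0.0;
    weights.iter().map(|&w| { total += w; total }).collect()
}

fn main() {
    // weights [2.0, 3.0, 5.0] -> cumulative [2.0, 5.0, 10.0]; the final 10.0 is the total weight
    assert_eq!(cumulative(&[2.0, 3.0, 5.0]), vec![2.0, 5.0, 10.0]);
}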

For convenience, `WeightedIndex<X>` could also have a `Default` implementation with an empty weights `Vec`.

Then, a user like @SuperFluffy could achieve the optimized operation in question like this:

fn example(&mut self) -> Result<(), WeightedError> {
    // `mem::take` relies on the proposed `Default` impl (empty weights)
    let mut weights = mem::take(&mut self.weighted_index).into_cumulative_weights();
    update_weights_somehow(&mut weights[..]);
    self.weighted_index = WeightedIndex::from_cumulative_weights(weights)?;
    Ok(())
}

The possibility of a length mismatch is obviated here, and what becomes of the source value in the error case is explicitly the user's choice. This would also support at least one additional use case: if the user needs distributions of varying length at different times, they can reuse one `Vec` so that it only needs allocation when it reaches a new high-water mark.

Incidentally, an example of using the new API in place of `update_weights` would be exactly the same. `update_weights` could potentially be deprecated after this.

@dhardy (Member) commented Oct 23, 2021:

Interesting points @kazcw. There are two caveats:

  1. You're pushing more work onto the user: converting weights into cumulative weights (not hard, I know).
  2. It sounds like this is an unchecked API — fine, but we should be clear about that.

Anyway, it does make me consider something else:

  1. There doesn't appear to be much reason to require that the number of weights match. We can just use `reserve` or collect iterators, right?
  2. The constructor is much the same code as this method. Can we save some code, e.g. by making the constructor create an empty instance and then use `replace_weights`?

@SuperFluffy (Author) commented Oct 23, 2021:

I had actually written a version that clears the original vector and pushes new elements into it, instead of doing a lockstep zip + assignment.

This led to exactly the same performance as just calling `new`, rendering the point of this PR moot.

@kazcw (Contributor) commented Oct 23, 2021:

The from/into approach asks the user to implement more, but it asks the user to be aware of API subtleties less. With `assign_new_weights`, rand must take a stance on:

  • what invariants are required (does the new distribution have to have the same number of elements?)
  • what happens if the invariants are violated (do we panic? try to roll back? leave `self` in an inconsistent state? poison `self` so that `sample` will panic?)
  • what the performance characteristics are (if the new distribution has a smaller `len()`, does `self` still have the same memory footprint?)

rand must try to find a compromise that is acceptable to as many use cases as possible (which, as the discussion on this issue shows, is not easy — are we there yet? I think a poisoned `self` would be better than an inconsistent one!), and the user needs to understand rand's decisions and their implications (likewise with `update_weights`).

With the from/into approach, these decisions are in the user's hands; tradeoffs can be chosen appropriately to the use case, and the consequences should not come as a surprise. The meaning of `from_cumulative_weights` is clear without consulting documentation: if and only if the input represents the cumulative weights of a valid distribution, it returns `Ok(_)`.

As for the checked vs. unchecked question, the performance of `from_cumulative_weights` (the checked API) should be similar to `assign_new_weights`; each has to make a pass over the full input. `from_cumulative_weights_unchecked` would of course be O(1).

As far as I can see, the advantage of `assign_new_weights` is that it requires less code to use. If so, this comes down to a question of priorities: is it more important that code built on rand be terse, or that it be straightforward to write correctly and easy to review?

@dhardy (Member) commented Oct 24, 2021:

I had actually written a version that clears the original vector and pushes new elements into it, instead of doing a lockstep zip + assignment.

Slightly weird, but pushing new elements is definitely slower. Using `resize` fixes this, however:

self.cumulative_weights.resize(iter.len(), zero.clone());
for (w, c) in iter.zip(self.cumulative_weights.iter_mut()) {
    // ...

Of course, this technique can be used in `new` for a speed boost too, given `ExactSizeIterator` (if only specialisation were stable); probably also without it, using the size hint plus a second loop to catch any remaining elements (but that's ugly, redundant code).
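
A compilable rendering of the resize-then-zip technique (simplified to `f64`; a hypothetical free function, where `iter.len()` is available thanks to the `ExactSizeIterator` bound):

fn assign_resized<I>(cumulative: &mut Vec<f64>, iter: I) -> Result<(), &'static str>
where
    I: ExactSizeIterator<Item = f64>,
{
    cumulative.resize(iter.len(), 0.0); // grow or shrink to the new length
    let mut total = 0.0;
    for (w, c) in iter.zip(cumulative.iter_mut()) {
        if !(w >= 0.0) {
            return Err("invalid weight");
        }
        total += w;
        *c = total;
    }
    Ok(())
}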


@kazcw: I think you're right that this isn't the optimal API. I will think further on it.

@SuperFluffy (Author):

Slightly weird, but pushing new elements is definitely slower. Using `resize` fixes this, however:

self.cumulative_weights.resize(iter.len(), zero.clone());
for (w, c) in iter.zip(self.cumulative_weights.iter_mut()) {
    // ...

It improves things a lot compared to direct pushing, but there is still a significant performance penalty on my M1 ARM machine:

# Zip without resize
test weighted_index_assignment       ... bench:          18 ns/iter (+/- 0)
test weighted_index_assignment_large ... bench:         392 ns/iter (+/- 3)

# Zip after resize
test weighted_index_assignment       ... bench:          22 ns/iter (+/- 0)
test weighted_index_assignment_large ... bench:         484 ns/iter (+/- 2)

@SuperFluffy (Author) commented Oct 25, 2021:

I have been thinking about @kazcw's suggestion. Their argument applies not just to cumulative weights, but to normal weights as well. We can easily do:

WeightedIndex<X>::into_cumulative_weights(self) -> Vec<X>;

// Take the provided weights as they are
WeightedIndex<X>::from_cumulative_weights(weights: Vec<X>) -> Result<Self, WeightedError>;
WeightedIndex<X>::from_cumulative_weights_unchecked(weights: Vec<X>) -> Self;

// Iterate over the weights accumulating a total, and assign that total weight in each iteration
WeightedIndex<X>::from_weights(weights: Vec<X>) -> Result<Self, WeightedError>;
WeightedIndex<X>::from_weights_unchecked(weights: Vec<X>) -> Self;

The `cumulative_weights` versions allow direct insertion of the cumulative weights, while the non-cumulative versions allow reusing the provided weights in place: the constructor takes ownership of the vector, so it can trivially `iter_mut` over it and write the running total back (see the sketch below).
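
A sketch of that in-place accumulation (a hypothetical free function over `f64`; the real constructor would also build the sampler):

fn from_weights_in_place(mut weights: Vec<f64>) -> Result<Vec<f64>, &'static str> {
    let mut total = 0.0;
    for w in weights.iter_mut() {
        if !(*w >= 0.0) {
            return Err("invalid weight"); // the caller's Vec is dropped here
        }
        total += *w;
        *w = total; // overwrite each weight with the running total, in place
    }
    if !(total > 0.0) {
        return Err("all weights zero");
    }
    Ok(weights) // now cumulative, reusing the original allocation
}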

@kazcw (Contributor) commented Oct 25, 2021:

@SuperFluffy `from_weights` makes a lot of sense — if we'd need a pass over the data to check its cumulativeness, we might as well offer to accumulate it instead. I'm not sure about `from_weights_unchecked`, though — it would have a negligible performance benefit for typical `X` (anything that isn't very expensive to compare). I think offering an unchecked version that isn't normally appreciably faster could mislead users. For the rare `X` that could use it (bignums?), there's always `from_cumulative_weights_unchecked`, although `WeightedIndex` is not going to be optimal for bignums anyway.

As another convenience in the same vein, we might consider `to_weights` to better support the `update_weights` use case.

So we'd have:

/// O(1), no allocations
WeightedIndex<X>::into_cumulative_weights(self) -> Vec<X>;
/// O(N), no allocations
WeightedIndex<X>::to_weights(self) -> Vec<X>;

/// O(N), no allocations
WeightedIndex<X>::from_weights(weights: Vec<X>) -> Result<Self, WeightedError>;
/// O(N), no allocations
WeightedIndex<X>::from_cumulative_weights(weights: Vec<X>) -> Result<Self, WeightedError>;
/// O(1), no allocations, if input is not cumulative then distribution will not yield meaningful samples
WeightedIndex<X>::from_cumulative_weights_unchecked(weights: Vec<X>) -> Self;

(Distinguishing `to` from `into` in accordance with the Rust API Guidelines: `into_cumulative_weights` deconstructs, decreasing the level of abstraction, and is cheap; `to_weights` converts, maintaining the level of abstraction, and is asymptotically more expensive.)

@dhardy (Member) commented Oct 25, 2021:

I had a little play with `from_weights`; it is a nice way to rewrite the constructor without significant performance impact. Using it to rewrite `assign_new_weights` does impact performance (35-40% on the simple bench used). Ultimately I'm not sure whether we care enough about performance in this specific case @SuperFluffy?

See: SuperFluffy#1

I didn't bother with `from_cumulative_weights`, though that may have some uses (unrelated to this PR).

@SuperFluffy (Author) commented Oct 26, 2021:

@dhardy I also pushed my version of `from_weights`. Interestingly, I am only seeing at most a 25% decrease in performance compared to `assign_new_weights`, but that includes an extra clone in the benchmark, which might account for a significant part of the regression (the right thing to do would be to move the benches to criterion, which can ensure that the clone is not measured).

I especially like that `from_cumulative_weights_unchecked` is extremely cheap.

Here are my results:

test weighted_index_assignment                    ... bench:          18 ns/iter (+/- 0)
test weighted_index_assignment_large              ... bench:         392 ns/iter (+/- 1)
test weighted_index_from_cumulative_weights       ... bench:          41 ns/iter (+/- 0)
test weighted_index_from_cumulative_weights_large ... bench:         156 ns/iter (+/- 0)
test weighted_index_from_weights                  ... bench:          46 ns/iter (+/- 0)
test weighted_index_from_weights_large            ... bench:         478 ns/iter (+/- 1)
test weighted_index_modification                  ... bench:          26 ns/iter (+/- 0)
test weighted_index_new                           ... bench:          59 ns/iter (+/- 0)
test weighted_index_new_large                     ... bench:       2,073 ns/iter (+/- 6)

NOTE: I should get rid of the `total_weight` field, as in @dhardy's version. This was just quick and dirty.

@dhardy (Member) commented Oct 26, 2021:

@SuperFluffy — about your benches — most of the results are extremely small (under 60 ns). While the benchmark can fairly reliably time an operation at that level, I'm not convinced that the operation is representative. E.g. the "small" benchmark says that `weighted_index_assignment` is more than twice as fast as `from_cumulative_weights`, whereas the "large" version says basically the opposite. I would think the "large" variant is about as small as you'd want to go (an alternative would be to use multiple small distributions in the same loop).

Looking at the "large" variants, assignment and `from_weights` are 4-5 times faster than `new`, which does sound like a useful speedup; however, assignment is only 18% faster than `from_weights`, which, given the general unreliability of results from micro-benchmarks, is not that significant. Put differently: if you can demonstrate that assignment is significantly better than `from_weights` in something close to a real problem, then I will recognise the value of having both, but from the above I will not — and, if you don't specifically have a need for assignment, I'd prefer to drop it to simplify the API.

API simplicity is an important factor, and I have a feeling that we are over-optimising here (without a specific target).

@SuperFluffy (Author):

@dhardy I agree with removing assignment, and I agree that the results are not reliable. I very much prefer the from and into APIs.

If you want, I will close this PR and submit a fresh one that contains these.

@dhardy (Member) commented Oct 26, 2021:

@SuperFluffy it's not very important whether you use a new PR. Note that my PR reduced the line count significantly by making `new` a wrapper around `from_weights`, and via simpler code there (enabled in part by pushing `total_weight` into the vec).

@dhardy (Member) commented Dec 6, 2022:

@SuperFluffy are you still able to work on this? It would be good to get it merged soon!

Was there anything else to resolve?

@dhardy (Member) commented Feb 20, 2023:

@SuperFluffy can I remind you of this? Both the above issues are now resolved in master.

@SuperFluffy (Author):

@dhardy Apologies for not having responded. I left the job where this was relevant, and it hasn't been immediately relevant to me since.

I'm going to find some time to rebase/adjust this PR.

Thanks for the reminder!

@dhardy (Member) commented Jan 29, 2024:

Closing due to inactivity. We can re-open if someone wishes to work on this again.

I have a branch related to this here, but I don't recall the motivation (probably benchmarking): https://github.com/dhardy/rand/commits/assign_weighted_index/

@dhardy closed this on Jan 29, 2024.
@dhardy added the X-stale (outdated or abandoned work) label on Jul 10, 2024.