Robert Floyd's combination algorithm. #144
This gets rid of the 'experimental' level, removes the non-staged_api case (i.e. stability levels for out-of-tree crates), and lets the staged_api attributes use 'unstable' and 'deprecated' lints. This makes the transition period to the full feature staging design a bit nicer.
This adds the int_uint feature to *every* library, whether or not it needs it.
Conflicts: src/test/compile-fail/borrowck-move-out-of-overloaded-auto-deref.rs src/test/compile-fail/issue-2590.rs src/test/compile-fail/lint-stability.rs src/test/compile-fail/slice-mut-2.rs src/test/compile-fail/std-uncopyable-atomics.rs
Fixes #16072 r? @huonw
Lets them build with the -dev, -nightly, or snapshot compiler
* `core` - for the core crate
* `hash` - hashing
* `io` - io
* `path` - path
* `alloc` - alloc crate
* `rand` - rand crate
* `collections` - collections crate
* `std_misc` - other parts of std
* `test` - test crate
* `rustc_private` - everything else
Conflicts: mk/tests.mk src/liballoc/arc.rs src/liballoc/boxed.rs src/liballoc/rc.rs src/libcollections/bit.rs src/libcollections/btree/map.rs src/libcollections/btree/set.rs src/libcollections/dlist.rs src/libcollections/ring_buf.rs src/libcollections/slice.rs src/libcollections/str.rs src/libcollections/string.rs src/libcollections/vec.rs src/libcollections/vec_map.rs src/libcore/any.rs src/libcore/array.rs src/libcore/borrow.rs src/libcore/error.rs src/libcore/fmt/mod.rs src/libcore/iter.rs src/libcore/marker.rs src/libcore/ops.rs src/libcore/result.rs src/libcore/slice.rs src/libcore/str/mod.rs src/libregex/lib.rs src/libregex/re.rs src/librustc/lint/builtin.rs src/libstd/collections/hash/map.rs src/libstd/collections/hash/set.rs src/libstd/sync/mpsc/mod.rs src/libstd/sync/mutex.rs src/libstd/sync/poison.rs src/libstd/sync/rwlock.rs src/libsyntax/feature_gate.rs src/libsyntax/test.rs
Conflicts: src/libcore/cell.rs src/librustc_driver/test.rs src/libstd/old_io/net/tcp.rs src/libstd/old_io/process.rs
For example, `sample(r, b"ATGC", 4)` always returns `[b'A', b'T', b'G', b'C']`.
Clarify reservoir sampling docs
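The behaviour described in that commit message (asking for as many items as the input holds returns them in their original order) follows directly from how classic reservoir sampling works. Below is a minimal sketch, assuming `sample` is the textbook "Algorithm R"; the `XorShift64` type is a hypothetical stand-in generator so the snippet has no dependencies, and it is not part of Rand.

```rust
/// Tiny xorshift64 generator. Hypothetical stand-in for a real RNG,
/// used only to keep this sketch dependency-free.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // The xorshift state must never be zero.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Value in `low..high`. Plain modulo, so slightly biased;
    /// good enough for illustration only.
    fn gen_range(&mut self, low: usize, high: usize) -> usize {
        assert!(low < high);
        low + (self.next_u64() % (high - low) as u64) as usize
    }
}

/// Reservoir sampling (Algorithm R): keep the first `amount` items, then let
/// each later item replace a random reservoir slot with probability
/// `amount / (i + 1)`.
fn sample<T: Copy>(rng: &mut XorShift64, items: &[T], amount: usize) -> Vec<T> {
    let mut reservoir: Vec<T> = items.iter().take(amount).cloned().collect();
    for (i, &item) in items.iter().enumerate().skip(amount) {
        let k = rng.gen_range(0, i + 1);
        if k < amount {
            reservoir[k] = item;
        }
    }
    reservoir
}
```

When `amount` equals the input length, the replacement loop never runs, so the reservoir is just the input in order. Callers who need a random order have to shuffle the result themselves.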
I see. One cannot even use features for access to deprecated stuff in stable. Any opinions on doing this without
Since the dimension of a
It's running over a Also, you could support only
The issue I opened, referenced above, produces a similar result and runs in O(amount) (worst case O(n/2)) but uses the full space: O(n) / O(range.len()).
I do think a
Does it make sense to make this a distribution instead? Edit: no.
If it does the same job as
Yes
Robert Floyd's algorithm works in the sense that a value is never picked twice. But the list of results it produces is not all that random in its order. Example from picking 50 numbers from 1..100, and comparing them against a simple counter in the second column:
And if I try to get 90 results out of a list of 100, it becomes almost a counter:
So we should document very clearly when it is a good idea to use this, or not include it in
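The counter-like output has a structural cause: in round `i`, Floyd's algorithm can only emit a value up to `n - amount + i`, and on a collision it emits exactly that value. The sketch below shows this bound and one possible remedy, a Fisher-Yates shuffle of the result. It does not reproduce the numbers from the comment; `XorShift64`, `floyd` and `shuffle` are hypothetical names for this illustration, not Rand API.

```rust
/// Tiny xorshift64 generator. Hypothetical stand-in for a real RNG,
/// used only to keep this sketch dependency-free.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // The xorshift state must never be zero.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Value in `low..high`. Plain modulo, so slightly biased;
    /// good enough for illustration only.
    fn gen_range(&mut self, low: usize, high: usize) -> usize {
        assert!(low < high);
        low + (self.next_u64() % (high - low) as u64) as usize
    }
}

/// Floyd's algorithm over `0..n`, returning picks in the order they were made.
fn floyd(rng: &mut XorShift64, n: usize, amount: usize) -> Vec<usize> {
    assert!(amount <= n);
    let mut picked = Vec::with_capacity(amount);
    for j in (n - amount)..n {
        let t = rng.gen_range(0, j + 1);
        // A collision forces the pick to be `j`, the "counter" value.
        picked.push(if picked.contains(&t) { j } else { t });
    }
    picked
}

/// Fisher-Yates shuffle, to remove the ordering pattern when order matters.
fn shuffle(rng: &mut XorShift64, values: &mut [usize]) {
    for i in (1..values.len()).rev() {
        let k = rng.gen_range(0, i + 1);
        values.swap(i, k);
    }
}
```

With `n = 100` and `amount = 90`, the first pick always falls in `0..=10` and collisions become frequent, which fits the observation that the output looks almost like a counter. The set of picked values is still a valid combination; only the order is skewed.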
Yes, the function is called As an aside,
Sorry this PR got closed so suddenly. The cause was that we cleaned up the git history (#350) and force-pushed it to master. Now there was a difference of about 30,000, and I suppose GitHub just gave up... This PR, together with two others, stayed open for so long with little attention because it touches on an area of Rand that needs some design first. That is now slowly getting explored in dhardy#82.
The algorithm here is interesting. Unfortunately @burdges' implementation here has a bug. This is a simpler version which I think is correct:
use rand::Rng;

fn combination<R: Rng>(rng: &mut R, low: usize, high: usize, amount: usize) -> Vec<usize> {
    assert!(amount <= (high - low));
    let mut s = Vec::with_capacity(amount);
    let n = high - low;
    for j in n - amount .. n {
        // Draw t from low..=low + j (gen_range's upper bound is exclusive).
        let t = rng.gen_range(low, low + j + 1);
        // On a collision take low + j, which no earlier round could have produced.
        let t = if s.contains(&t) { low + j } else { t };
        s.push(t);
    }
    s
}
@burdges' implementation instead does The bug is that @burdges simplified by instead setting Note that
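For anyone who wants to try the version above without pulling in Rand, here is a dependency-free sketch of the same loop. The `XorShift64` generator is a hypothetical stand-in whose `gen_range(low, high)` takes an exclusive upper bound (an assumption made to mirror the call above), and the element type is fixed to `usize` for simplicity.

```rust
/// Tiny xorshift64 generator. Hypothetical stand-in for a real RNG,
/// used only to keep this sketch dependency-free.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // The xorshift state must never be zero.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Value in `low..high`. Plain modulo, so slightly biased;
    /// good enough for illustration only.
    fn gen_range(&mut self, low: usize, high: usize) -> usize {
        assert!(low < high);
        low + (self.next_u64() % (high - low) as u64) as usize
    }
}

/// Picks `amount` distinct values from `low..high` with Floyd's algorithm.
fn combination(rng: &mut XorShift64, low: usize, high: usize, amount: usize) -> Vec<usize> {
    assert!(low <= high && amount <= high - low);
    let n = high - low;
    let mut picked = Vec::with_capacity(amount);
    for j in (n - amount)..n {
        // Draw from `low..=low + j`; on a collision fall back to `low + j`,
        // which no earlier round could have produced.
        let t = rng.gen_range(low, low + j + 1);
        let t = if picked.contains(&t) { low + j } else { t };
        picked.push(t);
    }
    picked
}
```

This makes exactly `amount` calls to the generator, which is the appeal when the generator is slow. The `contains` check is linear, so the whole thing is O(amount²); for larger samples a hash set for the membership test would be the usual substitute.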
I'm getting there @pitdicker ...
No hurry, but I had the idea there that we could provide a couple of 'low-level' algorithms, and the current
I don't know if you guys want to merge this since it depends on deprecated access to constants, but it's useful if you have a slow random number generator and want only small samples.