Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ChaCha: support reading counter and setting nonce #374

Merged
merged 3 commits into from
Apr 30, 2018

Conversation

dhardy
Copy link
Member

@dhardy dhardy commented Apr 5, 2018

Part of #349 @burdges @pitdicker

  • Adjust set_counter to take only a single 64-bit counter
  • Add get_counter
  • Add 128-bit variants (more for completeness and because there isn't another way to read the nonce, but maybe they shouldn't be there?)
  • Add functions to set the nonce

This doesn't resolve the issue about setting rounds yet.

Also this removes functionality from ChaChaCore but I assume that will not be used directly (except within ReseedingRng where these functions aren't useful anyway).

@pitdicker
Copy link
Contributor

Having both a set_counter and get_counter functions seems like a good idea. Also to have a 64- and 128-bit version, now that support for u128 is on the horizon.

What should happen with get_counter when the counter does not fit a u64? I think it is best to return an option or result for that function.

Removing the methods from ChaChaCore can be ok, I think.

I am however completely against any messing with a nonce.

@dhardy
Copy link
Member Author

dhardy commented Apr 5, 2018

I'm not really concerned about get_counter: no one is going to generate 1 ZiB of data and if users set a nonce or high counter then they should know that they may need to use get_counter_128.

Also, if users are using a nonce, they may want to extract just the counter with get_counter() or get_counter() as u32.

@dhardy
Copy link
Member Author

dhardy commented Apr 6, 2018

I am however completely against any messing with a nonce.

Meaning you don't think we should add these set-nonce functions? Why? I can't see ChaCha changing in incompatible ways (stability and equivalence across implementations is one of the big advantages of ChaCha).

@pitdicker
Copy link
Contributor

pitdicker commented Apr 7, 2018

Meaning you don't think we should add these set-nonce functions? Why?

I think we should make up our mind on what our implementation of ChaCha is. Is it an RNG or a stream cipher? Or should it be both? Do we want to have anything to do with RNGs in Rand being used as stream ciphers?

If we decide it is an RNG (what I assumed until now, and how it is implemented until now) there is no use for a nonce. If we really want to allow using it with a nonce, we should change the implementation to also threat it as such, and not as part of the counter. Even if a high counter is unlikely unless someone starts doing things with set_counter, I think our implementation should be 'honest'. Either drop the 128-bit counter idea, or the idea of a nonce. But we can't have both.

@pitdicker
Copy link
Contributor

To nuance my position a bit: I don't care strongly whether we support setting a nonce or not. The thing is I believe it is not correct to have both a 128-bit counter and a nonce. The original implementation of ChaChaRng made the choice of a 128-bit counter, and that is why I prefer it.

@dhardy
Copy link
Member Author

dhardy commented Apr 8, 2018

Well, okay... I agree, it should primarily be an RNG.

But I don't think honesty has much to do with it — effectively the only deviation from the standard cipher is the ability to keep generating results after the first 256 GiB. This just means that if somebody was going to use this to encrypt/decrypt then they should ensure the length is no more than 256 GiB or split the input.

There is a small reason to include the nonce code: extra flexibility in reproducing results from other implementations. If you really want we can remove them, but I don't see much point.

@pitdicker
Copy link
Contributor

pitdicker commented Apr 8, 2018

We have the choice between three variants here:

  • 128-bit counter
  • 64-bit counter, 64-bit nonce (original variant)
  • 32-bit counter, 96-bit nonce (IETF variant)

How many bytes can be generated with a 64-bit counter before wrapping? counter * state size = 2^64 * 2^6 = 2^70 bytes. 2^10=kilo, 2^20=mega, 2^30=giga, 2^40=tera, 2^50=peta, 2^60=exa, 2^70=yotta. Seems like a reasonable max. A 32-bit counter on the other hand can be much too small, it would wrap around within 8 minutes on my PC. That equals 2^(32+6)=2^38=250gb of generated values.

For a nonce 64 bit may sometimes be a bit too small, because a nonce is usually chosen at random and gets in trouble because of the birthday problem. With ~10 million messages, there is a chance of 1 in a million of having a duplicated nonce. So the IETF variant with a 96-bit seems like a practical choice for encrypting messages, that all should have a unique nonce but are expected to all be smaller than 250gb.

We clearly can't make use of the IETF variant because the counter is too small for use as an RNG. A 128-bit counter is useless. But so is a nonce for RNGs.

@dhardy You make a good point that choosing a variant with a counter would give us reproducibility with other implementations. Also we currently have the problem that a 128-bit counter is not the easiest to set, because u128 is just now getting stable.

Would it make sense to switch our ChaCha implementation to the original variant by Bernstein? To end having this discussion coming up every other month, have reproducibility with other implementations, and a simple API (for get_counter/set_counter)?

@dhardy
Copy link
Member Author

dhardy commented Apr 8, 2018

I still don't see what's wrong with what is implemented here in this PR, except for the redundancy between 64-bit and 128-bit counter methods.

We should have at least 64-bit counter, but as you say this doesn't match up properly when using 96-bit nonce. I don't see how it's a major issue that this impl would simply wrap to the next nonce should more than 256 GiB be generated.

@pitdicker
Copy link
Contributor

pitdicker commented Apr 8, 2018

I don't see how it's a major issue that this impl would simply wrap to the next nonce should more than 256 GiB be generated.

If you start incrementing the nonce, it is no longer a user-supplied constant but simply part of the counter. The way the ChaCha implementation by Peter Reid does it is by offering both variants of nonce and counter size, and storing the choice. It will only have to be checked very rarely anyway.

But whatever we do, it should simply be correct. How do we expect users to have any trust in our ChaCha implementation if we start including hacks like this?

Edit: didn't like the sound of 'hacks' in the post, apologies. And if the problems only start after generating 250 GB it might have been somewhat excusable, if we didn't also have the possibility of setting the counter.

@pitdicker
Copy link
Contributor

@dhardy What are your thoughts at the moment? I am kind of warming up to the idea of using just Bernsteins original version with a 64-bit counter and a 64-bit nonce.

@dhardy
Copy link
Member Author

dhardy commented Apr 15, 2018

I still think we're sacrificing functionality without a good reason.

E.g. with 64-bit counter if somebody does set_counter(u64::MAX) why would we panic once the first block has been generated? I see no reason not to continue with wrapped counter (1<<64 not 0).

@pitdicker
Copy link
Contributor

pitdicker commented Apr 16, 2018

I can get behind adding more functionality, but not at the cost of correctness. What do you think of the enum approach, where we store the choice between the 3 variants? It would not add any real performance overhead, as it only needs to be checked when incrementing the counter past 32 bits.

@dhardy
Copy link
Member Author

dhardy commented Apr 16, 2018

Aside from the nonce stuff, are we agreed on the changes to get/setting the counter? I'm not quite sure myself; since 64-bits is enough for 1 ZiB maybe we shouldn't even bother supporting larger counters. We could instead call the high 64-bits the "stream" for example (which is roughly what the old set_counter supports, besides the cascading wrap).

Or, as you say, we could support multiple counter sizes, although I'm a little against adding complexity which almost never gets used.

BTW what should happen when the counter does wrap? The only options (I think) are:

  1. error/panic (which I really don't like to do in an RNG)
  2. wrap (i.e. if set_counter is used with (1<<64)-1 then the next results are those for counter 0, which probably won't have been generated before anyway unless lots of jumping happens)
  3. cascading-wrap as now (advantage that weird jump patterns are less likely to result in overlapping ranges, disadvantage that "streams" are not truly independent)

I still prefer the last option to be honest.

@dhardy dhardy added E-question Participation: opinions wanted B-API Breakage: API P-medium labels Apr 16, 2018
@pitdicker
Copy link
Contributor

since 64-bits is enough for 1 ZiB maybe we shouldn't even bother supporting larger counters. We could instead call the high 64-bits the "stream" for example (which is roughly what the old set_counter supports, besides the cascading wrap).

With this I can fully agree (proposed before).

  1. cascading-wrap as now (advantage that weird jump patterns are less likely to result in overlapping ranges, disadvantage that "streams" are not truly independent)

If we go and support multiple incompatible schemes with exactly the same code, we should document clearly how this is possible. Maybe by explaining what happens when the counter 'runs out'. Something like this for set_nonce_96:

If you use more values than the counter supports (2^32 * 64 bytes), the counter will wrap. Any values beyond this point are invalid. A consequence of our implementation is that the nonce will also be changed (incremented). But as this situation is to be avoided, and the values are already incorrect, that does not seem like a problem.

But I still fear that doing something that is theoretically incorrect is going to bite us in the end, so I'd prefer the first option.

@dhardy
Copy link
Member Author

dhardy commented Apr 17, 2018

It would be good to get some more feedback on whether this is even useful; also the variable rounds stuff. I put everything except the basic get/set counter functionality behind an "experimental" feature gate so we can ship the feature but make it clear that it might go away or change later.

It's not the nicest approach, but does let us make a release without worrying about this too much!

@pitdicker
Copy link
Contributor

I would really like to see a release, and yes, this ChaCha stuff is holding things up.

But it seems like we are so close to at least deciding on the nonce and counter part, can't we finish it? Do you have a problem with a 64-bit nonce and 64-bit counter?

@burdges
Copy link
Contributor

burdges commented Apr 17, 2018

I'm slightly disconnected from this conversation now, but the important point is that the default is (a) secure and (b) meshes well with standards.

I'll point out that XChaCha increased both the counter and nonce by incurring some startup cost, which suggests that a 64 bit counter can create security issues.

Is there a standardization issue for ChaChaRng with a 128 bit counter but no nonce? I think ChaCha counters increment the nonce when they roll over anyways, right? If so, then ChaChaRng becomes ChaCha20 with zero nonce, which sounds fine. Also XChaCha can be implemented in terms of ChaCha20, right?

Apologies for not paying closer attention lately. Anyways an important question is how the standard ChaChas role over their counter.

@vks
Copy link
Collaborator

vks commented Apr 17, 2018

What is the use case for this? People using ChaCha for encryption should probably rather use a different crate with a higher-level interface that supports authenticated encryption. I don't think we should encourage non-RNG use cases. I would prefer not to provide these features which are very easy to misuse.

Another option alternative to the experimental feature would be moving this to another release.

@burdges
Copy link
Contributor

burdges commented Apr 18, 2018

There is a whole world of cryptography outside simple authenticated encryption, including much that benefits from being close to standards, but for which nobody can or will take the time to develop standards. As an example, the old rand crate made ZCash's Power of Tau trusted setup program harder to reimplement in Go.

@vks
Copy link
Collaborator

vks commented Apr 18, 2018 via email

@burdges
Copy link
Contributor

burdges commented Apr 19, 2018

A CSPRNG is always a low level algorithm, but especially a deterministic one like ChChaRng. rand is a low level algorithms crates the moment it decides to address cryptography.

There are no scenarios under which ChChaRng will be used for encryption. We do use it as a CSPRNG however, or more formally as a PRF. If you pick a PRF that is too non-standard, then your existing protocol cannot easily become more standard, including by being implemented elsewhere, so you tends to box yourself into doing necessary work.

Anyways..

IETF ChaCha has only a 32 bit block counter. All other ChaCha variants have a 64 bit block counter.
We deviate slightly from ChaCha20 by using a 128 but block counter, not too badly, but enough to warrant document this in the crate.

Afaik, there are no cryptography uses for variable round ChaCha, but the next_u32 vs next_u64 business could mess people up, so we reiterating any issues around skipped output and endianess there that sounds wise.

@dhardy
Copy link
Member Author

dhardy commented Apr 19, 2018

Okay, then I agree that the best solution for now is just to drop the nonce functions (unless there are related standards for use of ChaCha as an RNG).

Quite honestly unless there any standards to guide this, perhaps we should go back to the old (u64, u64) set-counter function and just add a similar get-counter fn. This allows use of the high-part as the stream but makes clear that functionally it is part of the counter.

@dhardy
Copy link
Member Author

dhardy commented Apr 19, 2018

Sorry I wrote the above ↑ while on a plane and didn't read your last messages before posting.

I agree, standards are important. Are there any relevant here?

Using a 64-bit counter plus 64-bit "stream" or nonce may be the best option; in this case is it still reasonable to wrap the counter to 0 rather than panic? I'd really prefer to avoid panics. 1ZiB is ridiculously huge so there's no worry about repeats unless the counter gets set while generating.

Regarding next_u64 and next_u32, ChaCha now uses BlockRng which implements next_u64 by taking the next two u32 values, so no unexpected skipping here. fill_bytes however will always round up to the next 32-bit boundary in terms of random data consumed; this does not seem unreasonable behaviour.

@pitdicker
Copy link
Contributor

in this case is it still reasonable to wrap the counter to 0 rather than panic?

I would say wrapping the counter is by far the most reasonable thing to do, without giving any notification at all. The trouble does not lie in the counter, but in repeating the same stream of random numbers. As calculated, this is an incredibly large amount of data, taking many decennia/centuries to generate, all available for an attack. Instead we can assume that when the counter wraps around, set_counter was used. In that case there is no duplicated random number stream, so no problem.

@burdges
Copy link
Contributor

burdges commented Apr 19, 2018

I think the old (u64, u64) scheme can be described roughly like this:

In set_counter, our counter_low corresponds to the block counter in ChaCha20, while our counter_high corresponds to the nonce in ChaCha20 read little endian. We deviate from ChaCha20 in that wrapping the block counter increments the nonce, which avoids needing to panic if ChaChaRng instances run long enough to wrap a u64.

We could even provide set_nonce([u8; 8]) and set_offset(byte_offset: u64) methods with the second doing the byte offset instead of the block offset.

@dhardy
Copy link
Member Author

dhardy commented Apr 20, 2018

Well, I've been trying to adapt ChaCha to support variable size nonce, defined at construction time. That works, but it's a bit complex (interacting with nonce-defining code and offset/counter adjusting code). So a fixed 64-bit counter and 64-bit nonce is much easier; and also probably a better way to go if we want to make our implementation a "standard".

I've also been trying to add get_offset / set_offset functions; unfortunately this is complicated by the fact that ChaChaRng uses BlockRng, and the BlockRng fields are not exposed (we need to read/write the index and be able to regenerate results, and with that much functionality it would appear much simpler just to expose the fields like @pitdicker originally did).

@dhardy
Copy link
Member Author

dhardy commented Apr 20, 2018

I pushed some code implementing the above (with fixed 64-bit counter); I'm not really happy with it but it's sufficient to get some feedback.

Copy link
Contributor

@pitdicker pitdicker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of new functionality here!

@@ -219,20 +219,38 @@ impl<R: BlockRngCore> BlockRng<R> {
}

/// Return a reference the wrapped `BlockRngCore`.
#[inline(always)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious: did inlining turn out to be necessary, or did you just inline all methods?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't benchmarked at all, still meaning to do that.

pub fn reset(&mut self) {
self.index = self.results.as_ref().len();
}

/// Generate a new set of results immediately, setting the index to the
/// given value.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the use in combining both pieces of functionality, but doing it with a method named generate is a bit strange. I think it is better to just have get_index, set_index and generate separately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did do that, but this provides an extra level of protection to avoid re-using the same results. It also seems to fit with the existing code extremely well.

self.0.inner_mut().set_counter(counter_low, counter_high);
self.0.reset(); // force recomputation on next use
/// This returns an error only if the result is not representable in the
/// 128-bit return value.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is always representable?

Does the returned position represent the index of the last returned value, or that of the next random number? The choice seems arbitrary, but good to document. get_counter pointed at the next block of results.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, left-over from when I was trying to support 128-bit counters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Practically, u64 would be large enough most of the time, but probably better to use u128 to avoid extra error handling.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that is a good choice.

/// This returns an error only if the result is not representable in the
/// 128-bit return value.
#[cfg(feature = "i128_support")]
pub fn get_offset(&self) -> u128 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not convinced on the name offset, as that sounds more like the distance between two points. What do you think of position? Or other alternatives?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced either. It's also quite a clunky function compared to get_counter, but I suppose some users may actually need this functionality! [Seek] uses seek(pos) but doesn't have a "get" equivalent.

/// Set the offset from the start of the stream, in bytes.
#[cfg(feature = "i128_support")]
pub fn set_offset(&mut self, offset: u128) -> Result<(), ()> {
if (offset & 0x3) != 0 || (128 - offset.leading_zeros() - 6 > 64) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(128 - offset.leading_zeros() - 6 > 64)

I think it is ok to just wrap around if the offset is beyond the period. I.e. just mask out the first couple of bits?

Can we define the offset/position to be in multiples of the word size? Users have to care about it already, so we may as well make the function infallible. I think there is just about no RNG that can seek to an exact byte, always to something word-size.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I suppose we wrap anyway during usage!

Seeking to word size also makes a lot of sense given that we're trying to "standardise" this such that fill_bytes consumes a whole number of words.

core: ChaChaCore,
}

impl ChaChaBuilder {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like the builder approach here. Does it have any real advantages?

The three provided methods here also make sense an the RNG itself, not only on the builder.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, there's a little problem with setting the nonce: one may need to regenerate the current buffer, and it doesn't exactly make sense to do that mid-buffer — surely you'd only set the nonce if resetting the counter too? When it comes to selecting a stream I'm not sure.

The set_rounds thing I don't know; I'm still not quite convinced what we should do with this especially with regards to CryptoRng.

Copy link
Contributor

@pitdicker pitdicker Apr 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am (already was) ok with hiding/removing set_rounds for now. Actually I glanced over the function too fast, I though the three builder methods for for nonce, stream and counter. What do you think of just three 'setting' methods on ChaChaRng, like so?

  • set_position(&mut self, position: u128). High-level, has to regenerate the buffered results if the position does not lie on a block boundary. Otherwise waits with generating until on next use. Position is the position of the next value to return.
  • set_stream(&mut self, stream: u64). High-level, schedules generating a new block of results.
  • set_counter_and_nonce(&mut self, counter: u64, nonce: [u8; 8]). Low-level, also schedules generating a new block of results.

And I suppose three corresponding getters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems a reasonable choice. But if we had separate set_counter and set_nonce which each reset the block, the compiler would optimise out the extra reset if both were called anyway, right? So maybe four methods are better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reset does nothing more than reset the index to results.len(). If this is optimized out or not is immeasurable.

I thought combining them made sense because as you said, if you change the nonce you probably also want to change the counter. And if you want to change only the counter, either do it in combination with the nonce or use set_position (when i128 gets more common).

/// Set the nonce from a stream number.
///
/// This simply reinterprets the nonce region as a stream code.
pub fn set_stream(mut self, stream: u64) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was slightly confused about the use of this, but 👍.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be useful with parallel iterators perhaps?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I suppose there are more situations.

@pitdicker
Copy link
Contributor

B.t.w. I don't envy you making the controversial PRs. Keep them coming 😄.

@pitdicker
Copy link
Contributor

@dhardy To be honest I expected rand 0.5 to be released about three weeks ago. Is there something I can do to help?

@dhardy
Copy link
Member Author

dhardy commented Apr 25, 2018

Find a mutually-acceptable solution to #416?

@dhardy
Copy link
Member Author

dhardy commented Apr 26, 2018

I've been thinking about generalising this by adding some things to SeedableRng:

trait SeedableRng {
    ...
    type Nonce: Sized + Default + AsMut<[u8]>; // nonce / seed type
    type Options: Default;  // type-specific creation options
    fn from_seed_and_nonce(seed: Seed, nonce: Nonce, opts: Options);
    fn with_seed(seed: Seed) -> SeedableRngBuilder<Self>;
}

struct<T: SeedableRng> SeedableRngBuilder<T> {
    seed: <T as SeedableRng>::Seed,
    nonce: <T as SeedableRng>::Nonce,
    pub opts: <T as SeedableRng>::Opts,
}
impl<T: SeedableRng> SeedableRngBuilder<T> {
    fn with_nonce(mut self, nonce) -> Self { .. }
    fn with_stream(mut self, stream: u64) -> Self { .. } // set nonce from stream
    fn build(self) -> T {..}
}

// used like:
let mut rng = ChaChaRng::with_seed(my_seed).with_nonce(my_nonce).build();

It would a be fairly simple addition and wouldn't need to impair on the existing ways to use SeedableRng (although from_seed would be redundant), but ultimately I'm not sure it's worth it since very few RNGs support a nonce/seed anyway (although we could find a way to make them support one).

Note that the more obvious model, an extra trait with fn set_nonce which ChaCha may implement, would not work for all RNGs supporting nonces since mixing may be required (although it might also work for PCG).

It seems an interesting idea finding a general method to support nonces in RNGs, but of course the simpler general method is simply to use part of the seed as the nonce, where required (Hc128 actually divides the 256-bit seed into 128-bit key and 128-bit IV internally).

@pitdicker
Copy link
Contributor

It seems an interesting idea finding a general method to support nonces in RNGs

Are you sure you are not over-thinking things? Because I understand it, a nonce has nothing to do with RNGs. It is already a concession to expose it for ChaCha.

Allowing configuration of the number of rounds via state
adjustment has disadvantages: undesirable manipulation of
the generator and interaction with CryptoRng.
Also restricts the counter to 64-bits (still allows 1 ZiB output)
The primary rationale is simpler specification of this RNG.
@dhardy
Copy link
Member Author

dhardy commented Apr 27, 2018

Rewritten and massively simplified:

  • remove set_rounds for now
  • replace set_counter with set_word_pos and set_stream
  • adjusted specification to 64-bit counter and 64-bit stream specifier

This is partly with @burdges comments in mind about keeping the specification simple and easy for other implementations to match.

Possibly we should add a test-vector involving streams and counters, though we sort-of have that already.

@dhardy dhardy mentioned this pull request Apr 27, 2018
33 tasks
}
}
}

impl CryptoRng for ChaChaCore {}

impl From<ChaChaCore> for ChaChaRng {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Worth adding to HC-128 and ISAAC too (eventually)?

Copy link
Contributor

@pitdicker pitdicker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad to say this is looking good!

I only have a few small comments.

/// nonce. A 64-bit counter over 64-byte (16 word) blocks allows 1 ZiB of output
/// before cycling, and the stream identifier allows 2<sup>64</sup> unique
/// streams of output per seed. Both counter and stream are initialized to zero
/// but may be set via [`set_word_pos`] and [`set_stream`].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you document that set_word_pos is only available with the nightly or i128_support features?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, but i128 support will be standard at some point in the near-ish future anyway.

Copy link
Member Author

@dhardy dhardy Apr 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since a u64 can represent 64 Exabytes, it feels a bit stupid using a u128 here — should we use a u64 instead and panic if the upper four bits of the counter are non-zero? This will suit the vast majority of users just fine, with only half the output size.

Note that in this case there is some rationale for allowing set/get counter directly too (arguably jumping to an arbitrary block address is a separate use case).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If u128 was stabilised a couple of months earlier, would you also worry about it?
It seems like a good property to me that all positions can be indicated and set.

/// Since the generated blocks are 16 words (2<sup>4</sup>) long and the
/// counter is 64-bits, the offset is a 68-bit number. Sub-word offsets are
/// not supported, hence the result can simply be multiplied by 4 to get a
/// byte-offset.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you document that the returned position is that of the previous value? And I have a small preference for returning the next value instead. Otherwise you get u64::MAX right after initialization.

Maybe also add a test that you can generate a couple of values twice using get_word_pos and set_word_pos?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you get u64::MAX right after initialization

Sounds like I should fix that!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is incorrect — set_word_pos(0) will set both parts of the counter to zero, then call reset (effectively triggering generation of values on next use). get_word_pos() will return zero when the counter is zero and index is high — this relies on the index never being zero, which seems to be an informal contract of the BlockRng code. Perhaps we should make that explicit, or test for zero in get_word_pos.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I read your code wrong. Yes, it is good 😄.

this relies on the index never being zero, which seems to be an informal contract of the BlockRng code.

The old ISAAC code, where this comes from, did use a zero index right after seeding. Because a new batch of results was already generated after seeding, and 0 values of those where used.

Your new generate_and_set can provide the same, and I see no harm in it.

Copy link
Member Author

@dhardy dhardy Apr 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. Annoying to have to check another thing, but best to be correct. 👍

Edit: I see my code does not rely on index not being zero, so no problem there.

self.0.reset(); // force recomputation on next use
/// This is initialized to zero; 2<sup>64</sup> unique streams of output
/// are available per seed/key.
pub fn set_stream(&mut self, stream: u64) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You decided to drop set_nonce? Or only to not add it for now?

Would it be enough to just explain here that converting a [u8, 8] nonce to a u64 little-endian stream does the trick?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes; RNG streams we potentially have use-cases for; nonces as you mentioned aren't usually used with RNGs. I guess I could mention this, but we don't support the IETF variant properly anyway due to counter increment, and probably that is the more widely used variant with nonces?


/// Generate a new set of results immediately, setting the index to the
/// given value.
pub fn generate_and_set(&mut self, index: usize) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, you agree. Given that I wanted to avoid/minimise breakage with the published API, this seemed like the best option.

@pitdicker pitdicker mentioned this pull request Apr 29, 2018
@dhardy
Copy link
Member Author

dhardy commented Apr 30, 2018

I addressed some of @pitdicker's review concerns and found a bug in BlockRng::fill_bytes' x86/64 impl. The fix isn't entirely optional but is probably better than adding another branch (we shouldn't reset index if we used less than one block of results).

@dhardy dhardy merged commit 30834a5 into rust-random:master Apr 30, 2018
@dhardy dhardy deleted the chacha branch April 30, 2018 18:49
@pitdicker
Copy link
Contributor

🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-API Breakage: API E-question Participation: opinions wanted
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants