rand performs unaligned memory access (and invalid raw ptr usage) #779
Thanks @RalfJung; some good questions here. This particular example is around 13% faster with the HC128 generator on my system. Following your suggestions, I get around the same performance with this version of the code:

```rust
if cfg!(any(target_arch = "x86", target_arch = "x86_64")) {
    // requires little-endian CPU supporting unaligned reads:
    let ptr: *const u64 = &results[index..index+1] as *const [u32] as *const u32 as *const u64;
    unsafe { read_unaligned(ptr) }
}
```

Is this acceptable in your opinion? Knowing where to draw the line between […]. The same performance observations appear to apply to the […]. Of note is that […]
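For reference, here is a self-contained sketch of the full pattern being discussed (illustrative names, not rand's exact internals): a `cfg!`-gated fast path plus the portable fallback that assembles the `u64` from two `u32` halves. Note the subslice in this sketch spans two elements, so the derived pointer's provenance covers all eight bytes that are read:

```rust
use std::ptr::read_unaligned;

// Sketch only: combine two adjacent u32 results into one u64.
fn read_u64(results: &[u32], index: usize) -> u64 {
    if cfg!(any(target_arch = "x86", target_arch = "x86_64")) {
        // Fast path: read_unaligned tolerates the 4-byte alignment of a
        // u32 buffer, and casting from the subslice keeps provenance over
        // both elements.
        let ptr = &results[index..index + 2] as *const [u32] as *const u32 as *const u64;
        unsafe { read_unaligned(ptr) }
    } else {
        // Portable fallback, equivalent on little-endian targets.
        results[index] as u64 | ((results[index + 1] as u64) << 32)
    }
}

fn main() {
    let results = [1u32, 2, 3, 4];
    assert_eq!(read_u64(&results, 0), 0x0000_0002_0000_0001);
    println!("ok");
}
```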
This looks good, I'll try to run it in Miri as well. Do you have this in a branch somewhere?
That's in fact how I noticed, as you can see from my example above. I should probably try to run the entire rand test suite in Miri... will it exercise all these code paths irrespective of what […]?
I can confirm that RalfJung@8303309 makes my small tests pass in Miri.
Running the entire test suite should probably cover most of the […]. Thanks again for stepping in to fix the […]!
I'm afraid the code at rand/rand_core/src/block.rs, lines 234 to 237 (in 9828cdf), […] and then passes that to `generate`. References must always be aligned even if they are not used. So to fix this, […]
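The point that merely *creating* an unaligned reference is undefined behavior, even if it is never dereferenced, can be illustrated with a small sketch (hypothetical types, unrelated to rand's actual code). The fix is to go straight from a place expression to a raw pointer with `addr_of!`, which carries no alignment requirement until the read happens:

```rust
use std::ptr::{addr_of, read_unaligned};

// Hypothetical packed type (not from rand): `value` sits at offset 1,
// so it is misaligned for u64.
#[repr(C, packed)]
struct Packed {
    tag: u8,
    value: u64,
}

// Reads the misaligned field without ever materializing a reference.
fn read_value(p: &Packed) -> u64 {
    // UB alternative (flagged by Miri, and rejected by newer compilers):
    //     let r: &u64 = &p.value; // creates an unaligned reference
    let ptr = addr_of!(p.value); // raw pointer: no alignment requirement yet
    unsafe { read_unaligned(ptr) } // explicitly unaligned read: OK
}

fn main() {
    let p = Packed { tag: 0, value: 42 };
    assert_eq!(read_value(&p), 42);
    println!("ok");
}
```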
Here, I'm not sure how important the direct copy is to performance (it's only going to be encountered with large fill requests), hence we could refactor this to generate to the […]. Refactoring the trait isn't completely out of the question (it doesn't have a massive number of users), but I don't wish to force block RNG implementations to use ugly unsafe code, and would therefore prefer […].
bump rand in libcore/liballoc test suites This pulls in the fix for rust-random/rand#779, which trips Miri when running these test suites. `SmallRng` (formerly used by libcore) is no longer built by default, it needs a feature gate. I opted to switch to `StdRng` instead. Or should I enable the feature gate?
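For context, the feature gate mentioned above would be enabled in `Cargo.toml`; a sketch (the feature is named `small_rng` in rand 0.7):

```toml
[dev-dependencies]
rand = { version = "0.7", features = ["small_rng"] }
```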
The following program fails to run in Miri with an unaligned memory access: […]
This is caused by this line of code: rand/rand_core/src/block.rs, line 186 (in 852988b). That casts a 4-aligned pointer to a `*const i64` and then uses that for a read. `i64` reads must be 8-aligned. There is a comment in the source indicating that this "requires little-endian CPU supporting unaligned reads", but that is not sufficient: the alignment requirement is made by LLVM, not by the CPU. In Rust, all accesses must be aligned on all platforms, no matter what the CPU says; any unaligned access causes UB (on all platforms). Unaligned accesses are only allowed with `read_unaligned` and `write_unaligned`.

There is another problem with this code as well: it violates pointer provenance rules. `&results[index] as *const u32` creates a raw pointer that may be used to access this particular element, but it is not legal to use this pointer to access neighboring elements that were never cast to a raw pointer. You could try to fix this with something like `&results[index..index+1] as *const [u32] as *const u32 as *const u64`.

Is the performance of the "fallback code path" really so much worse that all these hacks are worth it?
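A minimal sketch of the provenance distinction (assuming a `results` buffer of `u32` like the one in `block.rs`; the subslice in this sketch is widened to two elements so its provenance covers all eight bytes read):

```rust
use std::ptr::read_unaligned;

fn main() {
    let results = [1u32, 2, 3, 4];
    let index = 1;

    // Questionable: a pointer derived from a single element; using it to
    // read into the *next* element steps outside that element's provenance.
    //     let bad = &results[index] as *const u32 as *const u64;

    // Better: derive the pointer from a subslice covering every byte that
    // will be read, then read without assuming alignment.
    let ptr = &results[index..index + 2] as *const [u32] as *const u32 as *const u64;
    let v = unsafe { read_unaligned(ptr) };

    // Cross-check against assembling the value from the two u32s.
    let expect = if cfg!(target_endian = "little") {
        results[index] as u64 | ((results[index + 1] as u64) << 32)
    } else {
        ((results[index] as u64) << 32) | results[index + 1] as u64
    };
    assert_eq!(v, expect);
    println!("ok");
}
```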