rand performs unaligned memory access (and invalid raw ptr usage) #779

Closed
RalfJung opened this issue Apr 19, 2019 · 7 comments · Fixed by #783

Comments

@RalfJung
Contributor

The following program fails to run in Miri with an unaligned memory access:

use rand::Rng;

fn main()
{
    let mut rng = rand::thread_rng();
    let _val = rng.gen::<i32>();
    let _val = rng.gen::<usize>(); // causes unaligned memory access
}

This is caused by this line of code:

unsafe { *(&results[index] as *const u32 as *const u64) }

That casts a 4-aligned pointer to a *const u64 and then uses it for a read; u64 reads must be 8-aligned. There is a comment in the source indicating that this "requires little-endian CPU supporting unaligned reads", but that is not sufficient: the alignment requirement is made by LLVM, not by the CPU. In Rust, all accesses must be aligned on all platforms, no matter what the CPU says; any unaligned access causes UB. Unaligned accesses are only allowed with read_unaligned and write_unaligned.
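
For context, a minimal sketch of the distinction (not rand's code): `bytes` below is only 1-aligned, so a plain dereference of the u64 pointer would be UB, while ptr::read_unaligned performs the same read without any alignment requirement.

use core::ptr;

fn read_first_u64(bytes: &[u8; 16]) -> u64 {
    let p = bytes.as_ptr() as *const u64;
    // `unsafe { *p }` would be UB whenever p is not 8-aligned, no matter
    // what the CPU tolerates; read_unaligned has no alignment requirement.
    unsafe { ptr::read_unaligned(p) }
}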

There is another problem with this code as well: It violates pointer provenance rules. &results[index] as *const u32 creates a raw pointer that may be used to access this particular element, but it is not legal to use this pointer to access neighboring elements that were never cast to a raw pointer. You could try to fix this with something like &results[index..index+1] as *const [u32] as *const u32 as *const u64.
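
A minimal sketch of that combined fix, with `results` and `index` standing in for the fields in the real code; the slice here spans both u32 elements the eight-byte read touches, so the derived pointer's provenance covers the whole access, and read_unaligned avoids the alignment assertion:

use core::ptr;

fn read_u64_at(results: &[u32], index: usize) -> u64 {
    // Derive the raw pointer from a slice covering both words of the read,
    // so it is valid for all eight bytes; the buffer is only 4-aligned,
    // hence read_unaligned.
    let pair: *const u32 = results[index..index + 2].as_ptr();
    unsafe { ptr::read_unaligned(pair as *const u64) }
}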

Is the performance of the "fallback code path" really so much worse that all these hacks are worth it?

@RalfJung
Contributor Author

fill_bytes and try_fill_bytes in the same file also seem to be based on the incorrect assumption that unaligned accesses are okay in Rust when compiling for x86:

    // As an optimization we try to write directly into the output buffer.
    // This is only enabled for little-endian platforms where unaligned writes
    // are known to be safe and fast.
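
For comparison, a byte-wise copy (sketched here with made-up names, not the rand_core API) writes generated words into an arbitrary &mut [u8] with no alignment assumption at all; note that on big-endian targets the resulting byte order differs from per-word reads.

use core::ptr;

fn fill_from_words(words: &[u32], dest: &mut [u8]) {
    // Copy at most as many bytes as both buffers hold; a byte copy imposes
    // no alignment requirement on `dest`.
    let n = dest.len().min(words.len() * 4);
    unsafe {
        ptr::copy_nonoverlapping(words.as_ptr() as *const u8, dest.as_mut_ptr(), n);
    }
}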

@dhardy
Member

dhardy commented Apr 19, 2019

Thanks @RalfJung; some good questions here.

This particular example is around 13% faster with the HC128 generator on my system (gen_u64_hc128 benchmark), but with much less difference for other generators (1-4%). In my opinion, that's enough of a difference to be worth taking if it's free, but not if it causes UB.

Following your suggestions, I get around the same performance with this version of the code:

            if cfg!(any(target_arch = "x86", target_arch = "x86_64")) {
                // requires little-endian CPU supporting unaligned reads:
                let ptr: *const u64 = &results[index..index+1] as *const [u32]
                    as *const u32 as *const u64;
                // read_unaligned is core::ptr::read_unaligned
                unsafe { read_unaligned(ptr) }
            }

Is this acceptable in your opinion? Knowing where to draw the line between unsafe platform-specific optimisations and safe, portable code isn't always easy.

The same performance observations appear to apply to the fill_bytes code: HC128 is around 10% faster, while other generators aren't affected much.

Of note is that HC128 is currently the algorithm behind thread_rng, but will likely be replaced in the future.

@RalfJung
Contributor Author

> Is this acceptable in your opinion?

This looks good; I'll try to run it in Miri as well. Do you have this in a branch somewhere?

> Of note is that HC128 is currently the algorithm behind thread_rng, but will likely be replaced in the future.

That's in fact how I noticed, as you can see from my example above. I should probably try to run the entire rand test suite in Miri... will it exercise all these code paths irrespective of what thread_rng chooses?

@RalfJung
Contributor Author

I can confirm that RalfJung@8303309 makes my small tests pass in Miri.

@dhardy
Member

dhardy commented Apr 19, 2019

Running the entire test suite should probably cover most of the unsafe code, but it would be worth testing all the sub-crates in the repo (e.g. BlockRng64 is only used by Isaac64Rng, which is not a dependency of the main crate).

Thanks again for stepping in to fix the unsafe pointer access; as you can tell, we could do with a bit of help here.

@RalfJung
Contributor Author

I'm afraid the fill_bytes problem is much harder to fix. This creates an unaligned reference at

rand/rand_core/src/block.rs, lines 234 to 237 at 9828cdf:

let dest_u32: &mut R::Results = unsafe {
    &mut *(dest[filled..].as_mut_ptr() as *mut <R as BlockRngCore>::Results)
};

and then passes that to generate. References must always be aligned even if they are not used.

So to fix this, generate would have to take a raw pointer instead. But that seems to be a trait method so that's not that easy...
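
A minimal illustration of that rule (unrelated to rand itself); Miri flags the reference creation even though the value is never read:

fn main() {
    let data = [0u32; 4];
    // Offset by one byte: this pointer is definitely not 4-aligned.
    let p = (data.as_ptr() as *const u8).wrapping_add(1) as *const u32;
    // UB: a &u32 must be aligned even if it is never dereferenced.
    let _r: &u32 = unsafe { &*p };
}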

@dhardy
Member

dhardy commented Apr 20, 2019

Here, filled should always be a multiple of four, but of course that's not enough since the dest argument may not be 32-bit aligned.

I'm not sure how important the direct copy is to performance (it's only going to be encountered with large fill requests), so we could refactor this to generate into the self.results buffer or a temporary and then copy from there.
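
A rough sketch of that temporary-buffer idea, with an illustrative closure standing in for a block RNG's generate; this is not rand_core's actual API:

use core::{mem, ptr};

fn fill_bytes_via_temp(mut generate: impl FnMut(&mut [u32; 16]), dest: &mut [u8]) {
    let mut buf = [0u32; 16]; // properly aligned temporary block
    let mut filled = 0;
    while filled < dest.len() {
        generate(&mut buf);
        let chunk = (dest.len() - filled).min(mem::size_of_val(&buf));
        // Byte-wise copy into `dest`, which need not be 4-aligned.
        unsafe {
            ptr::copy_nonoverlapping(
                buf.as_ptr() as *const u8,
                dest.as_mut_ptr().add(filled),
                chunk,
            );
        }
        filled += chunk;
    }
}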

Refactoring the trait isn't completely out of the question (it doesn't have a massive number of users), but I don't wish to force block RNG implementations to use ugly unsafe code, so I would prefer that generate not take a raw pointer.

pietroalbini added a commit to pietroalbini/rust that referenced this issue Aug 7, 2019
bump rand in libcore/liballoc test suites

This pulls in the fix for rust-random/rand#779, which trips Miri when running these test suites.

`SmallRng` (formerly used by libcore) is no longer built by default; it needs a feature gate. I opted to switch to `StdRng` instead. Or should I enable the feature gate?
Centril added a commit to Centril/rust that referenced this issue Aug 8, 2019
bump rand in libcore/liballoc test suites