Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Function pointer provenance #2926

Closed
VorfeedCanal opened this issue Jun 13, 2023 · 7 comments
Closed

Function pointer provenance #2926

VorfeedCanal opened this issue Jun 13, 2023 · 7 comments

Comments

@VorfeedCanal
Copy link

I'm trying to pack bunch of function pointers to save memory and have hit strange issue:

fn foo(x: i32) -> i32 {
    x * 2
}

pub fn main() {
    let foo_addr: usize = unsafe {
        std::mem::transmute::<fn(x:i32)->i32, usize>(foo)
    };
    let foo_restored: fn(x:i32)->i32 = unsafe {
        std::mem::transmute::<usize, fn(x:i32)->i32>(foo_addr)
    };
    println!("{}", foo_restored(2));
}

Here Miri complains: error: Undefined Behavior: out-of-bounds pointer use: 0x2794b[noalloc] is a dangling pointer (it has no provenance).

But fn doesn't have with_addr or anything similar. And it's not clear if it even should have it: at least in my case all functions are coming from the binary and thus all functions pointers are 'static pointers… am I wrong? Should function pointers have provenance or not?

I'm just not sure what's the actual rules here and how to fix the issue.

@saethlin
Copy link
Member

saethlin commented Jun 13, 2023

It's not clear if function pointers have provenance. So nobody is really sure what the rules are. However...

all functions are coming from the binary and thus all functions pointers are 'static pointers… am I wrong?

Yes. The fact that the functions are "in the binary" is irrelevant. You are trying to use an observation about a representation that Rust can be lowered to in order to make an argument about Rust. Similarly, lifetime is also irrelevant. Pointers to statics still require provenance to be dereferenced.

The reason that it's unclear if function pointers should have provenance or not is that it's not clear if giving function pointers provenance permits any useful optimizations; the properties of an executable have nothing to do with it.


If you transmute to *const () instead of usize, the code works fine, because the typed copy into the *const () (or really any pointer type, but *const () is the usual choice for an "I don't care" pointee type) does carry the provenance. Wheras the typed copy into a usize would drop the provenance.

If you really need pass this around as a usize (which I would advise against, but the choice is yours) you can use expose semantics. So:

fn foo(x: i32) -> i32 {
    x * 2
}

pub fn main() {
    let foo_addr: *const () = unsafe {
        std::mem::transmute::<fn(x:i32)->i32, *const ()>(foo)
    };

    let exposed = foo_addr as usize;

    let unexposed = exposed as *const ();

    let foo_restored: fn(x:i32)->i32 = unsafe {
        std::mem::transmute::<*const (), fn(x:i32)->i32>(unexposed)
    };
    println!("{}", foo_restored(2));
}

@RalfJung
Copy link
Member

Yeah, the recommended way to handle function pointers is to always transmute to/from raw pointers, and then use raw pointer APIs from there on.

@VorfeedCanal
Copy link
Author

If you really need pass this around as a usize (which I would advise against, but the choice is yours) you can use expose semantics.

It's not as if I want to convert function pointer to usize variable for the sake of it.

I just have bunch of functions which receive u8 slices and want to put them into one array: pointer to function, slice of data for that function, pointer to function, slice of data for that function and so on.

It's easily doable in Rust without Miri, but I don't know how to do that with Miri.

I tried to change your code just a tiny bit and it gives warnings:

fn foo(x: i32) -> i32 {
    x * 2
}

pub fn main() {
    let foo_addr: *const () = unsafe {
        std::mem::transmute::<fn(x:i32)->i32, *const ()>(foo)
    };

    let packed = (foo_addr as usize ^ 1).to_ne_bytes();

    println!("{packed:?}");

    let exposed = usize::from_ne_bytes(packed) ^ 1;

    let unexposed = exposed as *const ();

    let foo_restored: fn(x:i32)->i32 = unsafe {
        std::mem::transmute::<*const (), fn(x:i32)->i32>(unexposed)
    };
    println!("{}", foo_restored(2));
}

And if I try to add provenance it no longer works:

#![feature(strict_provenance)]

fn foo(x: i32) -> i32 {
    x * 2
}

fn bar() -> () {
}


pub fn main() {
    let foo_addr: *const () = unsafe {
        std::mem::transmute::<fn(x:i32)->i32, *const ()>(foo)
    };

    let packed = (foo_addr as usize ^ 1).to_ne_bytes();

    println!("{packed:?}");

    let exposed = usize::from_ne_bytes(packed) ^ 1;

    let bar_addr: *const () = unsafe {
        std::mem::transmute::<fn()->(), *const ()>(bar)
    };

    let unexposed = bar_addr.with_addr(exposed);

    let foo_restored: fn(x:i32)->i32 = unsafe {
        std::mem::transmute::<*const (), fn(x:i32)->i32>(unexposed)
    };
    println!("{}", foo_restored(2));
}

@RalfJung
Copy link
Member

RalfJung commented Jun 14, 2023

I tried to change your code just a tiny bit and it gives warnings:

Indeed, storing pointers in integers will warn in Miri. The warning explains why:

   = help: This program is using integer-to-pointer casts or (equivalently) `ptr::from_exposed_addr`,
   = help: which means that Miri might miss pointer bugs in this program.
   = help: See https://doc.rust-lang.org/nightly/std/ptr/fn.from_exposed_addr.html for more details on that operation.
   = help: To ensure that Miri does not miss bugs in your program, use Strict Provenance APIs (https://doc.rust-lang.org/nightly/std/ptr/index.html#strict-provenance, https://crates.io/crates/sptr) instead.
   = help: You can then pass the `-Zmiri-strict-provenance` flag to Miri, to ensure you are not relying on `from_exposed_addr` semantics.
   = help: Alternatively, the `-Zmiri-permissive-provenance` flag disables this warning.

If you don't want to use with_addr etc, then disabling this warning is fine, but you get fewer assurances from Miri.

And if I try to add provenance it no longer works:

Yeah, you are using bar's provenance with foo's address, that is wrong, so Miri complains. That's the point of provenance: it has to match the object at the given address, else you get UB. The docs explain this in more detail.

@RalfJung
Copy link
Member

RalfJung commented Jun 14, 2023

I just have bunch of functions which receive u8 slices and want to put them into one array: pointer to function, slice of data for that function, pointer to function, slice of data for that function and so on.

Due to provenance, u8 is not a "universal type that can hold all initialized bytes". Using u8 loses provenance. u8 is for integers, not pointers. If you must use u8, then doing ptr as usize/usize as ptr casts is your only choice, and you will have to silence the Miri warning.

The way to avoid this loss of provenance and avoid the warning is to use a different type, such as MaybeUninit<u8>, which can hold provenance. MaybeUninit<u8> is indeed a universal "byte" type that can hold all values a byte can have in Rust.

@VorfeedCanal
Copy link
Author

If you must use u8, then doing ptr as usize/usize as ptr casts is your only choice, and you will have to silence the Miri warning.

That's what I have already did. As you can guess I don't have tons of these casts spread around, there are just one function which adds function plus slice to the buffer and another which pulls it out and calls it.

I just added hashmap from address of function to function and miri is obviously happy. Just looked a bit silly for me to do all that dance with provenance for an object which is not ever allocated or deallocated thus I decided to ask here.

But given then fact that I'm doing that for yet-another-JIT… I guess having provenance for functions is important. I just wish all statically allocated functions had the same provevance: they couldn't ever be allocated or deallocated, why would these need provenance? But then, one may argue that the same idea may be applicable to any &'static reference thus it's no longer about functions.

But I guess that's low-priority issue. Thanks for the answers.

@RalfJung
Copy link
Member

Closing as "Miri behaves as intended"; rust-lang/unsafe-code-guidelines#340 is where we track whether function pointers should have provenance at all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants