slice: avoid calling memcmp with size 0 #113435
r? @scottmcm (rustbot has picked a reviewer for you, use r? to override)
I would prefer if we did the same thing we already do for |
We don't do that for memcpy though, LLVM does. We just call LLVM's We can of course start making such an assumption directly from the Rust side as well. Then that should be documented somewhere though. Also, does that mean all Rust code calling Cc @rust-lang/opsem |
What about adding a new intrinsic for this which a codegen backend may lower to |
Sure, then Miri could apply different rules. However, in terms of documenting this, that would just move the responsibility for this assumption from t-libs to t-compiler.
LLVM assigns defined behavior to memcmp with size 0. While I can't figure out how to make it insert a memcmp call with size 0, I don't think it is unrealistic for this to happen now or in the future. And when it happens, t-compiler still has the burden of this assumption even if t-libs decided to avoid making this assumption in the standard library.
Yeah, but that's their intrinsic, not ours. ;)
Should we add a rustc intrinsic for That would let the length check be a codegen issue, rather than a library issue. (I visited this from my review queue, but I tweaked a bunch of labels because it seemed like this wasn't just waiting on code review, but more a policy discussion. Feel free to set it back when it's in a "just confirm the code" state.)
That would help in one case: if we don't want Rust code calling I don't know which team is the one to make that call... T-lang? |
We discussed this in the libs meeting today. We don't want to add an unnecessary check for something that will never be a real issue in practice. All real We see 2 potential ways to resolve this:
I'll make a PR to make an intrinsic for it. That'll give a nice place to write our expectations for it, and leave it up to the backend to guarantee those things as needed. EDIT: PR #114382
I was pointed to https://doc.rust-lang.org/nightly/core/index.html#how-to-use-the-core-library as a good place to document this assumption. Do you want to do that in the same PR or should I make a separate PR?
I opened #114412 for the documentation part. So this PR can be closed now.
…r=cjgillot Add a new `compare_bytes` intrinsic instead of calling `memcmp` directly As discussed in rust-lang#113435, this lets the backends be the place that can have the "don't call the function if n == 0" logic, if it's needed for the target. (I didn't actually *add* those checks, though, since as I understood it we didn't actually need them on known targets?) Doing this also let me make it `const` (unstable), which I don't think `extern "C" fn memcmp` can be. cc `@RalfJung` `@Amanieu`
C `mem` function shims: consistently treat "invalid" pointers as UB Depends on rust-lang/rust#113435.
This is more of an invitation for discussion than "I am sure we need to change this": according to C (C18 §7.1.4), it is UB to call `memcmp` with an "invalid" pointer. In Rust, we consider `ptr::invalid(42)` a perfectly valid address for an empty slice of `u8`. Whether C considers such a pointer (roughly equivalent to `(int*)42`) to be "invalid" is not entirely clear to me. The section on "Cast operators" in C is very short, so I assume int-to-ptr casts are specified (or rather, mostly left unspecified) elsewhere. In the future things might get worse if t-opsem decides that more references are valid for size 0 (including OOB/UAF pointers) -- those are definitely "invalid" in C. But then, it is not clear which part of those rules applies when the function is called from another language.

The safe thing to do is to avoid calling `memcmp` when the size is 0. However, I don't know the cost of this (in terms of performance, in particular).

Cc @rust-lang/libs
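
For illustration, here is a minimal sketch of what "avoid calling `memcmp` when the size is 0" could look like in user code. It is not the code from this PR or from the standard library: the helper name `guarded_memcmp` is hypothetical, and the `extern "C"` declaration assumes the platform libc that `std` already links against provides `memcmp` with the usual C signature.

```rust
use std::cmp::Ordering;
use std::os::raw::{c_int, c_void};

extern "C" {
    // Assumption: the C library linked by `std` exports `memcmp`
    // with the standard signature (size_t mapped to usize).
    fn memcmp(a: *const c_void, b: *const c_void, n: usize) -> c_int;
}

/// Hypothetical helper: lexicographic comparison of two byte slices
/// that never hands `memcmp` a zero-length request.
fn guarded_memcmp(a: &[u8], b: &[u8]) -> Ordering {
    let n = a.len().min(b.len());
    if n == 0 {
        // At least one slice is empty; its pointer may be dangling
        // (for example `ptr::invalid(42)`), so do not pass it to C.
        return a.len().cmp(&b.len());
    }
    // SAFETY: both slices are valid for reads of `n > 0` bytes.
    let r = unsafe { memcmp(a.as_ptr().cast(), b.as_ptr().cast(), n) };
    // A nonzero result decides the order; otherwise the shorter slice sorts first.
    r.cmp(&0).then(a.len().cmp(&b.len()))
}
```

The extra branch on `n == 0` is exactly the cost in question: it is cheap, but it sits on a hot path for short slices, which is why its performance impact would need measuring.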