Miscompilation: Equal pointers comparing as unequal #107975
Comments
Godbolt indicates that this is EarlyCSE in LLVM optimizing the pointer comparison to false, even though the two stack allocations have non-overlapping lifetime ranges. |
This seems related to pointer provenance. |
The bug itself or the cause of the bug? I don't know what the incorrect reasoning that LLVM is using here is, so I can't comment on the cause of the issue. But both Rust and LLVM clearly define pointer comparison to be address based, which means that I don't think we need to think about provenance in order to conclude that this is a miscompilation |
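For reference, here is a minimal sketch of what address-based comparison means for raw pointers. It deliberately uses pointers into a single live array, so no lifetime reasoning is involved and the miscompilation discussed in this issue should not come into play. The helper name `same_address` is hypothetical and only used for illustration.

```rust
/// Hypothetical helper: compares two thin raw pointers.
/// For thin raw pointers, `==` compares the addresses.
fn same_address(a: *const u8, b: *const u8) -> bool {
    a == b
}

fn main() {
    // A single live allocation; all pointers below point into it.
    let v: [u8; 16] = [0; 16];

    let first: *const u8 = &v[0];
    let also_first: *const u8 = v.as_ptr();
    let second: *const u8 = &v[1];

    // Two pointers to the same element, obtained in different ways.
    println!("first == also_first: {}", same_address(first, also_first));
    // Two pointers to different elements of the same array.
    println!("first == second:     {}", same_address(first, second));
}
```

The point of the sketch is only that the result depends on the addresses and nothing else, which is the property the reproducer in this issue appears to violate.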
tl;dr even if the docs were changed to take provenance into account when comparing for equality, there would still be a miscompilation, because evaluating the same expression twice leads to different results!
fn main() {
    let a: *const u8;
    let b: *const u8;
    {
        let v: [u8; 16] = [core::hint::black_box(0); 16];
        a = &(v[0]);
    }
    {
        let v: [u8; 16] = [core::hint::black_box(0); 16];
        b = &(v[0]);
    }
    println!("{a:?} == {b:?} evaluates to {}", a==b);
    println!("{a:?} == {b:?} evaluates to {}", a==b);
}
This was exactly the train of thought that led me to experiment, thereby uncovering the bug: “Can two pointers compare as equal, even when they belong to different allocations?” I experimented, figured out that they cannot, and tried to refactor my PS: In case it matters, my experiments were on the stable channel. |
In theory they could define pointer comparison to be non-deterministic, and then this would be okay. For a comparison that involves provenance, that is probably the case. But to my knowledge that's not how LLVM intends their ptr comparison to work. That last example is really strange though, why would it optimize only one of them? Also I was unable to reproduce this without printing, which is equally strange... |
I thought that was queer, so I experimented further. What I found is as follows:
let (a_, b_) = (a as u128, b as u128);
dbg!(a_ == b_, a_ ^ b_); outputs
to
|
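The experiment above leans on a basic invariant of integer arithmetic: two integers are equal exactly when their XOR is zero. A minimal sketch of that invariant as a check follows. The helper name `eq_and_xor_agree` is hypothetical, and the addresses are taken from a live array so that the result does not depend on the bug under discussion.

```rust
/// Hypothetical helper: returns true when `a == b` and `a ^ b == 0`
/// give the same answer, which should hold for any pair of integers.
fn eq_and_xor_agree(a: u128, b: u128) -> bool {
    (a == b) == ((a ^ b) == 0)
}

fn main() {
    // Addresses of two different elements of one live array.
    let v: [u8; 16] = [0; 16];
    let a = &v[0] as *const u8 as usize as u128;
    let b = &v[1] as *const u8 as usize as u128;

    println!("same address agrees:      {}", eq_and_xor_agree(a, a));
    println!("different address agrees: {}", eq_and_xor_agree(a, b));
}
```

A report where the equality says "unequal" while the XOR is zero is an observation that this invariant was broken by the optimizer.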
I tried some alternatives to // Produces true on following println
black_box(format_args!("{:?}", a));
//black_box(format_args!("{:?}", b));
//format!("{:?}", a);
// Doesn't
//format_args!("{:?}", black_box(a));
//black_box(format_args!("{:?}", black_box(a)));
//format!("{:?}", black_box(a)); |
The wild thing is that In fact looks like adding black_box actually enables the optimization?!? |
Reduced to |
Yeah, this isn't all too surprising, the difference between |
Result from my bisection points to a rollup. Out of that rollup likely #102232, I guess?
searched nightlies: from nightly-2022-01-01 to nightly-2023-02-14
bisected with cargo-bisect-rustc v0.6.5
Host triple: x86_64-unknown-linux-gnu
cargo bisect-rustc --start=2022-01-01 --script script.sh
Also, if I'm correct, this should tag more t-libs than t-compiler
@rustbot label -t-compiler +t-libs +t-libs-api |
Is it note-worthy that the issue can trigger without using |
How's that not surprising?^^ In terms of things like "exposing the pointer", both should be equivalent...
In fact even the original examples work without black-box, e.g.
fn main() {
    let a: *const u8;
    let b: *const u8;
    {
        let v: [u8; 16] = [0; 16];
        a = &(v[0]);
    }
    {
        let v: [u8; 16] = [0; 16];
        b = &(v[0]);
    }
    println!("{a:?} == {b:?} evaluates to {}", a==b);
    println!("{a:?} == {b:?} evaluates to {}", a==b);
} |
That just stabilizes black_box without changing its behavior. Presumably you need to add a feature flag to continue bisecting this code further into the past? Definitely looks like a t-compiler issue to me. |
Somewhat simpler, still demonstrating misbehaviour, again no Edit: updated even simpler. |
I whipped up a quick example for the Compiler Explorer, and tried it with the 10 newest and the 10 oldest stable versions of the compiler. All of them appear to fold the comparison to What's even more staggering is that this behaviour persists even if we use strict provenance and |
Here's an example without the prints in between, using a bunch of black boxes: https://godbolt.org/z/cKMra38a8 It appears that copying the value from either integer causes llvm to "realize" that they're actually equal. Before that it assumes they're not equal. Edit: actually only the one black box is needed ( |
Hmm you are all right here, the @rustbot label T-compiler -T-libs -T-libs-api P-high |
Yeah we know what it's triggered by: llvm/llvm-project#45725. It's a very hard to fix bug in LLVM boiling down to different parts of LLVM making contradicting assumptions about the semantics of |
Out of curiosity, I tried those two examples in the playground again, to see if the behaviour had changed from version to version. EDIT: Both examples currently fail with Original post preserved for history:
|
Part of the resolution to rust-lang#105107
Add a test for rust-lang#107975
The int is zero. But also not zero. This is so much fun.
This is a part of rust-lang#105107.
Initially I was going to just rebase rust-lang#108445, but quite a few things changed since then:
* The [mcve](rust-lang#105787 (comment)) used for rust-lang#105787 got fixed.[^upd2]
* You can't just `a ?= b` for rust-lang#107975 anymore. Now you have to `a-b ?= 0`. This is what this PR does. As an additional flex, it shows that three ways of converting a pointer to its address have this issue:
  1. `as usize`
  2. `.expose_provenance()`
  3. `.addr()`
* rust-lang#108425 simply got fixed. Yay.
As an aside, the naming for `addr_of!` is quite unfortunate in the context of provenance APIs, because `addr_of!` gives you a pointer, but what provenance APIs refer to as "address" is the `usize` value. Oh well.
UPD1: GitHub is incapable of parsing rust-lang#107975 in the PR name, so let's add it here.
[^upd2]: UPD2: [The other mcve](rust-lang#105787 (comment)) does not work anymore either, saying "this behavior recently changed as a result of a bug fix; see rust-lang#56105 for details."
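To make the three conversions named in the PR description concrete, here is a minimal sketch that applies each of them to a pointer into a live array, plus the `a-b ?= 0` form of the comparison. This is not the test added by the PR. The helper names `addresses` and `difference_is_zero` are hypothetical, and the sketch assumes a toolchain where `.addr()` and `.expose_provenance()` are stable on raw pointers (Rust 1.84 or later, to my knowledge).

```rust
/// Hypothetical helper: returns the address of `p` obtained through the
/// three conversions named in the PR description, in that order.
fn addresses(p: *const u8) -> (usize, usize, usize) {
    (p as usize, p.expose_provenance(), p.addr())
}

/// Hypothetical helper: the `a - b ?= 0` form of an equality check.
/// `wrapping_sub` avoids an overflow panic when `b > a` in debug builds.
fn difference_is_zero(a: usize, b: usize) -> bool {
    a.wrapping_sub(b) == 0
}

fn main() {
    // A live allocation, so the addresses below are well defined and stable
    // for the duration of `main`.
    let v: [u8; 16] = [0; 16];
    let p: *const u8 = v.as_ptr();

    let (cast, exposed, addr) = addresses(p);
    println!("as usize:            {cast:#x}");
    println!("expose_provenance(): {exposed:#x}");
    println!("addr():              {addr:#x}");
    println!("cast - addr == 0:    {}", difference_is_zero(cast, addr));
}
```

For a live allocation all three conversions should yield the same integer. They differ in what they do to provenance, not in the address they return, which is why the PR description notes that all three are affected.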
Rollup of 4 pull requests
Successful merges:
- rust-lang#127003 (Add a test for rust-lang#107975)
- rust-lang#127763 (Make more Windows functions `#![deny(unsafe_op_in_unsafe_fn)]`)
- rust-lang#127813 (Prevent double reference in generic futex)
- rust-lang#127847 (Reviewer on vacation)
r? `@ghost`
`@rustbot` modify labels: rollup
I tried this code:
I expected to see this happen: Either the pointers (when cast to integers) are the same and the comparison is `true`, or they are not the same and the comparison is `false`.
Instead, this happened: It printed:
(140728325198984, 140728325198984, false)
Upstream LLVM issue
Meta
Reproduced via `rustc +nightly -Copt-level=3 test.rs && ./test`.
`rustc --version --verbose`:
Also reproduces on master.
@rustbot label +I-unsound +T-compiler +A-llvm