semantics of black_box and clobber in the memory model #311
Comments
These functions, by definition, are no-ops to the virtual machine. All they are, are hints to the optimizer. The definition of these functions is implementation-defined. It's similar to the |
I tend to agree with that position, though then their docs need to be much more carefully worded, and it may not make sense to put them in std. But the underlying questions -- what the effects of |
@rkruppe It is all implementation defined. Defining |
Defining |
@rkruppe what could we define? Beyond it being UB to store to a memory location you don't have mutable access to, or load from a memory location you don't have access to, and it being fine to not do that. Assembler is completely outside the bounds of a reasonable memory model - the best you can do is treat it as any other linked thing, and force the visible effects of the "call" to be valid under the Rust object/memory model. |
@ubsan That's pretty much what I was thinking of and it's already something more useful than "it's all implementation defined"! Saying precisely which memory locations the inline asm can and can't affect already tells us in which cases it can't possibly block optimizations, and (looking beyond the scope of |
@rkruppe Yeah, fair. I'd define it similarly to how it'd be defined to call out to C with. for |
@ubsan |
@gnzlbg oh. Then there's like, literally nothing that |
It's a side-effect that reads all memory and does something observable with it - but it does not alter any memory that the program can observe. |
This is off-topic but I keep forgetting to point it out so I'll do it now: that's not what clobbering usually means, and not what |
@rkruppe |
@ubsan When applied to "memory as a whole", yes -- the term is also commonly applied to specific memory locations and to registers. Although in the context of inline asm I wouldn't call it an optimizer hint. But I don't know what @gnzlbg has envisioned for the proposed function in |
Yes, sorry for the confusion, the RFC currently gives it the
semantics, so it makes sense to me to call that clobber. For the purpose of benchmarking,
Sorry for calling that Sorry again for the confusion. |
So what would the
|
honestly, it seems like you could just explain how it hints the optimizer in the |
From an optimization standpoint, if I had to describe these functions in the model, I'd describe them as calls to inherently unknown functions -- functions that the compiler cannot make any assumptions about, and thus has to be maximally pessimistic in terms of optimizations (or, equivalently, functions that just block any sort of interprocedural analysis). So, I would think that pub fn foo(...) {
...
clobber();
...
} can be optimized exactly the same way that the following can be optimized pub fn foo(..., clobber: fn()) {
...
clobber();
...
} Similar for In particular, However, I think the second example @gnzlbg gave cannot reasonably work: {
let tmp = x;
black_box(&tmp);
clobber();
} Here, {
let tmp = 3;
bar(&tmp);
let tmp2 = tmp;
baz();
return tmp == tmp2; // optimized to true (tmp will stay 3 all the time)
} Imagine So, I agree with @ubsan when they say
Calling C via FFI is another example of calling an unknown function. The funny thing is that in C, if I follow the same rules I laid out above, I do get the behavior that @gnzlbg wants for int tmp = 3;
bar(&tmp);
int tmp2 = tmp;
baz(); // bar could have stored the pointer to `tmp` somewhere accessible to baz, so this can change `tmp`
return (tmp == tmp2); // not optimizable The reason for this difference is that Rust's stronger type system enables the optimizer to reason about aliasing without interprocedural analysis, making optimizations of pointer operations much simpler than they are in C. That's the entire point of the unsafe code guidelines team's struggle with the aliasing rules. Mostly, this is beneficial, but if you want to inhibit optimizations of course this becomes a problem. Depending on how the rules around raw pointers work out, it may be that calling Also, @gnzlbg what do you think about other cases that concern Rust's aliasing rules? For example fn foo2(x: &i32) -> bool {
let y = *x;
clobber(); // cannot possibly write to x as we have a shared reference
y == *x // optimized to true
} Again a C compiler would have to assume that |
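The "call to an unknown function" reading described above can be sketched in today's Rust. This is only an illustration, not anything proposed in the thread: `foo2_with_unknown`, `does_nothing` and `run_foo2_sketch` are hypothetical names, and `std::hint::black_box` is used just to keep the callee from being trivially known at the call site.

```rust
use std::hint::black_box;

/// Same shape as `foo2` from the comment above, with the unknown call made
/// explicit as a function pointer (`unknown` is a hypothetical name).
fn foo2_with_unknown(x: &i32, unknown: fn()) -> bool {
    let y = *x;
    // Opaque call: it takes no arguments, and `x` is a shared reference,
    // so under Rust's aliasing rules it must not write to `*x`.
    unknown();
    y == *x
}

/// Hypothetical stand-in for an opaque callee that does nothing.
fn does_nothing() {}

/// Calls the sketch with the function pointer passed through `black_box`.
fn run_foo2_sketch() -> bool {
    let value = 3;
    let unknown: fn() = black_box(does_nothing as fn());
    foo2_with_unknown(&value, unknown)
}
```

Whatever the compiler does with the comparison, any callee that respects the aliasing rules makes the function return `true`, which is why folding it is on the table.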
@RalfJung I mostly agree, but slightly disagree about the way the compiler gets to that point. Specifically, I don't think it's to do with the function returning - I don't think it's invalid for |
@RalfJung your examples are great, thank you! If
then: unsafe {
let mut tmp = 3;
black_box(&mut tmp as *mut _);
let tmp2 = tmp; // A
clobber();
return tmp == tmp2; // B
} are the following statements:
true or is this behavior something reasonable to expect from the memory models that have been explored in this repo?
I think that That is, for the For example, one could pass fn foo3(x: &i32) -> bool {
unsafe {
black_box(x as *const _ as *mut _); // OK: writes &mut x to memory
let y = *x;
clobber(); // modifies *x via *&mut -> UB
y == *x
}
} However, for writing benchmarks, we don't need
which we can then use: fn foo4(x: &mut i32) -> bool {
unsafe {
black_box(x as *mut _); // OK: writes &mut x to memory
*x = 2;
global_side_effect(); // OK: reads *x via *&mut but does not modify it.
// ^^ forces the compiler to emit a store for *x = 2
2 == *x // compiler can optimize this to true
}
} |
Ah, this is "check only on usage" vs. "check all the time" again. The difference between our two models is whether the compiler is apparent in the following code: {
let mut tmp = 3;
bar(&tmp);
baz();
tmp = 4;
} Assuming
Yes that sounds reasonable. This is behavior that (I think) we would want to allow unsafe code to have.
This is a subtle one. Already small variants of this should probably not be allowed. Consider unsafe {
let mut tmp = 3;
black_box(&mut tmp as *mut _);
let tmp_ptr = &mut tmp;
*tmp_ptr = 3;
clobber();
return *tmp_ptr; // Can be optimized to return 3
} The reasoning here would be that by writing For your concrete code, I am not sure. I am worried people will expect it to work, which means we better make it work. But see below for why that's rather funny.
That makes me happy =D
So this would be a function that can do strictly less than clobber? It has the same signature, and clobber is the "most unknown function possible" with that signature (I think).
Ah, this is an interesting one, actually not too dissimilar from my example above. It is not at all clear whether This is my understanding of (some variant of) the ACA (asserting-conflicting-access) model: (A) asserts that the reference is valid here, The take-away is that the Rust memory model is pretty strict about what memory you can even read, and if you want The first example in your post skirts the edge of an ACA violation: Instead of using a reference, it uses the original variable again. It seems strange to let the rules be any different then -- a very surprising subtle difference. So maybe, if your example should be okay (as in, {
let mut tmp = 3;
safe_black_box(&mut tmp as *mut _);
let tmp2 = tmp; // A
safe_clobber();
return tmp == tmp2; // B
} We most certainly do want to optimize this to Well, this thread has certainly led to some very interesting examples, and some concrete questions that, once answered, will provide much clearer boundaries to the model. :) |
Calling unsafe {
let mut tmp = 3;
let x = &mut tmp;
let y = x as *mut _;
let z = y;
*x = 2;
{
*y = 3;
*z = 4;
}
*x == 2; // is the compiler allowed to optimize this check away here?
} It might also be that things are valid in this case because it happens "within a function", but not in the
This is new to me (and makes sense). I think that answering whether |
Hold on why should clobber ever have UB? That seems like a huge footgun with precisely zero advantages. If an access to some memory location would trigger UB, clobber should not access it. |
@rkruppe IIUC to avoid undefined behavior we would need to state that |
Actually, I don't think the docs of clobber need to say anything special about which memory it accesses. The optimizer already assumes programs don't have UB, including by accessing memory in ways they're not allowed to. That's pretty much what UB means to optimizers, after all. We just need to not specify or imply that clobber triggers UB. |
Doesn't that make it practically useless? For the original case of writing a benchmark for fn foo() -> Duration {
let mut vec = Vec::with_capacity(N);
black_box(vec.as_ptr()); // writes *const Vec<i32> to memory
let start = now();
for i in 0..N {
vec.push(0);
clobber(); // A
}
(now() - start) / N
} In The following might work, but at this point, at least for benchmarking purposes, we wouldn't need clobber anymore: fn foo() -> Duration {
let mut vec = Vec::with_capacity(N);
black_box(vec.as_ptr()); // writes *const Vec<i32> to memory
let start = now();
for i in 0..N {
vec.push(0);
vec = black_box(vec);
// black_box(&mut vec[vec.len() - 1]);
// or similar
}
(now() - start) / N
} |
I don't see how a clobber that can cause UB changes that, since the optimizer already assumes UB is not triggered (i.e., those memory locations are not accessed) You're probably right that this makes it useless, but that's because the strong optimization blocking effect is inherently incompatible with the strong optimization enabling effect of Rust's aliasing rules :P |
I probably also picked confusing terminology... I've been in the mental mode of So, when I asked "would you think clobber is allowed", I implicitly assumed that
Absolutely. And as long as you only use raw pointers, you are fine. The open question is what happens when you mix-and-match accesses through references (that are supposed to come with strong aliasing guarantees) and accesses through raw pointers. Concerning @gnzlbg's
Yes, giving away a mutable reference to the vector on each round through the loop is the way to go. Is there any fundamental problem with having only
Yes, that's really the key of the issue here. |
There aren't any fundamental problems AFAICT.
However, all the limits discussed about
FWIW, LLVM currently optimizes that example to Here LLVM should always be able to completely remove the loop. Whether the memory allocation should be elidable or not is a completely different topic. |
Absolutely.
Right, so that much is uncontroversial -- if So, a minimal safe API could be just having If you want, you could include additional language saying that this function may also observe "leaked" shared data (
😂 (I assume that's without using clobber/black_box) Okay, I was talking in a context that actually uses the vector later. The loop should not have to do a size check on each round as the vector was allocated with enough capacity. That's one of the advanced optimizations benchmarks @gankro would like us to get right. ;) |
Closing - |
To be clear, the semantics of the stable function are a NOP in the memory model. Specifying a 'guaranteed' black_box remains a fun topic of conversation but not something we have to track. |
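As a small illustration of that point, assuming the stable function meant here is `std::hint::black_box`: it is an identity function as far as program results are concerned, and only asks the optimizer to be pessimistic. `sum_to` is a hypothetical helper for this sketch.

```rust
use std::hint::black_box;

/// Sums 0..=n. The bound is passed through `black_box`, which returns it
/// unchanged; only the optimizer's knowledge of the value is affected.
fn sum_to(n: u64) -> u64 {
    (0..=black_box(n)).sum()
}
```

The result is the same with or without the `black_box` call; the only intended difference is in how much the compiler may precompute.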
I've just closed rust-lang/rfcs#2360 due to @rkruppe's input that, by trying to specify what black_box and clobber do there, the RFC is basically specifying a subset of the memory model.
So I'd like to ask for feedback on these here first.
The specification in the RFC is very loose. mem::black_box(x) is specified as follows:
x to memory
while mem::clobber():
where I have no idea how to specify memory but @rkruppe came up with a minimal example that helps:
Here, the compiler is allowed to optimize the store of x to tmp away, because the only code that has the address of tmp is in that scope, and that code does not read or write from tmp. In the specification of clobber, when it states "read/write from/to all memory", "memory" does not refer to tmp. However, if I change the example to:
in this case clobber() requires the store of x to tmp to be live, because black_box has written the address of tmp to "memory", and thus the "read/write to/from memory" in clobber can do something with tmp.
So a big question I have is what is "memory" here, and how does tmp in the first example differ from the rest?
I don't know how these could make sense in the Rust memory model, what the wording for their specification should be, and I am barely qualified to follow the discussion at all. The RFC contains more information, and lots of godbolt links, but the proposed implementation of black_box and clobber is this:
Also, @rkruppe asked:
To which @nagisa answered:
cc @rkruppe @nagisa
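The two cases described in this post (the address of tmp never escapes, versus the address being handed to black_box first) can be written out as a sketch. This is not the RFC's implementation: `clobber` here is a hypothetical helper assumed to be an empty inline-assembly block without the `nomem` option, `black_box` is `std::hint::black_box`, and the code only compiles on targets where `asm!` is stable. Nothing in it guarantees which stores the compiler keeps.

```rust
use std::arch::asm;
use std::hint::black_box;

/// Hypothetical `clobber`: an empty asm block. Without `nomem`, the compiler
/// has to assume the block may read or write memory it could legally reach.
#[inline(always)]
fn clobber() {
    // SAFETY: the template is empty, so no instructions are executed.
    unsafe { asm!("", options(nostack, preserves_flags)) };
}

/// The address of `tmp` never escapes, so the store may be optimized away.
fn not_escaped(x: i32) -> i32 {
    let tmp = x;
    clobber();
    tmp
}

/// The address of `tmp` is given to `black_box` before `clobber` runs.
fn escaped(x: i32) -> i32 {
    let tmp = x;
    black_box(&tmp);
    clobber();
    tmp
}
```

Both functions return their argument; any difference between them would show up only in the generated code, not in the observable result.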