Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustc performs auto-ref when a raw pointer would be enough #73987

Open
RalfJung opened this issue Jul 3, 2020 · 42 comments
Open

rustc performs auto-ref when a raw pointer would be enough #73987

RalfJung opened this issue Jul 3, 2020 · 42 comments
Labels
A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html C-bug Category: This is a bug. F-arbitrary_self_types `#![feature(arbitrary_self_types)]` T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@RalfJung
Copy link
Member

RalfJung commented Jul 3, 2020

The following code:

#![feature(slice_ptr_len)]

pub struct Test {
    data: [u8],
}

pub fn test_len(t: *const Test) -> usize {
    unsafe { (*t).data.len() }
}

generates MIR like

        _2 = &((*_1).0: [u8]);
        _0 = const core::slice::<impl [u8]>::len(move _2) -> bb1;

This means that a reference to data gets created, even though a raw pointer would be enough. That is a problem because creating a reference makes aliasing and validity assumptions that could be avoided. It would be better if rustc would not implicitly introduce such assumptions.

Cc @matthewjasper

@jonas-schievink jonas-schievink added A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 3, 2020
@SimonSapin
Copy link
Contributor

This is specifically about field projection of a raw pointer, right?

@RalfJung
Copy link
Member Author

RalfJung commented Jul 3, 2020

I think so, yes. It is key that t starts out as a raw pointer.

@SimonSapin
Copy link
Contributor

Oh I just realized something, and I think that the issue title and description are misleading. They make it sound like we’re calling <*const [u8]>::len(self), and in the process unnecessarily going through &[u8]. But the second line of MIR shows that the method called is actually <[u8]>::len(&self). On closer look that seems completely expected to me. The expression (*t).data by itself has type [u8], and method resolution ends up finding a result through auto-ref. But there is no equivalent to auto-ref for raw pointer. If we instead try to call a raw pointer method that doesn’t have a slice method of the same name, we get an error:

trait Foo {
    fn bar(self);
}

impl Foo for *const [u8] {
    fn bar(self) {}
}

pub struct Test {
    data: [u8],
}

pub fn test_len(t: *const Test) -> usize {
    unsafe { (*t).data.bar() }
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0599]: no method named `bar` found for slice `[u8]` in the current scope
  --> src/lib.rs:14:24
   |
14 |     unsafe { (*t).data.bar() }
   |                        ^^^ method not found in `[u8]`
   |
   = help: items from traits can only be used if the trait is implemented and in scope
note: `Foo` defines an item `bar`, perhaps you need to implement it
  --> src/lib.rs:1:1
   |
1  | trait Foo {
   | ^^^^^^^^^

Another example without involving a struct field:

fn ptr_after<T>(x: &T) -> *const T {
    (x as *const T).offset(1)  // Ok
}

fn ptr_after2<T>(x: &T) -> *const T {
    x.offset(1)
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0599]: no method named `offset` found for reference `&T` in the current scope
 --> src/lib.rs:6:7
  |
6 |     x.offset(1)
  |       ^^^^^^ method not found in `&T`

So I’d be inclined to call this not a bug.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 3, 2020

Oh I just realized something, and I think that the issue title and description are misleading. They make it sound like we’re calling <*const [u8]>::len(self), and in the process unnecessarily going through &[u8]. But the second line of MIR shows that the method called is actually <[u8]>::len(&self).

Yes, that is the problem. We should be calling the raw ptr method, but instead the compiler chooses the call the other method. That is what this bug is about. I am happy for suggestions for how to word this better. :)

Elsewhere you wrote:

The example code in #73987 never involved a *const [u8] value at all. I’ve commented some more there.

The example code also doesn't involve a &[u8]. It just involves a [u8]. The issue is that the compiler chooses to introduce an &[u8] instead of introducing *const [u8]. Either choice works synatically, but one makes way more assumptions, so we should be auto-introducing the one with fewer assumptions.

I am aware that the reason for this is auto-ref, and not having auto-raw-ptr. But that is IMO a big problem as it means it is actually very hard to call raw-self methods on paths -- and it is very easy to accidentally call the reference method instead.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 3, 2020

If we instead try to call a raw pointer method that doesn’t have a slice method of the same name, we get an error:

Indeed. IMO we should not stabilize any raw ptr method where a reference method with the same name exists, until this bug is fixed. It's just too much of a footgun, with people accidentally calling the reference method instead of the raw ptr method.

@SimonSapin
Copy link
Contributor

I’m not sure I agree that this is a bug in the first place. The language never had coercion from T to *const T in any context

@RalfJung
Copy link
Member Author

RalfJung commented Jul 3, 2020

Would you agree that it is a footgun, though?

I agree it is behavior as intended. I just don't think the intentions are fit for modern Rust any more -- after all, when this behavior was designed, there were no raw-self methods.

@neocturne
Copy link
Contributor

In this particular example, the behaviour does not feel like a footgun to me: In my simplified mental model of the language, (*t) is already only valid when the usual aliasing and validity assumptions hold, even if these assumptions only actually need to hold when I do something with the result.

I would go as far as saying that having &raw const (*t).data as the supported way to get from the raw struct pointer to the raw field pointer is quite ugly because the code looks as if t is dereferenced - is there some nicer way to do this? Optimally, the test_len function in the original example shouldn't even need unsafe at all. (But I'm likely missing years of discussions on these topics, given that I've only recently taken an interest in Rust)

@RalfJung
Copy link
Member Author

RalfJung commented Jul 4, 2020

In my simplified mental model of the language, (*t) is already only valid when the usual aliasing and validity assumptions hold, even if these assumptions only actually need to hold when I do something with the result.

That is not the case though. *t (where t is a raw pointer) just requires t to be aligned and dereferencable; creating a reference (& or &mut) makes a huge difference on top of that by also making aliasing assumptions.

(What you said is basically right when t is a reference, though.)

I would go as far as saying that having &raw const (*t).data as the supported way to get from the raw struct pointer to the raw field pointer is quite ugly because the code looks as if t is dereferenced - is there some nicer way to do this?

Well, t does get dereferenced. No memory access happens, but all rules in the langauge that talk about pointers being dereferenced apply to *t, even when used in the context of &*t.

This is the same in C: &ptr->field is UB if ptr is dangling or unaligned.

@ssomers
Copy link
Contributor

ssomers commented Jul 15, 2020

Wearing an old hat, *t is not just dereferencing (for some definition) to me but how you get from a raw pointer back into the safe world. So I would expect (*t).data.len() to make all the assumptions it does. And to find in a back alley some notation like t + .data or &t->data to do pointer arithmetic, reading in the doc that pointer arithmetic is subject to the same pointer validation as dereferencing.

Wearing a newer hat, since unsafe {&raw const *t} and &raw const (*t).data exist, and don't dereference (as much as *t), it's much less clear to me what (*t).data.len()should do. Isn't quietly doing raw pointer access also a risk, leaving you unprotected by aliasing rules that you thought were being applied?

@RalfJung
Copy link
Member Author

RalfJung commented Jul 15, 2020

Isn't quietly doing raw pointer access also a risk, leaving you unprotected by aliasing rules that you thought were being applied?

Not sure what the aliasing rules are "protecting" here? Stacked Borrows is adding strictly more Undefined Behavior, i.e. you could describe it as purely existing to break your code, of course with the goal of having the compiler make other code (the one that is not broken) go faster.

Creating a raw pointer makes strictly fewer assumptions. IOW, when (& <path>).len() is a safe (i.e., non-UB) thing to do, then (&raw const <path>).len() is definitely also safe. (This assumes that the two len methods are actually doing basically the same thing, we in libstd we should definitely ensure anyway.) So I cannot see how adding an "auto-raw-addr-of"-like mechanism could ever introduce problems.

@bjorn3
Copy link
Member

bjorn3 commented Jul 15, 2020

So I cannot see how adding an "auto-raw-addr-of"-like mechanism could ever introduce problems.

What about unsafe fn foo(self: *mut Self), which requires self to point to a valid allocation. If you call foo() in an unsafe block with other unsafe code, it could be easy to miss.

@RalfJung
Copy link
Member Author

Ah, I was only thinking of cases where we current do auto-ref. As in, this is code we currently accept, but we are possibly doing the wrong thing (by creating a reference instead of a raw pointer).

I am less sure about code that currently would fail to compile; I am not proposing to accept such code.

@ssomers
Copy link
Contributor

ssomers commented Jul 15, 2020

Not sure what the aliasing rules are "protecting" here?

Not sure either. I guess I thought that the borrow checker would somehow complain about obvious aliasing but it doesn't. Also, I thought &* on raw pointers had some purpose but I have no recollection what it was.

@RalfJung
Copy link
Member Author

I thought &* on raw pointers had some purpose but I have no recollection what it was.

It's effectively a cast from a raw ptr to a (shared) reference.

@ssomers
Copy link
Contributor

ssomers commented Jul 16, 2020

Yes, but I thought there was another effect. (2 hours pass) Actually, no, that may be a lie! &* does not cast to a reference, if the context asks for a pointer, as in this playground example. It seems like such a &*could as well be removed, but I guess the pointer really is dereferenced, checked for alignment and what not. But that's not the effect I thought I read about, it's probably what bjorn3 already mentioned, and not applicable to this issue.

@RalfJung
Copy link
Member Author

Actually, no, that may be a lie! &* does not cast to a reference, if the context asks for a pointer, as in this playground example.

It does cast to a reference. But if the context asks for a pointer, it can be implicitly coerced back to a pointer again.

Given that coercion, I actually wonder if there is any good reason for <&[T]>::len to exist at all, except for backwards compatibility...

@neocturne
Copy link
Contributor

As someone asked on Zulip, I'd like to push this a bit. In particular, I'd like to have the answers to the following questions:

  • Which part of (*t).data.len() is it that invokes UB for invalid t? (I think @RalfJung already answered this above, but the answer is not entirely clear to me right now while reading the discussion)
  • Which part of (*t).data.len() should invoke UB, if we had the chance to define this now? In my opinion the *t alone should already be UB for invalid t, even in an expression like &raw const (*t).data
  • Does the existence of a raw-self method like slice ptr len() make it more likely to hit this UB without noticing? I don't really see that this is the case, so I don't consider this a blocker for Tracking Issue for raw slice getters (slice_ptr_get) #74265 etc.

@RalfJung
Copy link
Member Author

@NeoRaider this depends on what you mean by "valid". ;)

The *t alone causes UB if t dangles, i.e. if it does not point to dereferencable memory. The &((*t).data) (implicit due to the len) causes UB if reading from this pointer violates aliasing assumptions, i.e. if the pointer lost the permission to "read" from this memory; it also causes UB if the metadata is invalid (in particular if computing the size would overflow an isize) or the pointee is not initialized.

In my opinion the *t alone should already be UB for invalid t, even in an expression like &raw const (*t).data

Long-term, I think &raw const (*t).data should be UB only if this pointer arithmetic overflows. This should be allowed even when t dangles or is NULL. But sadly we currently have no good way to express this towards LLVM, which is why currently, this causes UB whenever t dangles.

@RalfJung
Copy link
Member Author

Does the existence of a raw-self method like slice ptr len() make it more likely to hit this UB without noticing? I don't really see that this is the case, so I don't consider this a blocker for #74265 etc.

That is my concern, yes: the existence of these raw ptr methods might make people think that (*t).data.len() is now allowed because len can be called with a raw pointer. If no raw ptr len method existed, there'd be no possibility for such a confusion.

@neocturne
Copy link
Contributor

Long-term, I think &raw const (*t).data should be UB only if this pointer arithmetic overflows. This should be allowed even when t dangles or is NULL. But sadly we currently have no good way to express this towards LLVM, which is why currently, this causes UB whenever t dangles.

This runs counter to my intuition of how operations compose. If *t is already UB for dangling t, it should not be possible to use this as a subexpression of something like &raw const (*t).data and have that not be UB.

@Pointerbender
Copy link
Contributor

Hi NeoRaider and RalfJung, I was directed here from Zulip and was following along this conversation. If I may, I'd like to contribute some thoughts. Rust 1.51.0 stabilized the addr_of! macro and I wonder if this macro takes away any confusion that may arise. If the programmer can expect that t.data (from the example in the first post) is unaligned, then they would most likely reach for the addr_of! macro. Here is an example of how this looks: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=f9b0a9837823e0b89de506ea9951542d

This runs counter to my intuition of how operations compose. If *t is already UB for dangling t, it should not be possible to use this as a subexpression of something like &raw const (*t).data and have that not be UB.

The compiler currently agrees with your intuition :-) In the above example an unsafe block is needed in order to be able to call core::ptr::addr_of!((*t).data).len() to ensure that t is properly aligned.

I hope this is useful feedback from a Rust end-user. To me the availability of the addr_of macro makes it unlikely that I would try to call unsafe { (*t).data.len() } if I knew upfront that (*t).data may be unaligned. If I can help out in any way, please let me know!

@RalfJung
Copy link
Member Author

This runs counter to my intuition of how operations compose. If *t is already UB for dangling t, it should not be possible to use this as a subexpression of something like &raw const (*t).data and have that not be UB.

Fair, I need to refine this a little. Notice that the notion of "subexpression" is non-trivial here: to understand how this composes, you need to take into account the fact that Rust distinguishes places and values (called "lvalues" and "rvalues" in C). In particular, there are implicit place2value coercions introduced whenever a place expression is used in a position where a value expression was expected. I think since Rust has two kinds of pointers (references and raw pointers), it makes sense to also consider two kinds of places. *t is a place expression, but which kind of place it creates is determined by whether it is a "reference place expression" or a "raw place expression"; a surrounding & leads to the former, a surrounding &raw to the latter. As a reference place expression, *t must not dangle; as a raw place expression, it may.

That would be a way to realize my proposal while remaining compositional (in a sense). Another alternative might be to say that *t of a raw pointer is actually not immediate UB, and the UB only arises when the reference is being created via & (so that would be slightly different than what I described above). Either way I think it would be really valuable to not make &raw const (*t).data UB when t dangles; for example, that would make implementing offset_of! much simpler.

I should also clarify that none of this is official in any way. Officially, *t is UB when t dangles, period -- not as the result of some decision process, but because this is the most conservative (least permissive) choice. That gives us room to perhaps relax the rules for UB in the future. What I stated above is just what I think we should be heading towards wrt the questions you asked.

@RalfJung
Copy link
Member Author

RalfJung commented Mar 26, 2021

If the programmer can expect that t.data (from the example in the first post) is unaligned, then they would most likely reach for the addr_of! macro.

Note that the macro is relevant not just for alignment, but also for cases where the pointer points to uninitialized data.

Your snippet nicely demonstrates my concern: if the programmer, instead of core::ptr::addr_of!((*t).data).len(), writes just ((*t).data).len(), suddenly everything goes horribly wrong! This is despite the fact that the programmer wrote no &, so there is no clear sign here that a reference is even being created (i.e., there is no clear sign that alignment and initialization are even relevant).

@RalfJung
Copy link
Member Author

RalfJung commented Mar 26, 2021

Actually, according to the current UB rules, it is UB to

Dereferencing (using the * operator on) a dangling or unaligned raw pointer.

So, core::ptr::addr_of!((*t).data) actually is UB when t is misaligned -- not because the resulting ptr is misaligned, but because t is!

We should probably change that... discussion on Zulip

@LegionMammal978
Copy link
Contributor

Is this still considered an issue and a blocker for #71146? I also share the sentiment that casting the field to a &[u8] for the receiver is the expected behavior here. It comes from this intuition about the requirements:

Expression Allocated? Initialized & valid? Unaliased?
ptr
(*ptr).field = value (if Copy)
addr_of[_mut]!((*ptr).field)
addr_of[_mut]!((*ptr).field1.field2) (directly)
value = (*ptr).field
(*ptr).field = value (if not Copy) ✓* ✓*
&[mut] (*ptr).field
(*ptr).field.method()
addr_of[_mut]!((*ptr).field1.field2) (through Deref[Mut])

* unless field has no subobjects with nontrivial drop code

By this intuition, (*t).data.len() falls under the (*ptr).field.method() pattern, ultimately interpreted as <[u8]>::len(&(*ptr).field). If I wanted it to call <*const [u8]>::len(), I'd use the explicit form addr_of!((*t).data).len() to keep it as a pointer.

Uninitialized Memory: Unsafe Rust is Too Hard and especially Rust's Unsafe Pointer Types Need An Overhaul are relevant here. See also UnsafeCell::raw_get(), which works around for the similar issue that UnsafeCell::get()'s receiver must be initialized.

@RalfJung
Copy link
Member Author

Uninitialized Memory: Unsafe Rust is Too Hard and especially Rust's Unsafe Pointer Types Need An Overhaul are relevant here.

My feeling is that having to write addr_of!((*t).data).len() rather than (*t).data.len() is an example of how "unsafe Rust is too hard".

However, I don't have a very strong opinion here; ultimately this is up to @rust-lang/libs-api. I would also think that having (*t).data.len() be the reference method now and making it call the raw ptr method in the future is a backwards compatible change (as long as we keep the two methods identical in behavior, which we obviously should) -- but I might be missing something.

@Pointerbender
Copy link
Contributor

I would also think that having (*t).data.len() be the reference method now and making it call the raw ptr method in the future is a backwards compatible change (as long as we keep the two methods identical in behavior, which we obviously should) -- but I might be missing something.

I might know of an additional relevant and fun footgun that would need be solved before then 😄 What if (*t).data is of type &mut [T], would coercing a &mut [T] to a *const [T] then happen as:

@LegionMammal978
Copy link
Contributor

I would also think that having (*t).data.len() be the reference method now and making it call the raw ptr method in the future is a backwards compatible change (as long as we keep the two methods identical in behavior, which we obviously should) -- but I might be missing something.

I guess I wasn't very clear in my point. I think that ultimately, having (*t).data.len() call the *const/mut [T] method, absent other major changes, would make the situation even more confusing. Consider the scenario: In a codebase, we see an expression (*ptr).field.method() in an unsafe function being refactored, and we know that field is not Copy. Must we ensure that field is initialized, valid, and not improperly aliased by other references? (That is, is the reference &[mut] (*ptr).field created at any point?)

With the current status quo, the answer is always yes. The .method() call will always take a &[mut] self receiver, created from the (*ptr).field place. However, if we were to naively allow the expression (*t).data.len() to create no reference, then we must evaluate a decision tree:

  • Is (*ptr).field a slice, and are we calling .len() on it?
    • If so, then no, there is no reference created.
    • Otherwise, yes, there is a reference created.

It would make little sense for <*const/mut [T]>::len() to be the only methods allowing such a transformation. But if we were to generalize it to all pointer methods, then the decision would become even trickier:

  • Is .method() a method defined on *const/mut T in the current version of Rust?
    • If so, then no, there is no reference created.
    • Otherwise, yes, there is a reference created.

In this scenario, we'd have the same problem as we currently have with Deref[Mut]. We could not add any new methods to *const/mut T, since they could possibly conflict with methods on T. Also, in the related scenario where field is a pointer itself, it introduces ambiguity: are we calling the method on the value of (*ptr).field, or on the pointer to (*ptr).field?


To prevent these issues, I believe that place expressions should never be implicitly converted to pointers, since currently, they always become references or are copied by value. To project a pointer into a field, we should require the use of something explicit such as addr_of[_mut]!((*ptr).field). Of course, as you mention, the syntax is currently verbose and its importance is poorly communicated.

This is why I mentioned the latter article, which has a proposal for lightweight "path offset syntax" using ~. With it, the answer would be pretty clear: use (*t).data.len() to access data by reference, and use t~data.len() to access data by pointer. I particularly like how it introduces the intuition that the deref syntax *t asserts that data is valid. I don't think the exact syntax proposed is particularly pretty, but I could probably live with it.

@RalfJung
Copy link
Member Author

I just had my opinion that this is a pretty bad footgun confirmed via #99437.

@RalfJung
Copy link
Member Author

IMO the lesson to be learned here is that if a place originates from a raw pointer, we should never implicitly turn it into a reference. For the cases where we already do, we should warn about this situation.

To prevent these issues, I believe that place expressions should never be implicitly converted to pointers

IOW, I vehemently disagree with this. A place expression that comes from a raw pointer should only ever be converted to a raw pointer, never to a reference. Converting it to a reference implicitly is much too likely to introduce aliasing requirements that were probably not intended. This must therefore only happen explicitly.

@LegionMammal978
Copy link
Contributor

LegionMammal978 commented Jul 18, 2022

A place expression that comes from a raw pointer should only ever be converted to a raw pointer, never to a reference. Converting it to a reference implicitly is much too likely to introduce aliasing requirements that were probably not intended. This must therefore only happen explicitly.

While it would be nice if place-to-reference conversions were always explicit, such a thing cannot be implemented without horribly breaking backward compatibility. To list the implicit place-to-ref conversions we currently allow in stable code:

  • place.method(), either where method() takes &[mut] self, or via Deref[Mut]
  • place.field, via Deref[Mut]
  • place = value, where the type of place has a nontrivial destructor
  • place op= value
  • place == value, ditto with !=, <, <=, >, and >=
  • place[index]
  • place(), where the type of place has no FnOnce impl
  • format_args!("{}", place)

All of these can take *ptr as a place, and they will all produce intermediate references. In fact, there's only a few operations that don't, assuming we can't move values out of the place:

  • place in a value expression context, where the type of place is Copy
  • place.method(), not via Deref[Mut], where method() takes self and the type of place is Copy
  • place.field, not via Deref[Mut]
  • place = value, where the type of place has a trivial destructor
  • addr_of[_mut]!(place)

With this imbalance in mind, my usual approach would (IMHO) seem quite practical: assume that a *ptr expression wants to produce a reference by default, unless it unambiguously fits into one of the patterns in the second list. My fear is that by allowing and encouraging patterns like (*slice_ptr).len() or (*slice_ptr)[index], we'd be muddying the waters of what can be done reference-free with place expressions. That's why I'd prefer putting helper methods on the pointers themselves, e.g., slice_ptr.len() or slice_ptr.get[_mut]_ptr(index). That way, *ptr place expressions can stay restricted to value contexts and trivial field projections.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 19, 2022

assume that a *ptr expression wants to produce a reference by default

The only "small" problem with this approach is that it is a false assumption. I don't think programmers will typically assume this. And I can't blame them, even with all I know, I wouldn't have assumed this. I started at the example that triggered #99437 for quite a while until I realized what was happening.

So, while there are indeed many syntactic positions where place-to-ref conversions happen, I am not as sure as you seem to be that this happens regularly in unsafe code that actually intends to do this. That's why I think we should have a lint, that would help us detect these situations.

Of the things you list, I also don't think format_args! is an auto-ref. It just literally adds a & as part of macro expansion. So that wouldn't warn, and that seems fine. I am not entirely sure about == and the other operators; for those I am also less surprised by the reference. Also note that (*ptr)[idx] does not auto-ref, it uses a primitive MIR indexing operation and does not go through Index. We could potentially do the same for subslicing. place = value explicitly writes to the pointer so no surprise there either.

But I think place.field, place[idx] should definitely warn when implicitly constructing a reference to a place constructed from a raw pointer (those operations should usually just be a ptr offset, so whenever they do any more than that in unsafe code that's a big footgun), and place.method() probably should warn as well.

@RalfJung
Copy link
Member Author

Put differently: I don't think defining the problem away by making it the programmer's problem is a constructive solution to this issue. Sure, it means us compiler devs can wash their hands, but it means more bugs in Rust code, and that's a bad outcome. We should strive to prevent those bugs instead. Our goal here isn't to be "technically correct" when things like #99437 happen, our goal is to make it easier to write reliable software and to prevent such issues from occurring.

@LegionMammal978
Copy link
Contributor

Also note that (*ptr)[idx] does not auto-ref, it uses a primitive MIR indexing operation and does not go through Index. We could potentially do the same for subslicing.

Hmm, I didn't know that, and I don't think it's at all obvious to an intermediate Rust user. Looking at the Reference, it carefully words its way around slices and arrays using Index[Mut], but it doesn't elaborate on the distinction at all (source):

Array and slice-typed values can be indexed by writing a square-bracket-enclosed expression of type usize (the index) after them. When the array is mutable, the resulting memory location can be assigned to.

For other types an index expression a[b] is equivalent to *std::ops::Index::index(&a, b), or *std::ops::IndexMut::index_mut(&mut a, b) in a mutable place expression context. Just as with methods, Rust will also insert dereference operations on a repeatedly to find an implementation.

The docs for Index[Mut] say nothing about the distinction (source):

Used for indexing operations (container[index]) in immutable contexts.

container[index] is actually syntactic sugar for *container.index(index), but only when used as an immutable value. If a mutable value is requested, IndexMut is used instead. This allows nice things such as let value = v[index] if the type of value implements Copy.

The Rustonomicon also suggests that slice index operations go through Index[Mut] following autoderef (source):

Here is an example of the method lookup algorithm:

let array: Rc<Box<[T; 3]>> = ...;
let first_entry = array[0];

How does the compiler actually compute array[0] when the array is behind so many indirections? First, array[0] is really just syntax sugar for the Index trait - the compiler will convert array[0] into array.index(0). Now, the compiler checks to see if array implements Index, so that it can call the function.

Then, the compiler checks if Rc<Box<[T; 3]>> implements Index, but it does not, and neither do &Rc<Box<[T; 3]>> or &mut Rc<Box<[T; 3]>>. Since none of these worked, the compiler dereferences the Rc<Box<[T; 3]>> into Box<[T; 3]> and tries again. Box<[T; 3]>, &Box<[T; 3]>, and &mut Box<[T; 3]> do not implement Index, so it dereferences again. [T; 3] and its autorefs also do not implement Index. It can't dereference [T; 3], so the compiler unsizes it, giving [T]. Finally, [T] implements Index, so it can now call the actual index function.

But as you describe, the example does not produce Index::index() but instead a primitive indexing operation in the MIR, even with -Z mir-opt-level=0.


Overall, we don't give any solid indication that indexing operations have semantics other than those of Index::index() and IndexMut::index_mut(). And since all of the slice and array types implement Index[Mut] for the benefit of generic code (1234), it's a natural conclusion that they use (or act just like they use) the exact same methods as all other containers, and that they produce transient &[mut] self references that have aliasing restrictions. There's simply no evidence against it short of trawling through the MIR-related compiler code or the scattered discussions on GitHub.

(As for why I included format_args!(), it can be easy to forget that it creates references even to Copy types; I could easily see a user writing println!("i32 field: {}", (*ptr).i32_field) and thinking that (*ptr).i32_field is a value expression, like it would be for regular function calls.)

@RalfJung
Copy link
Member Author

RalfJung commented Jul 19, 2022

(As for why I included format_args!(), it can be easy to forget that it creates references even to Copy types; I could easily see a user writing println!("i32 field: {}", (*ptr).i32_field) and thinking that (*ptr).i32_field is a value expression, like it would be for regular function calls.)

If it were a value expression, then the ptr would be actually loaded from, so that would only have more UB. Therefore this is not a footgun.

The problem is writing code where you don't want a value expression, like addr_of_mut!((*(ptr))[..layout_size]), and then accidentally creating a reference with aliasing guarantees nonetheless.

@LegionMammal978
Copy link
Contributor

If it were a value expression, then the ptr would be actually loaded from, so that would only have more UB. Therefore this is not a footgun.

Hmm, you're right about that, now that I think about it. At the very top of the borrow stack, it's only an issue when you hold on to the &mut longer than you need it. The transient &muts are mainly an issue in the middle of the borrow stack, where you end up with long-lived *mut <- &mut <- *mut patterns.

The problem is writing code where you don't want a value expression, like addr_of_mut!((*(ptr))[..layout_size]), and then accidentally creating a reference with aliasing guarantees nonetheless.

So we agree that there's lots of ways to get implicit refs from place expressions, and this can cause issues with the aliasing and non-null restrictions. You seem to argue that we should lint on these, or change the semantics so they operate via pointer. But I think that we should keep place semantics as they are, and steer users away from writing (*ptr) places at all, unless they specifically want to access the value or reborrow as a reference.

Right now, the predominant case where (*ptr) places are necessary is in field projections such as addr_of_mut!((*ptr).field); this case could be eliminated with something like Gankra's path offset syntax ptr~field. The second-biggest case is probably your subslice case, which could be eliminated with the further extension ptr~[..layout_size], but that syntax is probably stretching it a bit. More conservatively, we could give clear examples in the docs of implementing the same behavior with #71146 + #74265.

If we could eliminate those two cases, then users could simply avoid (*ptr) places unless they want to access the value. Even though the mental model of "(*ptr) as inherently risky" would be somewhat inaccurate, it would be sufficient to prevent unexpected behavior through any of the implicit-ref operations.

@RalfJung
Copy link
Member Author

steer users away from writing (*ptr) places at all

I mean, that'd be great, but it is a big change to the language -- much bigger than what I have the time for. So I dearly hope someone will pursue this. :) But meanwhile, I think there are smaller steps we can take that will help improve the situation, and those are the steps I am proposing.

And even once we reach that goal, many people will still write *ptr places, since that's how you do it in C. So we still need a plan for how to detect and lint against incorrect use of that pattern.

bstrie added a commit to bstrie/enarx that referenced this issue Jul 19, 2022
…ith slice

This potential source of UB was discovered while upgrading the Rust toolchain,
which upgrades us to a new version of Miri with stricter rules around raw pointers.
Specifically, an expression like `addr_of_mut!((*(ptr))[offset..])` is deliberately
attempting to operate only on raw pointers while avoiding any intermediate references,
since references have invariants that raw pointers do not.
However, there is in fact an implicit reference here that is created as a result of the
indexing operation. This is both surprising and not surprising, for interesting reasons.

First, it should not be surprising because indexing is governed by the Index traits,
whose methods function return references, so their presence here is natural.

On the other hand, it is surprising because Rust already special cases `(*ptr)[foo]` when `ptr`
is a raw slice and `foo` is not a range to avoid the Index traits entirely, which allows it to
avoid emitting an intermediate reference.

The ideal solution here is for Rust to be smart enough to not introduce the intermediate
reference here at all, which is tracked at rust-lang/rust#73987 .

In addition, while investigating this issue I brought it up to the Unsafe Code Guidelines team,
who saw fit to file rust-lang/rust#99437 as a more specific example
of the potential perils of the current behavior.
bstrie added a commit to bstrie/enarx that referenced this issue Jul 19, 2022
…ith range

This potential source of UB was discovered while upgrading the Rust toolchain,
which upgrades us to a new version of Miri with stricter rules around raw pointers.
Specifically, an expression like `addr_of_mut!((*(ptr))[offset..])` is deliberately
attempting to operate only on raw pointers while avoiding any intermediate references,
since references have invariants that raw pointers do not.
However, there is in fact an implicit reference here that is created as a result of the
indexing operation. This is both surprising and not surprising, for interesting reasons.

First, it should not be surprising because indexing is governed by the Index traits,
whose methods function return references, so their presence here is natural.

On the other hand, it is surprising because Rust already special cases `(*ptr)[foo]` when `ptr`
is a raw slice and `foo` is not a range to avoid the Index traits entirely, which allows it to
avoid emitting an intermediate reference.

The ideal solution here is for Rust to be smart enough to not introduce the intermediate
reference here at all, which is tracked at rust-lang/rust#73987 .

In addition, while investigating this issue I brought it up to the Unsafe Code Guidelines team,
who saw fit to file rust-lang/rust#99437 as a more specific example
of the potential perils of the current behavior.

Signed-off-by: bstrie <865233+bstrie@users.noreply.github.com>
bstrie added a commit to bstrie/enarx that referenced this issue Jul 19, 2022
… with range

This potential source of UB was discovered while upgrading the Rust toolchain,
which upgrades us to a new version of Miri with stricter rules around raw pointers.
Specifically, an expression like `addr_of_mut!((*(ptr))[offset..])` is deliberately
attempting to operate only on raw pointers while avoiding any intermediate references,
since references have invariants that raw pointers do not.
However, there is in fact an implicit reference here that is created as a result of the
indexing operation. This is both surprising and not surprising, for interesting reasons.

First, it should not be surprising because indexing is governed by the Index traits,
whose methods function return references, so their presence here is natural.

On the other hand, it is surprising because Rust already special cases `(*ptr)[foo]` when `ptr`
is a raw slice and `foo` is not a range to avoid the Index traits entirely, which allows it to
avoid emitting an intermediate reference.

The ideal solution here is for Rust to be smart enough to not introduce the intermediate
reference here at all, which is tracked at rust-lang/rust#73987 .

In addition, while investigating this issue I brought it up to the Unsafe Code Guidelines team,
who saw fit to file rust-lang/rust#99437 as a more specific example
of the potential perils of the current behavior.

Signed-off-by: bstrie <865233+bstrie@users.noreply.github.com>
enarxbot pushed a commit to enarx/enarx that referenced this issue Jul 19, 2022
… with range

This potential source of UB was discovered while upgrading the Rust toolchain,
which upgrades us to a new version of Miri with stricter rules around raw pointers.
Specifically, an expression like `addr_of_mut!((*(ptr))[offset..])` is deliberately
attempting to operate only on raw pointers while avoiding any intermediate references,
since references have invariants that raw pointers do not.
However, there is in fact an implicit reference here that is created as a result of the
indexing operation. This is both surprising and not surprising, for interesting reasons.

First, it should not be surprising because indexing is governed by the Index traits,
whose methods function return references, so their presence here is natural.

On the other hand, it is surprising because Rust already special cases `(*ptr)[foo]` when `ptr`
is a raw slice and `foo` is not a range to avoid the Index traits entirely, which allows it to
avoid emitting an intermediate reference.

The ideal solution here is for Rust to be smart enough to not introduce the intermediate
reference here at all, which is tracked at rust-lang/rust#73987 .

In addition, while investigating this issue I brought it up to the Unsafe Code Guidelines team,
who saw fit to file rust-lang/rust#99437 as a more specific example
of the potential perils of the current behavior.

Signed-off-by: bstrie <865233+bstrie@users.noreply.github.com>
@kornelski
Copy link
Contributor

I agree it's a trap, but at the same time if * magically temporarily preserved "raw-pointerness" of the value, then it would be inconsistent with:

let tmp = *ptr;
tmp.data.len();

I'm already spooked by &* cancelling each other in a special way, but at least in safe rust that's inconsequential.

So having a dedicated operator for a raw deref (->) or pointer offset (~) would be better: https://faultlore.com/blah/fix-rust-pointers/

@RalfJung
Copy link
Member Author

RalfJung commented Sep 9, 2022

it would be inconsistent with:

It's not inconsistent since these are completely different programs! Let's think more carefully about places and values to make sense of all this. We have to make place-to-value coercions explicit; I will use __load(place) to write the value expression that denotes the value stored in the place. In particular when x is a local variable, then x is a place expression denoting the address of the local variable, and __load(x) is a value expression denoting the value stored in that local variable. *value is a place expression and &place is a value expression. place.field is also a place expression. Aside from __load, none of these performs a memory access.

Your code then becomes

let tmp = __load(*__load(ptr));
len(&tmp.data);

whereas test_len from the OP becomes

pub fn test_len(t: *const Test) -> usize = unsafe {
    len(&(*__load(t)).data) // the & being inserted here is exactly the problem
}

IOW, by storing *ptr into a local, you are doing an extra __load, which makes a big difference for which kind of UB can happen where. It should not be surprising that when you do a __load from the raw pointer, the pointer must be valid. But in test_len we are never __loading from the pointer (we are just loading the pointer value itself, stored in the place t), so we should never assert any kind of validity.

I'm already spooked by &* cancelling each other in a special way, but at least in safe rust that's inconsequential.

That is also entirely explained by a proper treatment of places and values. There's nothing special going on. MiniRust defines both & and * in a compositional modular way without special cases to get the right behavior (but so far it's not really in a state where it can serve as a tutorial for people not already versed in this kind of operational semantics).

I do agree that we could do a lot better teaching this place/value stuff.

@kornelski
Copy link
Contributor

kornelski commented Sep 9, 2022

You're looking at this from a very low-level perspective — no doubt technically correct one, but I mean it from more layman perspective. It requires having a more low-level mental model of the language. For novice Rust users used to higher-level/GC languages it's already weird that x().foo() and let tmp = x(); tmp.foo() are semantically different, and this adds another such case.

So I don't think that having a special raw-pointer-temporary deref would solve the surprising behavior, it'd just move it around.

@RalfJung
Copy link
Member Author

RalfJung commented Sep 9, 2022

If you are using raw pointers, then I don't think you can avoid learning about places and values. (Believe me, it can get a lot more low-level than that. ;)

The comparison with high-level/GC languages makes little sense since those languages don't have the features we are discussing here. For better or worse, Rust (like C and C++ but unlike, e.g., Java) is a place-based language, and it makes little sense to try and hide that fact from people that want to use low-level Rust features such as raw pointers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html C-bug Category: This is a bug. F-arbitrary_self_types `#![feature(arbitrary_self_types)]` T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

9 participants