-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Elaborate on the invariants for references-to-slices #121965
Conversation
Given that @rustbot label +I-lang-nominated |
That sounds like we should go through t-opsem FCP, maybe joint with t-lang. @scottmcm could you spell out the motivation for why it should be a validity invariant? The authoritative location for validity invariants is the Reference ("behavior considered undefined"), so could you prepare an accompanying reference PR? |
/// `isize::MAX / size_of::<E>()`. (Raw pointers may have longer lengths, but | ||
/// references must not. For example, compare the documentation of | ||
/// [`ptr::slice_from_raw_parts`](ptr/fn.slice_from_raw_parts.html) and | ||
/// [`slice::from_raw_parts`](slice/fn.from_raw_parts.html).) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this meant to apply to all types with slice tails, or deliberately restricted to only slices and str?
In &(i32, [i32])
, is the max length reduced by one to make sure that the entire type always fits into isize
?
Does this affect the design of the unstable |
See also: #117474 |
Add `assume`s to slice length calls Since `.len()` on slices is safe, let's see how this impacts things vs what we could do with rust-lang#121965 and LLVM 19
I don't think so, as this is only an invariant for references, so |
Can you elaborate on the complications it would cause? I don't know how to judge what is or isn't hard in there. I'd assumed that it wouldn't be substantially harder to deal with in opsem than how references have different validity invariants on the address+provenance part from pointers already. (Talking about the pointee I think I understand how it's harder to enforce/check something, from the other conversations about that, but since the metadata isn't behind the pointer and thus it's right there to see when doing a typed copy, I didn't imagine it being substantially harder than checking things like the alignment of the pointer that's also reference-only, not applicable for pointers.) I think the only reason I have to make it strictly a validity invariant would be for niches. That would be a better version of what RawVec is currently doing by hand, for example, with rust/library/alloc/src/raw_vec.rs Lines 36 to 40 in 9b8d12c
Today the very-common Otherwise it's mostly convenience. Telling LLVM about the range restrictions is actually helpful, even just putting it on length checks, but also very expensive today -- both demonstrated in #122926 I'd love to just make it a validity invariant we can put on all
Hmm, that's a very good question. I guess everything with a slice tail would be the most consistent, and thus really does need to be phrased in terms of the implied size of the object. (With the element count rule being the simple consequent case for plain slices.) |
It means we can't view metadata as "just a type" and check the metadata field as if it were a field of some type. Instead we have to view it as an inherent part of the pointer that cannot be described separately. (Or there are two separate types, one for 'metadata of reference' and one for 'metadata of raw pointer'.) We also get more degrees of freedom as we have to separately define the invariant for references and raw pointers. It's not a big deal, so if there are clear benefits I don't think this should stop us. But we should clearly motivate breaking this symmetry. In other words, it would have been nice to say that the validity of the metadata field of a pointer/reference to |
Someone should diff the optimized IR for the compiler or some other large project before and after the PR that adds the assumes to see what optimizations are derived. I didn't notice this demonstrated in the PR, but I suspect with the assumes, LLVM optimizes code like |
Is this PR obsoleted by rust-lang/reference#1482? IIUC, the text in this PR is all logically deducible from the text in rust-lang/reference#1482. |
Please no. We should have documentation about these things in as many places as is reasonable. |
No objection to that. @RalfJung , you seem to have concerns with this PR that you don't have with rust-lang/reference#1482. I don't personally have a dog in this fight as long as rust-lang/reference#1482 lands, but I'm curious just for curiosity's sake where the discrepancy is. |
I don't see much of a relation between the two PRs. The one in the reference documents a property of syntactic type well-formedness, this here is about a (runtime) validity invariant.
I agree we should decide on the validity invariants of wide references and raw pointers. It may be better to consider all 4 at once though so that we have the larger picture in mind. (4 invariants: slice and dyn Trait tails; references and raw pointers.) I don't even know which of them we have consensus on in t-opsem.
|
@@ -1387,9 +1387,19 @@ mod prim_usize {} | |||
/// returning values from safe functions; such violations may result in undefined behavior. Where | |||
/// exceptions to this latter requirement exist, they will be called out explicitly in documentation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the section you are editing here is talking about the safety invariant, not the validity invariant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So... the PR diff seems fine to me, except that it doesn't match the PR description (I don't see a validity invariant being defined here), and I am not sure if spelling out this consequence of the previous definition in so many words is all that useful?
If this intends to talk about the validity invariant, then IMO there should be separate subsections for safety and validity invariant, so that is is clear that we are talking about two different invariants.
…, r=scottmcm reference type safety invariant docs: clarification The old text could have been read as saying that you can call a function if these requirements are upheld, which is definitely not true as they are an underapproximation of the actual safety invariant. I removed the part about functions relaxing the requirements via their documentation... this seems incoherent with saying that it may actually be unsound to ever temporarily violate the requirement. Furthermore, a function *cannot* just relax this for its return value, that would in general be unsound. And the part about "unsafe code in a safe function may assume these invariants are ensured of arguments passed by the caller" also interacts with relaxing things: clearly, if the invariant has been relaxed, unsafe code cannot rely on it any more. There may be a place to give general guidance on what kinds of function contracts can exist, but the reference type is definitely not the right place to write that down. I also took a clarification from rust-lang#121965 that is orthogonal to the rest of that PR. Cc `@joshlf` `@scottmcm`
…, r=scottmcm reference type safety invariant docs: clarification The old text could have been read as saying that you can call a function if these requirements are upheld, which is definitely not true as they are an underapproximation of the actual safety invariant. I removed the part about functions relaxing the requirements via their documentation... this seems incoherent with saying that it may actually be unsound to ever temporarily violate the requirement. Furthermore, a function *cannot* just relax this for its return value, that would in general be unsound. And the part about "unsafe code in a safe function may assume these invariants are ensured of arguments passed by the caller" also interacts with relaxing things: clearly, if the invariant has been relaxed, unsafe code cannot rely on it any more. There may be a place to give general guidance on what kinds of function contracts can exist, but the reference type is definitely not the right place to write that down. I also took a clarification from rust-lang#121965 that is orthogonal to the rest of that PR. Cc ``@joshlf`` ``@scottmcm``
…, r=scottmcm reference type safety invariant docs: clarification The old text could have been read as saying that you can call a function if these requirements are upheld, which is definitely not true as they are an underapproximation of the actual safety invariant. I removed the part about functions relaxing the requirements via their documentation... this seems incoherent with saying that it may actually be unsound to ever temporarily violate the requirement. Furthermore, a function *cannot* just relax this for its return value, that would in general be unsound. And the part about "unsafe code in a safe function may assume these invariants are ensured of arguments passed by the caller" also interacts with relaxing things: clearly, if the invariant has been relaxed, unsafe code cannot rely on it any more. There may be a place to give general guidance on what kinds of function contracts can exist, but the reference type is definitely not the right place to write that down. I also took a clarification from rust-lang#121965 that is orthogonal to the rest of that PR. Cc ```@joshlf``` ```@scottmcm```
Rollup merge of rust-lang#125043 - RalfJung:ref-type-safety-invariant, r=scottmcm reference type safety invariant docs: clarification The old text could have been read as saying that you can call a function if these requirements are upheld, which is definitely not true as they are an underapproximation of the actual safety invariant. I removed the part about functions relaxing the requirements via their documentation... this seems incoherent with saying that it may actually be unsound to ever temporarily violate the requirement. Furthermore, a function *cannot* just relax this for its return value, that would in general be unsound. And the part about "unsafe code in a safe function may assume these invariants are ensured of arguments passed by the caller" also interacts with relaxing things: clearly, if the invariant has been relaxed, unsafe code cannot rely on it any more. There may be a place to give general guidance on what kinds of function contracts can exist, but the reference type is definitely not the right place to write that down. I also took a clarification from rust-lang#121965 that is orthogonal to the rest of that PR. Cc ```@joshlf``` ```@scottmcm```
reference type safety invariant docs: clarification The old text could have been read as saying that you can call a function if these requirements are upheld, which is definitely not true as they are an underapproximation of the actual safety invariant. I removed the part about functions relaxing the requirements via their documentation... this seems incoherent with saying that it may actually be unsound to ever temporarily violate the requirement. Furthermore, a function *cannot* just relax this for its return value, that would in general be unsound. And the part about "unsafe code in a safe function may assume these invariants are ensured of arguments passed by the caller" also interacts with relaxing things: clearly, if the invariant has been relaxed, unsafe code cannot rely on it any more. There may be a place to give general guidance on what kinds of function contracts can exist, but the reference type is definitely not the right place to write that down. I also took a clarification from rust-lang/rust#121965 that is orthogonal to the rest of that PR. Cc ```@joshlf``` ```@scottmcm```
t-opsem approved the desired validity invariant in rust-lang/unsafe-code-guidelines#510. The PR still needs adjustments though as noted above. |
@rustbot labels -I-lang-nominated We discussed this in the lang call today. We believe that what's in this PR is implied by the change to the Reference here: We decided to hold the FCP over on that Reference PR. Once FCP on that PR completes, this PR will be OK to move forward in terms of lang. |
@scottmcm |
The length limit on slices is clearly a safety invariant, and I'd like it to also be a validity invariant. With function parameter metadata making progress in LLVM, I'd really like to be able to use it when
&[_]
is passed as a scalar pair, in particular.The documentation for references is cagey about what exactly is a validity invariant, so for now just elaborate on the consequences of the existing safety rules on slices -- the length restriction follows from the
size_of_val
restriction -- as a way to help discourage people from trying to violate them.I also made the existing warning stronger, since I'm fairly sure it's already UB to violate at least the "references must be non-null" rule, rather than it just being that it "might be UB in the future".
cc @joshlf @RalfJung