-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Differences between *const T
and *mut T
. Initially *const T
pointers are forever read-only?
#257
Comments
These statements aren't true. the only thing that matters is how did you get the pointer, for example this is 100% correct rust code: let mut v = 5u8;
let ptr: *const u8 = unsafe {std::mem::transmute(&mut v)};
unsafe {*(ptr as *mut u8) = 7;}
assert_eq!(v, 7); So even though it started as a |
This is, I think, a duplicate of #227. I agree it is a problem. I just do not know a good solution.
The subtle aspect of this is that |
Ohhh that's what he was talking about, I'm sorry I misunderstood you @thomcc |
I hope I'm not off topic. I don't think it is, since
I've also found this post which basically asserts that it is entirely ok to use And, now I'm confused :) |
hm, we could say that this isn't a let x = [1, 2, 3];
let y = c_fun_which_takes_ptr_and_mutates_it(x.as_ptr() as *mut _)?; this is said to be a UB as well therefore I find that using |
AIUI, this actually has little to do with The only tricky part is the point @RalfJung mentioned when casting directly from a One option would be to warn on this direct cast ( Then you can be safe in treating |
Could we change |
The tricky bit would be to keep this code working: fn main() {
let x = &mut 0;
let shared = &*x;
let y = x as *const i32; // if we use *mut here instead, this stops compiling
let _val = *shared;
} Currently this works because OTOH, we do reject the |
How does let mut x = 0_i32;
let ptr_x: *const i32 = std::ptr::addr_of!(x);
let mut_ptr_x: *mut i32 = ptr_x as _;
unsafe { *mut_ptr_x = 2; } creates a |
Indeed, that's how it currently affects addr_of.
True. It also doesn't say that you can do that. The docs are not exhaustive for what you cannot do. (That would require infinitely large docs.) There has not been a decision on what the semantics should be here, and that's why the docs basically don't talk about this. It's not great, but absent a decision it's also not clear what else to do. And making the decision without having an entire aliasing model for all the context isn't really a good idea either. |
It’s true. The way “validity” is currently defined in the standard library docs doesn’t guarantee that any use of pointers from Is there a rationale for making |
If you want to write through the pointer you would use
It actually does under the examples section of the
|
I feel quoted out of context here -- for the question raised in that particular thread, my statement holds true. But specifically when converting a reference to a raw pointer, there is a difference.
Basically, because it matches what the borrow checker does -- see here further up this thread. |
avoid mixing accesses of ptrs derived from a mutable ref and parent ptrs `@Vanille-N` is working on a successor for Stacked Borrows. It will mostly accept strictly more code than Stacked Borrows did, with one exception: the following pattern no longer works. ```rust let mut root = 6u8; let mref = &mut root; let ptr = mref as *mut u8; *ptr = 0; // Write assert_eq!(root, 0); // Parent Read *ptr = 0; // Attempted Write ``` This worked in Stacked Borrows kind of by accident: when doing the "parent read", under SB we Disable `mref`, but the raw ptrs derived from it remain usable. The fact that we can still use the "children" of a reference that is no longer usable is quite nasty and leads to some undesirable effects (in particular it is the major blocker for resolving rust-lang/unsafe-code-guidelines#257). So in Tree Borrows we no longer do that; instead, reading from `root` makes `mref` and all its children read-only. Due to other improvements in Tree Borrows, the entire Miri test suite still passes with this new behavior, and even the entire libcore and liballoc test suite, except for these 2 cases this PR fixes. Both of these involve code where the programmer wrote `&mut` but then used pointers derived from that reference in ways that alias with the parent pointer, which arguably is violating uniqueness. They are fixed by properly using raw pointers throughout.
avoid mixing accesses of ptrs derived from a mutable ref and parent ptrs ``@Vanille-N`` is working on a successor for Stacked Borrows. It will mostly accept strictly more code than Stacked Borrows did, with one exception: the following pattern no longer works. ```rust let mut root = 6u8; let mref = &mut root; let ptr = mref as *mut u8; *ptr = 0; // Write assert_eq!(root, 0); // Parent Read *ptr = 0; // Attempted Write ``` This worked in Stacked Borrows kind of by accident: when doing the "parent read", under SB we Disable `mref`, but the raw ptrs derived from it remain usable. The fact that we can still use the "children" of a reference that is no longer usable is quite nasty and leads to some undesirable effects (in particular it is the major blocker for resolving rust-lang/unsafe-code-guidelines#257). So in Tree Borrows we no longer do that; instead, reading from `root` makes `mref` and all its children read-only. Due to other improvements in Tree Borrows, the entire Miri test suite still passes with this new behavior, and even the entire libcore and liballoc test suite, except for these 2 cases this PR fixes. Both of these involve code where the programmer wrote `&mut` but then used pointers derived from that reference in ways that alias with the parent pointer, which arguably is violating uniqueness. They are fixed by properly using raw pointers throughout.
Some updates on this:
|
To make sure it's remembered, there is some practical justification of examplelet mut x = &mut 5;
let r = &x;
let _ = x as *mut _;
// ^ERROR: cannot borrow as mutable ... also borrowed as immutable
let _ = x as *const _;
// allowed
let _ = addr_of_mut!(x);
// ^ERROR: cannot borrow as mutable ... also borrowed as immutable
let _ = addr_of!(x);
// allowed
dbg!(r); This doesn't mean that the opsem has to match this and create a pointer with shared provenance for the As long as providing derived mut provenance to the pointer doesn't impact the validity of extant provenance until the pointer is accessed, though, I agree that the more permissive model of giving the mut provenance when possible is desirable. (If the more permissive semantics are a pessimization to some code, it can probably be rewritten to introduce a shared reborrow and limit the provenance explicitly. Plus, managing two distinct simultaneously valid sibling raw provenances (one mut and one shr) seems like a nightmare.) Footnotes
|
I opened #400 for the specific question of whether |
@RalfJung the one thing I will point out here is that it does not apriori have to be the case that for |
I think it would be very surprising if those two ways of turning a mutable ref into a raw ptr would not do the same thing -- I feel fairly strongly they should be the same. However I can see the question of mutation of |
Users seem to have some kind of intuition that the expression inside In the indefinite future, we should have a stable |
avoid mixing accesses of ptrs derived from a mutable ref and parent ptrs ``@Vanille-N`` is working on a successor for Stacked Borrows. It will mostly accept strictly more code than Stacked Borrows did, with one exception: the following pattern no longer works. ```rust let mut root = 6u8; let mref = &mut root; let ptr = mref as *mut u8; *ptr = 0; // Write assert_eq!(root, 0); // Parent Read *ptr = 0; // Attempted Write ``` This worked in Stacked Borrows kind of by accident: when doing the "parent read", under SB we Disable `mref`, but the raw ptrs derived from it remain usable. The fact that we can still use the "children" of a reference that is no longer usable is quite nasty and leads to some undesirable effects (in particular it is the major blocker for resolving rust-lang/unsafe-code-guidelines#257). So in Tree Borrows we no longer do that; instead, reading from `root` makes `mref` and all its children read-only. Due to other improvements in Tree Borrows, the entire Miri test suite still passes with this new behavior, and even the entire libcore and liballoc test suite, except for these 2 cases this PR fixes. Both of these involve code where the programmer wrote `&mut` but then used pointers derived from that reference in ways that alias with the parent pointer, which arguably is violating uniqueness. They are fixed by properly using raw pointers throughout.
Two small potential arguments for
For full clarity, I am fully in support of preferring And I still think I weakly favor Because OTPT is nowhere near, I think this is an argument for stabilizing |
This sounds like a foot gun. I would have expected it to be captured by raw pointer. But that seems off topic here. I'll open an issue in the main rust repo after investigating this. |
TBH, I wouldn't expect a whole new capture mode here. Changing |
Closure capture is an interesting one. The consistent capture mode would be But really the main point to me is that |
I'm curious... what is the point of having two types As a developer, if I call a function that takes |
Well, it's intended as an indicator to programmers mostly. |
Using a |
We also have |
Stabilize `raw_ref_op` (RFC 2582) This stabilizes the syntax `&raw const $expr` and `&raw mut $expr`. It has existed unstably for ~4 years now, and has been exposed on stable via the `addr_of` and `addr_of_mut` macros since Rust 1.51 (released more than 3 years ago). I think it has become clear that these operations are here to stay. So it is about time we give them proper primitive syntax. This has two advantages over the macro: - Being macros, `addr_of`/`addr_of_mut` could in theory do arbitrary magic with the expression on which they work. The only "magic" they actually do is using the argument as a place expression rather than as a value expression. Place expressions are already a subtle topic and poorly understood by many programmers; having this hidden behind a macro using unstable language features makes this even worse. Conversely, people do have an idea of what happens below `&`/`&mut`, so we can make the subtle topic a lot more approachable by connecting to existing intuition. - The name `addr_of` is quite unfortunate from today's perspective, given that we have accepted provenance as a reality, which means that a pointer is *not* just an address. Strict provenance has a method, `addr`, which extracts the address of a pointer; using the term `addr` in two different ways is quite unfortunate. That's why this PR soft-deprecates `addr_of` -- we will wait a long time before actually showing any warning here, but we should start telling people that the "addr" part of this name is somewhat misleading, and `&raw` avoids that potential confusion. In summary, this syntax improves developers' ability to conceptualize the operational semantics of Rust, while making a fundamental operation frequently used in unsafe code feel properly built in. Possible questions to consider, based on the RFC and [this](rust-lang#64490 (comment)) great summary by `@CAD97:` - Some questions are entirely about the semantics. The semantics are the same as with the macros so I don't think this should have any impact on this syntax PR. Still, for completeness' sake: - Should `&raw const *mut_ref` give a read-only pointer? - Tracked at: rust-lang/unsafe-code-guidelines#257 - I think ideally the answer is "no". Stacked Borrows says that pointer is read-only, but Tree Borrows says it is mutable. - What exactly does `&raw const (*ptr).field` require? Answered in [the reference](https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html): the arithmetic to compute the field offset follows the rules of `ptr::offset`, making it UB if it goes out-of-bounds. Making this a safe operation (using `wrapping_offset` rules) is considered too much of a loss for alias analysis. - Choose a different syntax? I don't want to re-litigate the RFC. The only credible alternative that has been proposed is `&raw $place` instead of `&raw const $place`, which (IIUC) could be achieved by making `raw` a contextual keyword in a new edition. The type is named `*const T`, so the explicit `const` is consistent in that regard. `&raw expr` lacks the explicit indication of immutability. However, `&raw const expr` is quite a but longer than `addr_of!(expr)`. - Shouldn't we have a completely new, better raw pointer type instead? Yes we all want to see that happen -- but I don't think we should block stabilization on that, given that such a nicer type is not on the horizon currently and given the issues with `addr_of!` mentioned above. (If we keep the `&raw $place` syntax free for this, we could use it in the future for that new type.) - What about the lint the RFC talked about? It hasn't been implemented yet. Given that the problematic code is UB with or without this stabilization, I don't think the lack of the lint should block stabilization. - I created an issue to track adding it: rust-lang#127724 - Other points from the "future possibilites of the RFC - "Syntactic sugar" extension: this has not been implemented. I'd argue this is too confusing, we should stick to what the RFC suggested and if we want to do anything about such expressions, add the lint. - Encouraging / requiring `&raw` in situations where references are often/definitely incorrect: this has been / is being implemented. On packed fields this already is a hard error, and for `static mut` a lint suggesting raw pointers is being rolled out. - Lowering of casts: this has been implemented. (It's also an invisible implementation detail.) - `offsetof` woes: we now have native `offset_of` so this is not relevant any more. To be done before landing: - [x] Suppress `unused_parens` lint around `&raw {const|mut}` expressions - See bottom of rust-lang#127679 (comment) for rationale - Implementation: rust-lang#128782 - [ ] Update the Reference. - rust-lang/reference#1567 Fixes rust-lang#64490 cc `@rust-lang/lang` `@rust-lang/opsem` try-job: x86_64-msvc try-job: i686-mingw try-job: test-various try-job: dist-various-1 try-job: armhf-gnu try-job: aarch64-apple
Stabilize `raw_ref_op` (RFC 2582) This stabilizes the syntax `&raw const $expr` and `&raw mut $expr`. It has existed unstably for ~4 years now, and has been exposed on stable via the `addr_of` and `addr_of_mut` macros since Rust 1.51 (released more than 3 years ago). I think it has become clear that these operations are here to stay. So it is about time we give them proper primitive syntax. This has two advantages over the macro: - Being macros, `addr_of`/`addr_of_mut` could in theory do arbitrary magic with the expression on which they work. The only "magic" they actually do is using the argument as a place expression rather than as a value expression. Place expressions are already a subtle topic and poorly understood by many programmers; having this hidden behind a macro using unstable language features makes this even worse. Conversely, people do have an idea of what happens below `&`/`&mut`, so we can make the subtle topic a lot more approachable by connecting to existing intuition. - The name `addr_of` is quite unfortunate from today's perspective, given that we have accepted provenance as a reality, which means that a pointer is *not* just an address. Strict provenance has a method, `addr`, which extracts the address of a pointer; using the term `addr` in two different ways is quite unfortunate. That's why this PR soft-deprecates `addr_of` -- we will wait a long time before actually showing any warning here, but we should start telling people that the "addr" part of this name is somewhat misleading, and `&raw` avoids that potential confusion. In summary, this syntax improves developers' ability to conceptualize the operational semantics of Rust, while making a fundamental operation frequently used in unsafe code feel properly built in. Possible questions to consider, based on the RFC and [this](rust-lang#64490 (comment)) great summary by `@CAD97:` - Some questions are entirely about the semantics. The semantics are the same as with the macros so I don't think this should have any impact on this syntax PR. Still, for completeness' sake: - Should `&raw const *mut_ref` give a read-only pointer? - Tracked at: rust-lang/unsafe-code-guidelines#257 - I think ideally the answer is "no". Stacked Borrows says that pointer is read-only, but Tree Borrows says it is mutable. - What exactly does `&raw const (*ptr).field` require? Answered in [the reference](https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html): the arithmetic to compute the field offset follows the rules of `ptr::offset`, making it UB if it goes out-of-bounds. Making this a safe operation (using `wrapping_offset` rules) is considered too much of a loss for alias analysis. - Choose a different syntax? I don't want to re-litigate the RFC. The only credible alternative that has been proposed is `&raw $place` instead of `&raw const $place`, which (IIUC) could be achieved by making `raw` a contextual keyword in a new edition. The type is named `*const T`, so the explicit `const` is consistent in that regard. `&raw expr` lacks the explicit indication of immutability. However, `&raw const expr` is quite a but longer than `addr_of!(expr)`. - Shouldn't we have a completely new, better raw pointer type instead? Yes we all want to see that happen -- but I don't think we should block stabilization on that, given that such a nicer type is not on the horizon currently and given the issues with `addr_of!` mentioned above. (If we keep the `&raw $place` syntax free for this, we could use it in the future for that new type.) - What about the lint the RFC talked about? It hasn't been implemented yet. Given that the problematic code is UB with or without this stabilization, I don't think the lack of the lint should block stabilization. - I created an issue to track adding it: rust-lang#127724 - Other points from the "future possibilites of the RFC - "Syntactic sugar" extension: this has not been implemented. I'd argue this is too confusing, we should stick to what the RFC suggested and if we want to do anything about such expressions, add the lint. - Encouraging / requiring `&raw` in situations where references are often/definitely incorrect: this has been / is being implemented. On packed fields this already is a hard error, and for `static mut` a lint suggesting raw pointers is being rolled out. - Lowering of casts: this has been implemented. (It's also an invisible implementation detail.) - `offsetof` woes: we now have native `offset_of` so this is not relevant any more. To be done before landing: - [x] Suppress `unused_parens` lint around `&raw {const|mut}` expressions - See bottom of rust-lang#127679 (comment) for rationale - Implementation: rust-lang#128782 - [ ] Update the Reference. - rust-lang/reference#1567 Fixes rust-lang#64490 cc `@rust-lang/lang` `@rust-lang/opsem` // try-job: i686-mingw // `dump-ice-to-disk` is flaky try-job: x86_64-msvc try-job: test-various try-job: dist-various-1 try-job: armhf-gnu try-job: aarch64-apple
…,jieyouxu Stabilize `raw_ref_op` (RFC 2582) This stabilizes the syntax `&raw const $expr` and `&raw mut $expr`. It has existed unstably for ~4 years now, and has been exposed on stable via the `addr_of` and `addr_of_mut` macros since Rust 1.51 (released more than 3 years ago). I think it has become clear that these operations are here to stay. So it is about time we give them proper primitive syntax. This has two advantages over the macro: - Being macros, `addr_of`/`addr_of_mut` could in theory do arbitrary magic with the expression on which they work. The only "magic" they actually do is using the argument as a place expression rather than as a value expression. Place expressions are already a subtle topic and poorly understood by many programmers; having this hidden behind a macro using unstable language features makes this even worse. Conversely, people do have an idea of what happens below `&`/`&mut`, so we can make the subtle topic a lot more approachable by connecting to existing intuition. - The name `addr_of` is quite unfortunate from today's perspective, given that we have accepted provenance as a reality, which means that a pointer is *not* just an address. Strict provenance has a method, `addr`, which extracts the address of a pointer; using the term `addr` in two different ways is quite unfortunate. That's why this PR soft-deprecates `addr_of` -- we will wait a long time before actually showing any warning here, but we should start telling people that the "addr" part of this name is somewhat misleading, and `&raw` avoids that potential confusion. In summary, this syntax improves developers' ability to conceptualize the operational semantics of Rust, while making a fundamental operation frequently used in unsafe code feel properly built in. Possible questions to consider, based on the RFC and [this](rust-lang#64490 (comment)) great summary by `@CAD97:` - Some questions are entirely about the semantics. The semantics are the same as with the macros so I don't think this should have any impact on this syntax PR. Still, for completeness' sake: - Should `&raw const *mut_ref` give a read-only pointer? - Tracked at: rust-lang/unsafe-code-guidelines#257 - I think ideally the answer is "no". Stacked Borrows says that pointer is read-only, but Tree Borrows says it is mutable. - What exactly does `&raw const (*ptr).field` require? Answered in [the reference](https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html): the arithmetic to compute the field offset follows the rules of `ptr::offset`, making it UB if it goes out-of-bounds. Making this a safe operation (using `wrapping_offset` rules) is considered too much of a loss for alias analysis. - Choose a different syntax? I don't want to re-litigate the RFC. The only credible alternative that has been proposed is `&raw $place` instead of `&raw const $place`, which (IIUC) could be achieved by making `raw` a contextual keyword in a new edition. The type is named `*const T`, so the explicit `const` is consistent in that regard. `&raw expr` lacks the explicit indication of immutability. However, `&raw const expr` is quite a but longer than `addr_of!(expr)`. - Shouldn't we have a completely new, better raw pointer type instead? Yes we all want to see that happen -- but I don't think we should block stabilization on that, given that such a nicer type is not on the horizon currently and given the issues with `addr_of!` mentioned above. (If we keep the `&raw $place` syntax free for this, we could use it in the future for that new type.) - What about the lint the RFC talked about? It hasn't been implemented yet. Given that the problematic code is UB with or without this stabilization, I don't think the lack of the lint should block stabilization. - I created an issue to track adding it: rust-lang#127724 - Other points from the "future possibilites of the RFC - "Syntactic sugar" extension: this has not been implemented. I'd argue this is too confusing, we should stick to what the RFC suggested and if we want to do anything about such expressions, add the lint. - Encouraging / requiring `&raw` in situations where references are often/definitely incorrect: this has been / is being implemented. On packed fields this already is a hard error, and for `static mut` a lint suggesting raw pointers is being rolled out. - Lowering of casts: this has been implemented. (It's also an invisible implementation detail.) - `offsetof` woes: we now have native `offset_of` so this is not relevant any more. To be done before landing: - [x] Suppress `unused_parens` lint around `&raw {const|mut}` expressions - See bottom of rust-lang#127679 (comment) for rationale - Implementation: rust-lang#128782 - [ ] Update the Reference. - rust-lang/reference#1567 Fixes rust-lang#64490 cc `@rust-lang/lang` `@rust-lang/opsem` try-job: x86_64-msvc try-job: test-various try-job: dist-various-1 try-job: armhf-gnu try-job: aarch64-apple
Stabilize `raw_ref_op` (RFC 2582) This stabilizes the syntax `&raw const $expr` and `&raw mut $expr`. It has existed unstably for ~4 years now, and has been exposed on stable via the `addr_of` and `addr_of_mut` macros since Rust 1.51 (released more than 3 years ago). I think it has become clear that these operations are here to stay. So it is about time we give them proper primitive syntax. This has two advantages over the macro: - Being macros, `addr_of`/`addr_of_mut` could in theory do arbitrary magic with the expression on which they work. The only "magic" they actually do is using the argument as a place expression rather than as a value expression. Place expressions are already a subtle topic and poorly understood by many programmers; having this hidden behind a macro using unstable language features makes this even worse. Conversely, people do have an idea of what happens below `&`/`&mut`, so we can make the subtle topic a lot more approachable by connecting to existing intuition. - The name `addr_of` is quite unfortunate from today's perspective, given that we have accepted provenance as a reality, which means that a pointer is *not* just an address. Strict provenance has a method, `addr`, which extracts the address of a pointer; using the term `addr` in two different ways is quite unfortunate. That's why this PR soft-deprecates `addr_of` -- we will wait a long time before actually showing any warning here, but we should start telling people that the "addr" part of this name is somewhat misleading, and `&raw` avoids that potential confusion. In summary, this syntax improves developers' ability to conceptualize the operational semantics of Rust, while making a fundamental operation frequently used in unsafe code feel properly built in. Possible questions to consider, based on the RFC and [this](rust-lang#64490 (comment)) great summary by `@CAD97:` - Some questions are entirely about the semantics. The semantics are the same as with the macros so I don't think this should have any impact on this syntax PR. Still, for completeness' sake: - Should `&raw const *mut_ref` give a read-only pointer? - Tracked at: rust-lang/unsafe-code-guidelines#257 - I think ideally the answer is "no". Stacked Borrows says that pointer is read-only, but Tree Borrows says it is mutable. - What exactly does `&raw const (*ptr).field` require? Answered in [the reference](https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html): the arithmetic to compute the field offset follows the rules of `ptr::offset`, making it UB if it goes out-of-bounds. Making this a safe operation (using `wrapping_offset` rules) is considered too much of a loss for alias analysis. - Choose a different syntax? I don't want to re-litigate the RFC. The only credible alternative that has been proposed is `&raw $place` instead of `&raw const $place`, which (IIUC) could be achieved by making `raw` a contextual keyword in a new edition. The type is named `*const T`, so the explicit `const` is consistent in that regard. `&raw expr` lacks the explicit indication of immutability. However, `&raw const expr` is quite a but longer than `addr_of!(expr)`. - Shouldn't we have a completely new, better raw pointer type instead? Yes we all want to see that happen -- but I don't think we should block stabilization on that, given that such a nicer type is not on the horizon currently and given the issues with `addr_of!` mentioned above. (If we keep the `&raw $place` syntax free for this, we could use it in the future for that new type.) - What about the lint the RFC talked about? It hasn't been implemented yet. Given that the problematic code is UB with or without this stabilization, I don't think the lack of the lint should block stabilization. - I created an issue to track adding it: rust-lang#127724 - Other points from the "future possibilites of the RFC - "Syntactic sugar" extension: this has not been implemented. I'd argue this is too confusing, we should stick to what the RFC suggested and if we want to do anything about such expressions, add the lint. - Encouraging / requiring `&raw` in situations where references are often/definitely incorrect: this has been / is being implemented. On packed fields this already is a hard error, and for `static mut` a lint suggesting raw pointers is being rolled out. - Lowering of casts: this has been implemented. (It's also an invisible implementation detail.) - `offsetof` woes: we now have native `offset_of` so this is not relevant any more. To be done before landing: - [x] Suppress `unused_parens` lint around `&raw {const|mut}` expressions - See bottom of rust-lang#127679 (comment) for rationale - Implementation: rust-lang#128782 - [ ] Update the Reference. - rust-lang/reference#1567 Fixes rust-lang#64490 cc `@rust-lang/lang` `@rust-lang/opsem` try-job: x86_64-msvc try-job: test-various try-job: dist-various-1 try-job: armhf-gnu try-job: aarch64-apple
Rollup merge of rust-lang#127679 - RalfJung:raw_ref_op, r=jieyouxu Stabilize `raw_ref_op` (RFC 2582) This stabilizes the syntax `&raw const $expr` and `&raw mut $expr`. It has existed unstably for ~4 years now, and has been exposed on stable via the `addr_of` and `addr_of_mut` macros since Rust 1.51 (released more than 3 years ago). I think it has become clear that these operations are here to stay. So it is about time we give them proper primitive syntax. This has two advantages over the macro: - Being macros, `addr_of`/`addr_of_mut` could in theory do arbitrary magic with the expression on which they work. The only "magic" they actually do is using the argument as a place expression rather than as a value expression. Place expressions are already a subtle topic and poorly understood by many programmers; having this hidden behind a macro using unstable language features makes this even worse. Conversely, people do have an idea of what happens below `&`/`&mut`, so we can make the subtle topic a lot more approachable by connecting to existing intuition. - The name `addr_of` is quite unfortunate from today's perspective, given that we have accepted provenance as a reality, which means that a pointer is *not* just an address. Strict provenance has a method, `addr`, which extracts the address of a pointer; using the term `addr` in two different ways is quite unfortunate. That's why this PR soft-deprecates `addr_of` -- we will wait a long time before actually showing any warning here, but we should start telling people that the "addr" part of this name is somewhat misleading, and `&raw` avoids that potential confusion. In summary, this syntax improves developers' ability to conceptualize the operational semantics of Rust, while making a fundamental operation frequently used in unsafe code feel properly built in. Possible questions to consider, based on the RFC and [this](rust-lang#64490 (comment)) great summary by `@CAD97:` - Some questions are entirely about the semantics. The semantics are the same as with the macros so I don't think this should have any impact on this syntax PR. Still, for completeness' sake: - Should `&raw const *mut_ref` give a read-only pointer? - Tracked at: rust-lang/unsafe-code-guidelines#257 - I think ideally the answer is "no". Stacked Borrows says that pointer is read-only, but Tree Borrows says it is mutable. - What exactly does `&raw const (*ptr).field` require? Answered in [the reference](https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html): the arithmetic to compute the field offset follows the rules of `ptr::offset`, making it UB if it goes out-of-bounds. Making this a safe operation (using `wrapping_offset` rules) is considered too much of a loss for alias analysis. - Choose a different syntax? I don't want to re-litigate the RFC. The only credible alternative that has been proposed is `&raw $place` instead of `&raw const $place`, which (IIUC) could be achieved by making `raw` a contextual keyword in a new edition. The type is named `*const T`, so the explicit `const` is consistent in that regard. `&raw expr` lacks the explicit indication of immutability. However, `&raw const expr` is quite a but longer than `addr_of!(expr)`. - Shouldn't we have a completely new, better raw pointer type instead? Yes we all want to see that happen -- but I don't think we should block stabilization on that, given that such a nicer type is not on the horizon currently and given the issues with `addr_of!` mentioned above. (If we keep the `&raw $place` syntax free for this, we could use it in the future for that new type.) - What about the lint the RFC talked about? It hasn't been implemented yet. Given that the problematic code is UB with or without this stabilization, I don't think the lack of the lint should block stabilization. - I created an issue to track adding it: rust-lang#127724 - Other points from the "future possibilites of the RFC - "Syntactic sugar" extension: this has not been implemented. I'd argue this is too confusing, we should stick to what the RFC suggested and if we want to do anything about such expressions, add the lint. - Encouraging / requiring `&raw` in situations where references are often/definitely incorrect: this has been / is being implemented. On packed fields this already is a hard error, and for `static mut` a lint suggesting raw pointers is being rolled out. - Lowering of casts: this has been implemented. (It's also an invisible implementation detail.) - `offsetof` woes: we now have native `offset_of` so this is not relevant any more. To be done before landing: - [x] Suppress `unused_parens` lint around `&raw {const|mut}` expressions - See bottom of rust-lang#127679 (comment) for rationale - Implementation: rust-lang#128782 - [ ] Update the Reference. - rust-lang/reference#1567 Fixes rust-lang#64490 cc `@rust-lang/lang` `@rust-lang/opsem` try-job: x86_64-msvc try-job: test-various try-job: dist-various-1 try-job: armhf-gnu try-job: aarch64-apple
Stabilize `raw_ref_op` (RFC 2582) This stabilizes the syntax `&raw const $expr` and `&raw mut $expr`. It has existed unstably for ~4 years now, and has been exposed on stable via the `addr_of` and `addr_of_mut` macros since Rust 1.51 (released more than 3 years ago). I think it has become clear that these operations are here to stay. So it is about time we give them proper primitive syntax. This has two advantages over the macro: - Being macros, `addr_of`/`addr_of_mut` could in theory do arbitrary magic with the expression on which they work. The only "magic" they actually do is using the argument as a place expression rather than as a value expression. Place expressions are already a subtle topic and poorly understood by many programmers; having this hidden behind a macro using unstable language features makes this even worse. Conversely, people do have an idea of what happens below `&`/`&mut`, so we can make the subtle topic a lot more approachable by connecting to existing intuition. - The name `addr_of` is quite unfortunate from today's perspective, given that we have accepted provenance as a reality, which means that a pointer is *not* just an address. Strict provenance has a method, `addr`, which extracts the address of a pointer; using the term `addr` in two different ways is quite unfortunate. That's why this PR soft-deprecates `addr_of` -- we will wait a long time before actually showing any warning here, but we should start telling people that the "addr" part of this name is somewhat misleading, and `&raw` avoids that potential confusion. In summary, this syntax improves developers' ability to conceptualize the operational semantics of Rust, while making a fundamental operation frequently used in unsafe code feel properly built in. Possible questions to consider, based on the RFC and [this](rust-lang/rust#64490 (comment)) great summary by `@CAD97:` - Some questions are entirely about the semantics. The semantics are the same as with the macros so I don't think this should have any impact on this syntax PR. Still, for completeness' sake: - Should `&raw const *mut_ref` give a read-only pointer? - Tracked at: rust-lang/unsafe-code-guidelines#257 - I think ideally the answer is "no". Stacked Borrows says that pointer is read-only, but Tree Borrows says it is mutable. - What exactly does `&raw const (*ptr).field` require? Answered in [the reference](https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html): the arithmetic to compute the field offset follows the rules of `ptr::offset`, making it UB if it goes out-of-bounds. Making this a safe operation (using `wrapping_offset` rules) is considered too much of a loss for alias analysis. - Choose a different syntax? I don't want to re-litigate the RFC. The only credible alternative that has been proposed is `&raw $place` instead of `&raw const $place`, which (IIUC) could be achieved by making `raw` a contextual keyword in a new edition. The type is named `*const T`, so the explicit `const` is consistent in that regard. `&raw expr` lacks the explicit indication of immutability. However, `&raw const expr` is quite a but longer than `addr_of!(expr)`. - Shouldn't we have a completely new, better raw pointer type instead? Yes we all want to see that happen -- but I don't think we should block stabilization on that, given that such a nicer type is not on the horizon currently and given the issues with `addr_of!` mentioned above. (If we keep the `&raw $place` syntax free for this, we could use it in the future for that new type.) - What about the lint the RFC talked about? It hasn't been implemented yet. Given that the problematic code is UB with or without this stabilization, I don't think the lack of the lint should block stabilization. - I created an issue to track adding it: rust-lang/rust#127724 - Other points from the "future possibilites of the RFC - "Syntactic sugar" extension: this has not been implemented. I'd argue this is too confusing, we should stick to what the RFC suggested and if we want to do anything about such expressions, add the lint. - Encouraging / requiring `&raw` in situations where references are often/definitely incorrect: this has been / is being implemented. On packed fields this already is a hard error, and for `static mut` a lint suggesting raw pointers is being rolled out. - Lowering of casts: this has been implemented. (It's also an invisible implementation detail.) - `offsetof` woes: we now have native `offset_of` so this is not relevant any more. To be done before landing: - [x] Suppress `unused_parens` lint around `&raw {const|mut}` expressions - See bottom of rust-lang/rust#127679 (comment) for rationale - Implementation: rust-lang/rust#128782 - [ ] Update the Reference. - rust-lang/reference#1567 Fixes rust-lang/rust#64490 cc `@rust-lang/lang` `@rust-lang/opsem` try-job: x86_64-msvc try-job: test-various try-job: dist-various-1 try-job: armhf-gnu try-job: aarch64-apple
I hadn't seen this but it's very surprising and should be documented better. Apparently,
*mut T
and*const T
aren't equivalent — if the raw pointer starts out as a*const T
it will always be illegal to write to, even nothing beyond the pointer's "memory" of its initial state is the reason for this.See: rust-lang/rust-clippy#4774 (comment)
This is extremely surprising, as lots of documentation and common wisdom indicates that
*const T
vs*mut T
are identical except as a sort of lint, and that the variance is different.In fact, often having correct variance in your types often forces using
*const
even for mutable data (hence NonNull uses const). The prevalence of this certainly helps contribute to programmers belief that there's no meaningful difference, you just have to be sure the data you write to is legal for you to write to.A common case where this happens is if you write a helper method to return a pointer, you might write this just once for the
*const T
case, and use it even if you're a&mut self
and need a*mut T
result. I wouldn't think twice about this, mostly because the myth they're equivalent is so widespreadRalf's comment here even further propagates this myth, in a thread explicitly asking about the differences here... https://internals.rust-lang.org/t/what-is-the-real-difference-between-const-t-and-mut-t-raw-pointers/6127/18 :
More broadly, nowhere in the thread does the 'once
*const
, always*const
' behavior come up, just that you need to make sure that you maintain the normal rust rules (e.g. the rules which would apply had you started out with a*mut T
).I looked and in none of Rust's reference material could I find any mention of behavior like this. This is very surprising, and I had been under the impression that optimising accesses to raw pointers wasn't beneficial enough for Rust to care strongly about them.
I also think this breaks a lot of existing unsafe code given how widespread the belief that they are equivalent is, and makes non-const-correct C libraries much thornier to bind to :(
The text was updated successfully, but these errors were encountered: