Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

const interning: decide about mutability purely based on the kind of interning, not the types we see #116745

Closed
wants to merge 2 commits into from

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented Oct 14, 2023

This entirely replaces our const-eval interner, i.e. the code that takes the final result of a constant evaluation from the local memory of the const-eval machine to the global tcx memory. The main goal of this change is to ensure that we can detect mutable references that sneak into this final value -- this is something we want to reject for static and const, and while const-checking performs some static analysis to ensure this, I would be much more comfortable stabilizing const_mut_refs if we had a dynamic check that sanitizes the final value. (This is generally the approach we have been using on const-eval: do a static check to give nice errors upfront, and then do a dynamic check to be really sure that the properties we need for soundness, actually hold.)

The new interner is a lot simpler than the old one: it recursively traverses the allocation holding the final result, and all allocations reachable from it, and ensures they all get interned. The initial allocation is interned as immutable for const and pomoted and non-interior-mutable static; all other allocations are interned as immutable for static, const, and promoted. The main subtlety is justifying that those inner allocations may indeed be interned immutably, i.e., that mutating them later would anyway already be UB:

  • for promoteds, we rely on the analysis that does promotion to ensure that this is sound
  • for const and static, we check that all pointers in the final result that point to things that are new (i.e., part of this const evaluation) are immutable, i.e., were created via &<expr> at a non-interior-mutable type.

Interning raises an error if it encounters a dangling pointer or a mutable pointer that violates the above rules.

I also extended our type-driven const validity checks to ensure that &mut T in the final value of a const points to mutable memory, at least if T is not zero-sized. This catches cases of people turning &i32 into &mut i32 (which would still be considered a read-only pointer). Similarly, when these checks encounter an UnsafeCell, they are checking that it lives in mutable memory. (Both of these only traverse the newly created values; if those point to other consts/promoteds, the check stops there. But that's okay, we don't have to catch all the UB.) I co-developed this with the stricter interner changes but I can split it out into a separate PR if you prefer.

This PR does have the immediate effect of allowing some new code on stable, for instance:

const CONST_RAW: *const Vec<i32> = &Vec::new() as *const _;

Previously that code got rejected since the type-based interner didn't know what to do with that pointer. It's a raw pointer, we cannot trust its type. The new interner does not care about types so it sees no issue with this code; there's an immutable pointer pointing to some read-only memory (storing a Vec<i32>), all is good. I can't think of a way to reject this code without type-based interning. Accepting this code pretty much commits us to non-type-based interning, but I think that's the better strategy anyway.

This PR also leads to slightly worse error messages when the final value of a const contains a dangling reference. Previously we would complete interning and then the type-based validation would detect this dangling reference and show a nice error saying where in the value (i.e., in which field) the dangling reference is located. However, the new interner cannot distinguish dangling references from dangling raw pointers, so it must throw an error when it encounters either of them. It doesn't have an understanding of the value structure so all it can say is "somewhere in this constant there's a dangling pointer". (Later parts of the compiler don't like dangling pointers/references so we have to reject them either during interning or during validation.) This could potentially be improved by doing validation before interning, but that's a larger change that I have not attempted yet. (It's also subtle since we do want validation to use the final mutability bits of all involved allocations, and currently it is interning that marks a bunch of allocations as immutable -- that would have to still happen before validation.)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 14, 2023
@rustbot
Copy link
Collaborator

rustbot commented Oct 14, 2023

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

@oli-obk
Copy link
Contributor

oli-obk commented Oct 15, 2023

  1. This affects for instance static mut FOO: &i32 = &0;.

That uses promotion, but static mut FOO: &i32 = {let x = 0; &{x}}; or something like it should do it

@oli-obk
Copy link
Contributor

oli-obk commented Oct 15, 2023

2. and same for this one.

I think you meant https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=20d43a96a5f0b49e77da0a54a397633b which is rejected during interning due to a dangling pointer

@RalfJung
Copy link
Member Author

RalfJung commented Oct 15, 2023 via email

@oli-obk
Copy link
Contributor

oli-obk commented Oct 15, 2023

I thought leading & is not promotion but the "outer scope rule"?

Promotion happens first 😅

@RalfJung
Copy link
Member Author

RalfJung commented Oct 15, 2023 via email

@rustbot
Copy link
Collaborator

rustbot commented Oct 15, 2023

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

@RalfJung RalfJung force-pushed the intern-without-types branch 2 times, most recently from dc43935 to 22106c3 Compare October 15, 2023 15:02
@RalfJung
Copy link
Member Author

Okay, I added the test, as a Miri-unleash test to ensure this works even if the static checks are sleeping. The new checks are basically just slightly generalizing existing checks we already had to prevent UnsafeCell and &mut in constants.

I do wonder if there is any way one could have previously declared a static that contains an UnsafeCell, which would now be rejected. Might be worth a crater run.
@bors try

@RalfJung RalfJung changed the title Draft: const interning: decide about mutability purely based on the kind of interning, not the types we see const interning: decide about mutability purely based on the kind of interning, not the types we see Oct 15, 2023
@bors
Copy link
Contributor

bors commented Oct 15, 2023

⌛ Trying commit 22106c3 with merge e32241d...

bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 15, 2023
const interning: decide about mutability purely based on the kind of interning, not the types we see

r? `@oli-obk` this is what I meant on Zulip. For now I left the type visitor in the code; removing it and switching to a simple interning loop will mean we accept code that we currently reject, such as this
```rust
const CONST_RAW: *const Vec<i32> = &Vec::new() as *const _;
```
I see no reason for us to reject such code, but accepting it should go through t-lang FCP, so I want to do that in a follow-up PR.

This PR does change behavior in the following situations:
1. Shared references inside `static mut` are no longer put in read-only memory. This affects for instance `static mut FOO: &i32 = &0;`. We never *promised* that this would be read-only, and `static mut` is [an anti-pattern anyway](rust-lang#53639), so I think this is fine. If you want read-only memory, write this as `static INNER: i32 = 0; static mut FOO: &i32 = &INNER;`.
2. Potentially, mutable things in a `static` are now marked read-only. That would be a problem. But I am not sure if that can happen? The code mentions `static FOO: *const AtomicUsize = &AtomicUsize::new(42)`, but that is rejected for being non-`Sync`. [This variant](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=112e930ae1b3ef285812ab404ca296fa) also gets rejected, and same for [this one](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0dac8d173a2b3099b9c2854fdad7a87c). I think we should reject all cases where a `static` introduces mutable state, except for the outermost allocation itself which can have interior mutability (and which is the one allocation where we have fully reliable type information).

What I still want to do in this PR before it is ready for review it is ensure we detect situations where `&mut` or `&UnsafeCell` points to immutable allocations. That should detect if we have any instance of case (2). That check should be part of the regular type validity check though, not part of interning.
@bors
Copy link
Contributor

bors commented Oct 15, 2023

☀️ Try build successful - checks-actions
Build commit: e32241d (e32241d5b12c6460525a3802a78cb7b87f73242c)

@RalfJung
Copy link
Member Author

@craterbot check

@craterbot
Copy link
Collaborator

👌 Experiment pr-116745 created and queued.
🤖 Automatically detected try build e32241d
⚠️ Try build based on commit 22106c3, but latest commit is 8e4a512. Did you forget to make a new try build?
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-crater Status: Waiting on a crater run to be completed. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 15, 2023
@oli-obk
Copy link
Contributor

oli-obk commented Oct 16, 2023

@rust-timer build e32241d

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (e32241d): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.1% [2.1%, 2.1%] 1
Regressions ❌
(secondary)
3.3% [2.2%, 4.3%] 2
Improvements ✅
(primary)
-2.7% [-2.7%, -2.7%] 1
Improvements ✅
(secondary)
-1.3% [-1.4%, -1.1%] 2
All ❌✅ (primary) -0.3% [-2.7%, 2.1%] 2

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.7% [-2.8%, -2.6%] 2
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 623.941s -> 623.55s (-0.06%)
Artifact size: 305.50 MiB -> 305.49 MiB (-0.00%)

@RalfJung
Copy link
Member Author

I expect we'll see perf improvements when the type-based traversal is completely removed from the interner.

@bors
Copy link
Contributor

bors commented Oct 16, 2023

☔ The latest upstream changes (presumably #114330) made this pull request unmergeable. Please resolve the merge conflicts.

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 7, 2023
…saethlin

compile-time evaluation: detect writes through immutable pointers

This has two motivations:
- it unblocks rust-lang#116745 (and therefore takes a big step towards `const_mut_refs` stabilization), because we can now detect if the memory that we find in `const` can be interned as "immutable"
- it would detect the UB that was uncovered in rust-lang#117905, which was caused by accidental stabilization of `copy` functions in `const` that can only be called with UB

When UB is detected, we emit a future-compat warn-by-default lint. This is not a breaking change, so completely in line with [the const-UB RFC](https://rust-lang.github.io/rfcs/3016-const-ub.html), meaning we don't need t-lang FCP here. I made the lint immediately show up for dependencies since it is nearly impossible to even trigger this lint without `const_mut_refs` -- the accidentally stabilized `copy` functions are the only way this can happen, so the crates that popped up in rust-lang#117905 are the only causes of such UB (in the code that crater covers), and the three cases of UB that we know about have all been fixed in their respective crates already.

The way this is implemented is by making use of the fact that our interpreter is already generic over the notion of provenance. For CTFE we now use the new `CtfeProvenance` type which is conceptually an `AllocId` plus a boolean `immutable` flag (but packed for a more efficient representation). This means we can mark a pointer as immutable when it is created as a shared reference. The flag will be propagated to all pointers derived from this one. We can then check the immutable flag on each write to reject writes through immutable pointers.

I just hope perf works out.
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Dec 8, 2023
compile-time evaluation: detect writes through immutable pointers

This has two motivations:
- it unblocks rust-lang/rust#116745 (and therefore takes a big step towards `const_mut_refs` stabilization), because we can now detect if the memory that we find in `const` can be interned as "immutable"
- it would detect the UB that was uncovered in rust-lang/rust#117905, which was caused by accidental stabilization of `copy` functions in `const` that can only be called with UB

When UB is detected, we emit a future-compat warn-by-default lint. This is not a breaking change, so completely in line with [the const-UB RFC](https://rust-lang.github.io/rfcs/3016-const-ub.html), meaning we don't need t-lang FCP here. I made the lint immediately show up for dependencies since it is nearly impossible to even trigger this lint without `const_mut_refs` -- the accidentally stabilized `copy` functions are the only way this can happen, so the crates that popped up in #117905 are the only causes of such UB (in the code that crater covers), and the three cases of UB that we know about have all been fixed in their respective crates already.

The way this is implemented is by making use of the fact that our interpreter is already generic over the notion of provenance. For CTFE we now use the new `CtfeProvenance` type which is conceptually an `AllocId` plus a boolean `immutable` flag (but packed for a more efficient representation). This means we can mark a pointer as immutable when it is created as a shared reference. The flag will be propagated to all pointers derived from this one. We can then check the immutable flag on each write to reject writes through immutable pointers.

I just hope perf works out.
bjorn3 pushed a commit to rust-lang/rustc_codegen_cranelift that referenced this pull request Dec 12, 2023
compile-time evaluation: detect writes through immutable pointers

This has two motivations:
- it unblocks rust-lang/rust#116745 (and therefore takes a big step towards `const_mut_refs` stabilization), because we can now detect if the memory that we find in `const` can be interned as "immutable"
- it would detect the UB that was uncovered in rust-lang/rust#117905, which was caused by accidental stabilization of `copy` functions in `const` that can only be called with UB

When UB is detected, we emit a future-compat warn-by-default lint. This is not a breaking change, so completely in line with [the const-UB RFC](https://rust-lang.github.io/rfcs/3016-const-ub.html), meaning we don't need t-lang FCP here. I made the lint immediately show up for dependencies since it is nearly impossible to even trigger this lint without `const_mut_refs` -- the accidentally stabilized `copy` functions are the only way this can happen, so the crates that popped up in #117905 are the only causes of such UB (in the code that crater covers), and the three cases of UB that we know about have all been fixed in their respective crates already.

The way this is implemented is by making use of the fact that our interpreter is already generic over the notion of provenance. For CTFE we now use the new `CtfeProvenance` type which is conceptually an `AllocId` plus a boolean `immutable` flag (but packed for a more efficient representation). This means we can mark a pointer as immutable when it is created as a shared reference. The flag will be propagated to all pointers derived from this one. We can then check the immutable flag on each write to reject writes through immutable pointers.

I just hope perf works out.
@RalfJung RalfJung marked this pull request as draft December 16, 2023 15:22
@rust-log-analyzer

This comment has been minimized.

@RalfJung
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 17, 2023
@bors
Copy link
Contributor

bors commented Dec 17, 2023

⌛ Trying commit 6854fa3 with merge 04931d4...

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 17, 2023
const interning: decide about mutability purely based on the kind of interning, not the types we see

r? `@oli-obk` this is what I meant on Zulip. For now I left the type visitor in the code; removing it and switching to a simple interning loop will mean we accept code that we currently reject, such as this
```rust
const CONST_RAW: *const Vec<i32> = &Vec::new() as *const _;
```
I see no reason for us to reject such code, but accepting it should go through t-lang FCP, so I want to do that in a follow-up PR.

This PR does change behavior in the following situations:
1. Shared references inside `static mut` are no longer put in read-only memory. This affects for instance `static mut FOO: &i32 = &0;`. We never *promised* that this would be read-only, and `static mut` is [an anti-pattern anyway](rust-lang#53639), so I think this is fine. If you want read-only memory, write this as `static INNER: i32 = 0; static mut FOO: &i32 = &INNER;`.
2. Potentially, mutable things in a `static` are now marked read-only. That would be a problem. But I am not sure if that can happen? The code mentions `static FOO: *const AtomicUsize = &AtomicUsize::new(42)`, but that is rejected for being non-`Sync`. [This variant](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=112e930ae1b3ef285812ab404ca296fa) also gets rejected, and same for [this one](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0dac8d173a2b3099b9c2854fdad7a87c). I think we should reject all cases where a `static` introduces mutable state, except for the outermost allocation itself which can have interior mutability (and which is the one allocation where we have fully reliable type information).

What I still want to do in this PR before it is ready for review it is ensure we detect situations where `&mut` or `&UnsafeCell` points to immutable allocations. That should detect if we have any instance of case (2). That check should be part of the regular type validity check though, not part of interning.
@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Dec 17, 2023

☀️ Try build successful - checks-actions
Build commit: 04931d4 (04931d4a7480f5d6051fea7226a93ba2785927ac)

@rust-timer

This comment has been minimized.

@RalfJung RalfJung removed the S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. label Dec 17, 2023
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (04931d4): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.9% [-1.0%, -0.7%] 2
Improvements ✅
(secondary)
-3.6% [-8.2%, -0.5%] 15
All ❌✅ (primary) -0.9% [-1.0%, -0.7%] 2

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.1% [2.1%, 2.1%] 1
Improvements ✅
(primary)
-0.7% [-0.7%, -0.7%] 1
Improvements ✅
(secondary)
-3.0% [-4.0%, -2.6%] 5
All ❌✅ (primary) -0.7% [-0.7%, -0.7%] 1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-6.7% [-10.0%, -4.8%] 9
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 674.824s -> 671.516s (-0.49%)
Artifact size: 312.45 MiB -> 312.41 MiB (-0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 17, 2023
@RalfJung
Copy link
Member Author

As I was hoping, getting rid of that type-driven traversal does help quite a bit with perf. :)

I'll close this PR though, it contains too much old discussion. Will start a new one.

@RalfJung RalfJung closed this Dec 17, 2023
GuillaumeGomez pushed a commit to GuillaumeGomez/rustc_codegen_gcc that referenced this pull request Feb 15, 2024
compile-time evaluation: detect writes through immutable pointers

This has two motivations:
- it unblocks rust-lang/rust#116745 (and therefore takes a big step towards `const_mut_refs` stabilization), because we can now detect if the memory that we find in `const` can be interned as "immutable"
- it would detect the UB that was uncovered in rust-lang/rust#117905, which was caused by accidental stabilization of `copy` functions in `const` that can only be called with UB

When UB is detected, we emit a future-compat warn-by-default lint. This is not a breaking change, so completely in line with [the const-UB RFC](https://rust-lang.github.io/rfcs/3016-const-ub.html), meaning we don't need t-lang FCP here. I made the lint immediately show up for dependencies since it is nearly impossible to even trigger this lint without `const_mut_refs` -- the accidentally stabilized `copy` functions are the only way this can happen, so the crates that popped up in #117905 are the only causes of such UB (in the code that crater covers), and the three cases of UB that we know about have all been fixed in their respective crates already.

The way this is implemented is by making use of the fact that our interpreter is already generic over the notion of provenance. For CTFE we now use the new `CtfeProvenance` type which is conceptually an `AllocId` plus a boolean `immutable` flag (but packed for a more efficient representation). This means we can mark a pointer as immutable when it is created as a shared reference. The flag will be propagated to all pointers derived from this one. We can then check the immutable flag on each write to reject writes through immutable pointers.

I just hope perf works out.
lnicola pushed a commit to lnicola/rust-analyzer that referenced this pull request Apr 7, 2024
compile-time evaluation: detect writes through immutable pointers

This has two motivations:
- it unblocks rust-lang/rust#116745 (and therefore takes a big step towards `const_mut_refs` stabilization), because we can now detect if the memory that we find in `const` can be interned as "immutable"
- it would detect the UB that was uncovered in rust-lang/rust#117905, which was caused by accidental stabilization of `copy` functions in `const` that can only be called with UB

When UB is detected, we emit a future-compat warn-by-default lint. This is not a breaking change, so completely in line with [the const-UB RFC](https://rust-lang.github.io/rfcs/3016-const-ub.html), meaning we don't need t-lang FCP here. I made the lint immediately show up for dependencies since it is nearly impossible to even trigger this lint without `const_mut_refs` -- the accidentally stabilized `copy` functions are the only way this can happen, so the crates that popped up in #117905 are the only causes of such UB (in the code that crater covers), and the three cases of UB that we know about have all been fixed in their respective crates already.

The way this is implemented is by making use of the fact that our interpreter is already generic over the notion of provenance. For CTFE we now use the new `CtfeProvenance` type which is conceptually an `AllocId` plus a boolean `immutable` flag (but packed for a more efficient representation). This means we can mark a pointer as immutable when it is created as a shared reference. The flag will be propagated to all pointers derived from this one. We can then check the immutable flag on each write to reject writes through immutable pointers.

I just hope perf works out.
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this pull request Apr 27, 2024
compile-time evaluation: detect writes through immutable pointers

This has two motivations:
- it unblocks rust-lang/rust#116745 (and therefore takes a big step towards `const_mut_refs` stabilization), because we can now detect if the memory that we find in `const` can be interned as "immutable"
- it would detect the UB that was uncovered in rust-lang/rust#117905, which was caused by accidental stabilization of `copy` functions in `const` that can only be called with UB

When UB is detected, we emit a future-compat warn-by-default lint. This is not a breaking change, so completely in line with [the const-UB RFC](https://rust-lang.github.io/rfcs/3016-const-ub.html), meaning we don't need t-lang FCP here. I made the lint immediately show up for dependencies since it is nearly impossible to even trigger this lint without `const_mut_refs` -- the accidentally stabilized `copy` functions are the only way this can happen, so the crates that popped up in #117905 are the only causes of such UB (in the code that crater covers), and the three cases of UB that we know about have all been fixed in their respective crates already.

The way this is implemented is by making use of the fact that our interpreter is already generic over the notion of provenance. For CTFE we now use the new `CtfeProvenance` type which is conceptually an `AllocId` plus a boolean `immutable` flag (but packed for a more efficient representation). This means we can mark a pointer as immutable when it is created as a shared reference. The flag will be propagated to all pointers derived from this one. We can then check the immutable flag on each write to reject writes through immutable pointers.

I just hope perf works out.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants