Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

interpret: better control over whether we read data with provenance #97684

Merged
merged 2 commits into from
Jun 6, 2022

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented Jun 3, 2022

The resolution in rust-lang/unsafe-code-guidelines#286 seems to be that when we load data at integer type, we implicitly strip provenance. So let's implement that in Miri at least for scalar loads. This makes use of the fact that Scalar layouts distinguish pointer-sized integers and pointers -- so I was expecting some wild bugs where layouts set this incorrectly, but so far that does not seem to happen.

This does not entirely implement the solution to rust-lang/unsafe-code-guidelines#286; we still do the wrong thing for integers in larger types: we will copy_op them and then do validation, and validation will complain about the provenance. To fix that we need mutating validation; validation needs to strip the provenance rather than complaining about it. This is a larger undertaking (but will also help with rust-lang/miri#845 since we can reset padding to Uninit).

The reason this is useful is that we can now implement addr as a transmute from a pointer to an integer, and actually get the desired behavior of stripping provenance without exposing it!

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jun 3, 2022
@rust-highfive
Copy link
Collaborator

Some changes occured to the CTFE / Miri engine

cc @rust-lang/miri

Some changes occured to the CTFE / Miri engine

cc @rust-lang/miri

@rust-highfive
Copy link
Collaborator

r? @nagisa

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 3, 2022
@RalfJung
Copy link
Member Author

RalfJung commented Jun 3, 2022

r? @oli-obk

@rust-highfive rust-highfive assigned oli-obk and unassigned nagisa Jun 3, 2022
@RalfJung RalfJung force-pushed the better-provenance-control branch from 9a3458f to 0900569 Compare June 3, 2022 12:03
@@ -25,10 +24,12 @@ const BAD_ENUM: Enum = unsafe { mem::transmute(1usize) };
//~^ ERROR is undefined behavior

const BAD_ENUM_PTR: Enum = unsafe { mem::transmute(&1) };
//~^ ERROR is undefined behavior
//~^ ERROR any use of this value will cause an error
//~| WARN this was previously accepted by the compiler but is being phased out
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now error earlier here (already during evaluation, not just during validation), which leads to a different error message.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 3, 2022

The reason this is useful is that we can now implement addr as a transmute from a pointer to an integer, and actually get the desired behavior of stripping provenance without exposing it!

I have now tested that this is the case. :) But I will leave actually changing addr to a future PR (since that is also a libs thing and should probably involve other people).

@rust-log-analyzer

This comment has been minimized.

@RalfJung RalfJung force-pushed the better-provenance-control branch from 0900569 to 7d5c833 Compare June 3, 2022 13:03
Comment on lines 292 to 295
let scalar = alloc.read_scalar(
alloc_range(Size::ZERO, size),
s.is_ptr() || (number_may_have_provenance && size == self.pointer_size()),
)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be an InterpCx method for this (it's repeated 3x here after all)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what part exactly?
This should just be s.is_ptr(); all the rest is just to support the Miri flag that allows ptr-int transmutation...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, but just from a code perspective, these 3 duplications could be deduplicated with a method. The miri flag support won't go away after all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I am actually considering removing that Miri flag, given how complicated the provenance story is anyway and how we don't know if there even is demand for such a flag.

Making it into a method has the problem that it'd take Size and Scalar and require both to match; it's not a great API. I can try making it a local closure so at least nobody will try to use it anywhere else.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I made it a closure. Does that work for you?

This should not be used outside of read_immediate_from_mplace_raw so a method would send a wrong signal IMO.

compiler/rustc_middle/src/mir/interpret/allocation.rs Outdated Show resolved Hide resolved
&self,
range: AllocRange,
read_pointer: bool,
) -> InterpResult<'tcx, ScalarMaybeUninit<Tag>> {
let range = self.range.subrange(range);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General note on AllocRef: what bugged me a few times before is that all these methods take an AllocRange. Maybe we should move to a scheme where we add a slice method to AllocRef and thus require chaining of the sort of alloc.slice(range).read_pointer() instead of alloc.read_pointer(range).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mostly added the AllocRange to avoid having two Size parameters whose meaning is unclear. But yeah I guess in practice this did not work out as well as I had hoped...

That's a topic for a different PR though, I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

@RalfJung RalfJung force-pushed the better-provenance-control branch 2 times, most recently from fa59ef1 to ecf34dd Compare June 3, 2022 14:26
@oli-obk
Copy link
Contributor

oli-obk commented Jun 3, 2022

@bors r+

@bors
Copy link
Contributor

bors commented Jun 3, 2022

📌 Commit ecf34dd720223c5f6e6c682313268a76dfb4d9b6 has been approved by oli-obk

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 3, 2022
}
= note: `#[deny(const_err)]` on by default
= warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
= note: for more information, see issue #71800 <https://github.com/rust-lang/rust/issues/71800>
Copy link
Member

@bjorn3 bjorn3 Jun 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes a hard error into a future compat lint that can be disabled.

Copy link
Member Author

@RalfJung RalfJung Jun 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is what I noted above. Errors during const evaluation (as opposed to errors occurring during validation, which is after evaluation finished) are still future-compat lints (#71800).

@bors
Copy link
Contributor

bors commented Jun 5, 2022

⌛ Testing commit ecf34dd720223c5f6e6c682313268a76dfb4d9b6 with merge 6ac75ce48f727d41c0560e056a8c27e1d080cde7...

@bors
Copy link
Contributor

bors commented Jun 5, 2022

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jun 5, 2022
@rust-log-analyzer

This comment has been minimized.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 5, 2022

@bors r=oli-obk

@bors
Copy link
Contributor

bors commented Jun 5, 2022

📌 Commit d208f80 has been approved by oli-obk

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 5, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 6, 2022
implement ptr.addr() via transmute

As per the discussion in rust-lang/unsafe-code-guidelines#286, the semantics for ptr-to-int transmutes that we are going with for now is to make them strip provenance without exposing it. That's exactly what `ptr.addr()` does! So we can implement `ptr.addr()` via `transmute`. This also means that once rust-lang#97684 lands, Miri can distinguish `ptr.addr()` from `ptr.expose_addr()`, and the following code will correctly be called out as having UB (if permissive provenance mode is enabled, which will become the default once the [implementation is complete](rust-lang/miri#2133)):

```rust
fn main() {
    let x: i32 = 3;
    let x_ptr = &x as *const i32;

    let x_usize: usize = x_ptr.addr();
    // Cast back an address that did *not* get exposed.
    let ptr = std::ptr::from_exposed_addr::<i32>(x_usize);
    assert_eq!(unsafe { *ptr }, 3); //~ ERROR Undefined Behavior: dereferencing pointer failed
}
```

This completes the Miri implementation of the new distinctions introduced by strict provenance. :)

Cc `@Gankra` -- for now I left in your `FIXME(strict_provenance_magic)` saying these should be intrinsics, but I do not necessarily agree that they should be. Or if we have an intrinsic, I think it should behave exactly like the `transmute` does, which makes one wonder why the intrinsic should be needed.
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request Jun 6, 2022
… r=oli-obk

interpret: better control over whether we read data with provenance

The resolution in rust-lang/unsafe-code-guidelines#286 seems to be that when we load data at integer type, we implicitly strip provenance. So let's implement that in Miri at least for scalar loads. This makes use of the fact that `Scalar` layouts distinguish pointer-sized integers and pointers -- so I was expecting some wild bugs where layouts set this incorrectly, but so far that does not seem to happen.

This does not entirely implement the solution to rust-lang/unsafe-code-guidelines#286; we still do the wrong thing for integers in larger types: we will `copy_op` them and then do validation, and validation will complain about the provenance. To fix that we need mutating validation; validation needs to strip the provenance rather than complaining about it. This is a larger undertaking (but will also help resolve rust-lang/miri#845 since we can reset padding to `Uninit`).

The reason this is useful is that we can now implement `addr` as a `transmute` from a pointer to an integer, and actually get the desired behavior of stripping provenance without exposing it!
@RalfJung
Copy link
Member Author

RalfJung commented Jun 6, 2022

@bors p=1
Needed to get Miri back into shape

@Dylan-DPC
Copy link
Member

@bors p=6

@bors
Copy link
Contributor

bors commented Jun 6, 2022

⌛ Testing commit d208f80 with merge 9d20fd1...

@bors
Copy link
Contributor

bors commented Jun 6, 2022

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing 9d20fd1 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 6, 2022
@bors bors merged commit 9d20fd1 into rust-lang:master Jun 6, 2022
@rustbot rustbot added this to the 1.63.0 milestone Jun 6, 2022
@RalfJung RalfJung deleted the better-provenance-control branch June 6, 2022 16:10
@rust-highfive
Copy link
Collaborator

📣 Toolstate changed by #97684!

Tested on commit 9d20fd1.
Direct link to PR: #97684

💔 miri on windows: test-fail → build-fail (cc @eddyb @oli-obk @RalfJung).
💔 miri on linux: test-fail → build-fail (cc @eddyb @oli-obk @RalfJung).

rust-highfive added a commit to rust-lang-nursery/rust-toolstate that referenced this pull request Jun 6, 2022
Tested on commit rust-lang/rust@9d20fd1.
Direct link to PR: <rust-lang/rust#97684>

💔 miri on windows: test-fail → build-fail (cc @eddyb @oli-obk @RalfJung).
💔 miri on linux: test-fail → build-fail (cc @eddyb @oli-obk @RalfJung).
bors added a commit to rust-lang/miri that referenced this pull request Jun 6, 2022
adjust for better provenance control

This is the Miri side of rust-lang/rust#97684.
bors added a commit to rust-lang/miri that referenced this pull request Jun 6, 2022
adjust for better provenance control

This is the Miri side of rust-lang/rust#97684.
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (9d20fd1): comparison url.

Instruction count

  • Primary benchmarks: 🎉 relevant improvements found
  • Secondary benchmarks: 🎉 relevant improvements found
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
N/A N/A 0
Improvements 🎉
(primary)
-0.7% -1.9% 8
Improvements 🎉
(secondary)
-5.5% -10.5% 12
All 😿🎉 (primary) -0.7% -1.9% 8

Max RSS (memory usage)

Results
  • Primary benchmarks: 🎉 relevant improvement found
  • Secondary benchmarks: 🎉 relevant improvement found
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
N/A N/A 0
Improvements 🎉
(primary)
-2.4% -2.4% 1
Improvements 🎉
(secondary)
-3.8% -3.8% 1
All 😿🎉 (primary) -2.4% -2.4% 1

Cycles

Results
  • Primary benchmarks: mixed results
  • Secondary benchmarks: 🎉 relevant improvements found
mean1 max count2
Regressions 😿
(primary)
2.4% 2.4% 1
Regressions 😿
(secondary)
N/A N/A 0
Improvements 🎉
(primary)
-2.9% -2.9% 2
Improvements 🎉
(secondary)
-9.8% -14.7% 9
All 😿🎉 (primary) -1.1% -2.9% 3

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

Footnotes

  1. the arithmetic mean of the percent change 2 3

  2. number of relevant changes 2 3

@RalfJung
Copy link
Member Author

RalfJung commented Jun 6, 2022

Well that is unexpected, but I won't complain.^^

RalfJung added a commit to RalfJung/rust that referenced this pull request Jul 30, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 4, 2022
allow numbers with provenance within CTFE execution

This effectively reverts rust-lang#97684 for CTFE.

Undoes the diagnostic changes that are tracked in rust-lang#99923, only for beta.
(On master this patch wouldn't apply any more, `enforce_number_no_provenance` is gone with rust-lang#99644 since the interpreter engine is not supposed to ever have provenance on integers.)

The test changes are an exact un-do of rust-lang#97684. However there is still some risk here since this exact code is not what has been battle-tested.

r? `@Mark-Simulacrum`
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 9, 2022
…=pnkfelix

beta-backport of provenance-related CTFE changes

This is all part of dealing with rust-lang#99923.

The first commit backports the effects of rust-lang#101101. `@pnkfelix` asked for this and it turned out to be easy, so I think this is uncontroversial.

The second commit effectively repeats rust-lang#99965, which un-does the effects of rust-lang#97684 and therefore means rust-lang#99923 does not apply to the beta branch. I honestly don't think we should do this; the sentiment in rust-lang#99923 was that we should go ahead with the change but improve diagnostics. But `@pnkfelix` seemed to request such a change so I figured I would offer the option.

I'll be on vacation soon, so if you all decide to take the first commit only, then someone please just force-push to this branch and remove the 2nd commit.
workingjubilee pushed a commit to tcdi/postgrestd that referenced this pull request Sep 15, 2022
implement ptr.addr() via transmute

As per the discussion in rust-lang/unsafe-code-guidelines#286, the semantics for ptr-to-int transmutes that we are going with for now is to make them strip provenance without exposing it. That's exactly what `ptr.addr()` does! So we can implement `ptr.addr()` via `transmute`. This also means that once rust-lang/rust#97684 lands, Miri can distinguish `ptr.addr()` from `ptr.expose_addr()`, and the following code will correctly be called out as having UB (if permissive provenance mode is enabled, which will become the default once the [implementation is complete](rust-lang/miri#2133)):

```rust
fn main() {
    let x: i32 = 3;
    let x_ptr = &x as *const i32;

    let x_usize: usize = x_ptr.addr();
    // Cast back an address that did *not* get exposed.
    let ptr = std::ptr::from_exposed_addr::<i32>(x_usize);
    assert_eq!(unsafe { *ptr }, 3); //~ ERROR Undefined Behavior: dereferencing pointer failed
}
```

This completes the Miri implementation of the new distinctions introduced by strict provenance. :)

Cc `@Gankra` -- for now I left in your `FIXME(strict_provenance_magic)` saying these should be intrinsics, but I do not necessarily agree that they should be. Or if we have an intrinsic, I think it should behave exactly like the `transmute` does, which makes one wonder why the intrinsic should be needed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants