Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rustup for panic changes #1048

Closed
wants to merge 11 commits into from
Closed

Conversation

Aaron1011
Copy link
Member

This gets Miri working again, but doesn't actually implement unwinding

@oli-obk
Copy link
Contributor

oli-obk commented Nov 12, 2019

Do we need to wait for the nightly?

@RalfJung
Copy link
Member

No but the rustc PR hasn't landed yet ;)

src/machine.rs Outdated Show resolved Hide resolved
This gets Miri working again, but doesn't actually implement unwinding
@Aaron1011
Copy link
Member Author

@RalfJung: Updated

@RalfJung
Copy link
Member

@Aaron1011 now that your main PR landed (:tada: ), could you update the rust-version file to bump Miri to the new Rust? Then we should be able to land this.

@RalfJung
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Nov 13, 2019

📌 Commit 7bb3052 has been approved by RalfJung

bors added a commit that referenced this pull request Nov 13, 2019
Rustup for panic changes

This gets Miri working again, but doesn't actually implement unwinding
@bors
Copy link
Contributor

bors commented Nov 13, 2019

⌛ Testing commit 7bb3052 with merge 14cd503...

@bors
Copy link
Contributor

bors commented Nov 13, 2019

💔 Test failed - status-appveyor

@Aaron1011
Copy link
Member Author

Aaron1011 commented Nov 13, 2019

This seems to be caused by the fact that we're trying to build actual bytecode for our custom libstd in unwind mode, but aren't using the platform-specific libpanic_unwind mod.rs.

Specifically, we end up missing eh_catch_typeinfo from libpanic_unwind/seh.rs on Windows

We only need crate metadata and MIR, not the built object files, so this
allows us to skip unecessary work when building `libstd`. It also
prevents us from trying to codegen the `try` panicking intrinsic, which
assumes that `libpanic_unwind` is being compiled for the target
platform. We built `libpanic_unwind` with `cfg(miri)`, which may
invalidate these assumptions (e.g. the `eh_catch_typeinfo` lang item is
no longer defined).
@Aaron1011
Copy link
Member Author

I just discovered -Z no-codegen. It appears to have existed for a long time, but Miri has suprisingly not been taking advantage of it.

This should cause us to skip codegen entirely, thus avoiding the Windows-specific panic intrinsic codegen issue.

@Aaron1011
Copy link
Member Author

@RalfJung: This should be ready to merge.

@RalfJung
Copy link
Member

Nice catch! Does that mean you can remove the miri_start_panic implementation from codegen, make it ICE instead?

@bors r+

@bors
Copy link
Contributor

bors commented Nov 13, 2019

📌 Commit c6d3179 has been approved by RalfJung

@bors
Copy link
Contributor

bors commented Nov 13, 2019

⌛ Testing commit c6d3179 with merge 3d159a6...

bors added a commit that referenced this pull request Nov 13, 2019
Rustup for panic changes

This gets Miri working again, but doesn't actually implement unwinding
@bors
Copy link
Contributor

bors commented Nov 13, 2019

💔 Test failed - status-appveyor

@RalfJung
Copy link
Member

@bors delegate+

@Aaron1011 with this you can now issue try builds if you want to test/debug a bit more. Please don't r+!

@bors
Copy link
Contributor

bors commented Nov 13, 2019

✌️ @Aaron1011 can now approve this pull request

@Aaron1011
Copy link
Member Author

This works fine for me on Linux - there must be a difference in how linking and/or codegen happens on Windows.

This gives us crate metadata without running any codegen
@Aaron1011
Copy link
Member Author

@bors try

@bors
Copy link
Contributor

bors commented Nov 13, 2019

⌛ Trying commit da61b24 with merge c987ab9...

bors added a commit that referenced this pull request Nov 14, 2019
Rustup for panic changes

This gets Miri working again, but doesn't actually implement unwinding
@bors
Copy link
Contributor

bors commented Nov 14, 2019

💔 Test failed - status-appveyor

@Aaron1011
Copy link
Member Author

@bors try

@bors
Copy link
Contributor

bors commented Nov 14, 2019

⌛ Trying commit 4e56a43 with merge e4edfc0...

bors added a commit that referenced this pull request Nov 14, 2019
Rustup for panic changes

This gets Miri working again, but doesn't actually implement unwinding
@bors
Copy link
Contributor

bors commented Nov 14, 2019

☀️ Try build successful - checks-travis, status-appveyor
Build commit: e4edfc0 (e4edfc0201062bdf8781ee4edf07257a1143763b)

Aaron1011 added a commit to Aaron1011/xargo that referenced this pull request Nov 14, 2019
This allows configuring Xargo to run `cargo check`
instead of `cargo build`

Needed to support rust-lang/miri#1048
@Aaron1011
Copy link
Member Author

This PR has evolved significantly from what was originally a simple update to the latest rustc.

The issue

Prior to the Rustc panic changes in rust-lang/rust#60026 being merged, Miri's custom libstd was essentially identical to the normal libstd shipped with Rust. While we do pass some -Z flags, none of them affect the actual behavior of libstd. Importantly, there were no #[cfg(miri)]s in libstd or any of its dependencies (other than in tests).

With rust-lang/rust#60026 merged, there is a significant difference between a #[cfg(miri)] libstd and a normal libstd. We now have a #[cfg(miri)] in libpanic_unwind, which causes us to call a special intrinsic instead of using the normal platform unwinding code.

On Linux, this works fine: we are able to run all of our tests, and cargo miri works fine.

On Windows (and maybe any MSVC target), it's a different story. During codegen, we attempt to lookup the eh_catch_typeinfo. This lang item is somewhat unusual in that is it only defined in the MSVC-specific part of libpanic_unwind. When libpanic_unwind is built with #[cfg(miri)], this code is no longer compiled, leading to a missing definition of the eh_catch_typeinfo lang item during codegen.

The root cause

Up until now, Miri has been using Xargo to build its custom libstd. In addition to producing metadata files (which is all Miri actually needs, since they contain the built MIR), Xargo produces compiled libraries (rlib, staticlib, dylib, etc.) for the target platform.

At first, I thought that these compiled libraries were never being used. However, it turns out that cargo miri currently forces our custom libstd (via --sysroot) for both normal dependenices and for build dependencies. This means that any build.rs files and build dependencies (e.g. cc) are currently linked against Miri's custom libstd. When Cargo runs the compiled buildscript, it is running a binary that was compiled from the Miri sysroot, not the system sysroot.

This doesn't appear to have cauesd any problems up until now (though it's certainly pretty weird). However, this approach is no longer possible with the most recent libstd. Since libpanic_unwind has special cfg(miri) logic, any build scripts that attempt to panic will simply explode (currently, they will abort without printing any message). This will break any build dependencies that use catch_unwind, since all panics will turn into aborts.

The solution

The quickest solution for this issue, as @RalfJung mentioned, would be make libpanic_unwind define the eh_catch_typeinfo lang item when built with #[cfg(miri)]. This would allow Miri to build and run on Windows again.

However, this is not an idea long-term solution. First of all, we are still left with the problem of build dependencies being unable to panic. Most users will probably not run into this - but when they do, it will be extremely frustrating, since a build script will appear to abruptly segfault without any error message.

Second, nothing prevents the compiler from adding other target-specific lang items, or making other assumptions about exports from libpanic_unwind. We're setting ourselves up for an endless game of whack-a-mole, where we're forced to keep piling hacks on top of hacks so that we can pretend to codegen a working libpanic_unwind (as opposed to just generating MIR for it).

The long-term solution is to switch to using cargo check. This completely disables codegen in a way that's built into the compiler. We will no longer need to worry about any assumptions made by the codegen backend, since that code will simply never be executed.

However, this turned out to be more complicated than I anticipated. There are several moving parts here:

  1. Making xargo run cargo check

Currently, xargo unconditionally uses cargo build when compiling libstd. I've opened a PR to add a cargo_mode = "check" option to Xargo.toml: japaric/xargo#267

  1. Making cargo-miri run cargo check.

This is complicated by the fact that we currently run cargo rustc in order to pass arguments to the last (and only to the last) invocation of rustc. Sadly, there appears to be no way to combine cargo check and cargo rustc - you can either use cargo rustc and have no way to skip codegen, or use cargo check and have no way to pass final-crate-specific rustc arguments.

To work around this issue, I've added a hack to serialize arguments to an environment variable. When we compile the final crate, we deserialize these arguments from the environment variable, and manually pass them to the rustc invocation.

  1. Making build dependencies use the proper libstd.

This is by far the trickiest part of this entire PR. When Cargo invokes our cargo-miri wrapper, we have three cases to worry about:

  1. Build dependencies (including build scripts): We pass through all arguments completely unmodified to rustc. Miri does not interact in any way with build scripts, so we want to treat them as if we were doing a normal run of cargo.

  2. Normal dependencies: We add our custom sysroot, but still invoke rustc. Since we are in cargo check mode, this will cause rustc to produce metadata for the normal dependenices of our runtime crate, using our custom libstd.

  3. The target itself (e.g. a test or a binary): We invoke miri, and actually begin execution.

Handles these three cases ensures that build dependencies are built using the normal platform libstd (as if cargo-miri did not exist), while normal dependencies and our target crate are built against our custom libstd).

Unfortunately, distinguishing between these three cases is a huge pain. I'm currently relying on the following tricks:

  1. Use CARGO_MANIFEST_DIR to detect when our target crate is being built.

The CARGO_MANIFEST_DIR is set by Cargo to the directory containing the manifest of the package currently being built. During the initial, non-wrapper invocation of cargo miri (e.g. when the user types cargo miri run on the command line), we determine the manifest directly for the crate they are building. We then compare this to CARGO_MANIFEST_DIR when we are being invoked by Cargo as a wrapper.

However, this fails to distinguish between a build.rs and the actual compilation, since both use the same manifest directory. This brings us to the second trick.

  1. Inspect the --emit= flag passed to rustc by Cargo.

This trick relies on the fact that we are using cargo check to build the crate. When Cargo compiles a build dependency, the --emit= flag will always contain link- this is because we always need to produce a runnable binary for build scripts.

When we are building normal dependencies, the --emit= flag will not contain link. This is how cargo check tells rustc not to perform codegen - the --emit= flag will be --emit=dep-info,metadata.

By checking for the presence of link, we can determine whether or not Cargo is trying to compile a build dependency. Note that the same crate could theoretically be build as both - e.g. you could add [dependencies] cc=x.y.z and [build-dependencies] cc=x.y.z to your Cargo.toml.

Adding more information to Cargo

While I believe assumptions behind the above Cargo hacks are fairly sound, this is not really a viable long-term solution. For example, Cargo could choose to stop passing the --emit flag if rustc would have it default to what Cargo already wanted.

Ideally, Cargo would set an environment variable to let us know which of the three cases we are in - target crate, build dependency, or normal dependency. I'm currently working on a PR that does just that.

Conclusion

I think our best path forward is to:

  1. Land cargo check support in xargo
  2. Merge some form of this PR, possible in multiple pieces. Depending on how long review takes, it might make sense to merge the eh_catch_typeinfo hack into rustc so that nightly users can have a (somewhat) working Miri again
  3. Work with the Cargo team to expose the information we need in a cleaner way. I think this information could be of use to other Cargo wrappers. In particular, build dependencies have very different semantics from regular dependencies (e.g. they will be built for a different target during cross-compilation), but wrappers have no clean way of determining which they are building.

// that `cargo check` has exactly the behavior that we want:
// it emits crate metadata (including MIR) without running any
// codegen.
command.arg("check").arg("-q");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check vs build here just affects the empty "fake crate" that we have for xargo, right? This comment here should be attached to the cargo_mode line above, not the command invocation down here.

prefixed_args += &arg.len().to_string();
prefixed_args.push(';');
prefixed_args += &arg;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When inventing a serialization scheme, at the very least please put this into separate functions.

Though maybe at this point we should just use serde and serialize this as JSON, or so?

@RalfJung
Copy link
Member

RalfJung commented Nov 15, 2019

@Aaron1011 thanks for the detailed write-up! This is a lot. I am somewhat concerned about the amount of hacks that are needed, given in particular that this is to avoid other hacks... one core problem seems to be that cargo is not very "instrumentable". I have heard plans before to improve that; this might be useful input for that discussion. Cc @ehuss.

It would be good to be able to work on this "in due time", without the pressure of Miri master being broken and, worse, without blocking any other update of Miri to sync it with rustc changes. So I think short-term I'd prefer either a less invasive hack, such as exposing the required lang items even with cfg(miri), or reverting the libpanic_unwind changes (blocking "just" Miri unwinding on this seems better than blocking all of Miri on it).

For a "less hacky"/"more proper" solution, I think we should have a "project issue" here in the Miri repo, with the project being something like "avoid codegen for Miri-interpreted crates". This has plenty of benefits as you said, but it is also non-trivial. Most of your excellent summary should be put into that project description. I have some thoughts on the details of that approach, but that project issue seems like a better place for discussion then, so I will hold off on these for now.

Does that seem like a reasonable plan? I will prepare a rustc PR along the lines of what I suggested earlier. EDIT: rust-lang/rust#66441

Centril added a commit to Centril/rust that referenced this pull request Nov 15, 2019
libpanic_unwind for Miri: make sure we have the SEH lang items when needed

r? @oli-obk  @alexcrichton This is required to fix the Miri toolstate. Turns out rustc complains when doing codegen for MSVC and these lang items do not exist. For now `cfg(miri)` needs to still be able to codegen (we [plan to change that](rust-lang/miri#1048 (comment)) but that's a larger project requiring improvements to xargo and maybe also cargo; that should not block fixing the toolstate). Yes, this is a hack, but it is inside `cfg(miri)` so I hope this is okay.

Cc @Aaron1011
Centril added a commit to Centril/rust that referenced this pull request Nov 15, 2019
libpanic_unwind for Miri: make sure we have the SEH lang items when needed

r? @oli-obk  @alexcrichton This is required to fix the Miri toolstate. Turns out rustc complains when doing codegen for MSVC and these lang items do not exist. For now `cfg(miri)` needs to still be able to codegen (we [plan to change that](rust-lang/miri#1048 (comment)) but that's a larger project requiring improvements to xargo and maybe also cargo; that should not block fixing the toolstate). Yes, this is a hack, but it is inside `cfg(miri)` so I hope this is okay.

Cc @Aaron1011
@RalfJung
Copy link
Member

Getting Miri itself to work again was done by #1055, and I created #1057 for discussing our move to cargo check. Closing this PR.

@RalfJung RalfJung closed this Nov 17, 2019
Aaron1011 added a commit to Aaron1011/xargo that referenced this pull request Nov 28, 2019
This allows configuring Xargo to run `cargo check`
instead of `cargo build`

Needed to support rust-lang/miri#1048
Aaron1011 added a commit to Aaron1011/xargo that referenced this pull request Dec 23, 2019
This allows configuring Xargo to run `cargo check`
instead of `cargo build`

Needed to support rust-lang/miri#1048
bors bot added a commit to japaric/xargo that referenced this pull request Dec 29, 2019
267: Add `xargo-check` command r=RalfJung a=Aaron1011

This allows configuring Xargo to run `cargo check`
instead of `cargo build`

Needed to support rust-lang/miri#1048

Co-authored-by: Aaron Hill <aa1ronham@gmail.com>
bors bot added a commit to japaric/xargo that referenced this pull request Dec 30, 2019
267: Add `xargo-check` command r=RalfJung a=Aaron1011

This allows configuring Xargo to run `cargo check`
instead of `cargo build`

Needed to support rust-lang/miri#1048

Co-authored-by: Aaron Hill <aa1ronham@gmail.com>
bors bot added a commit to japaric/xargo that referenced this pull request Dec 31, 2019
267: Add `xargo-check` command r=RalfJung a=Aaron1011

This allows configuring Xargo to run `cargo check`
instead of `cargo build`

Needed to support rust-lang/miri#1048

Co-authored-by: Aaron Hill <aa1ronham@gmail.com>
Aaron1011 added a commit to Aaron1011/miri that referenced this pull request Jan 1, 2020
Fixes rust-lang#1057

I'm using my original approach from PR rust-lang#1048. Ideally, we would
distinguish between build-deps/dependencies/'final crate' via a
different approach (e.g. the target directory). However, I
haven't been able to get that to work just yet.

However, everything should be working with the approach I'm using. At a
minimum, we can use this PR to verify that everything works as expected
when we don't actually produce native build outputs.
Aaron1011 added a commit to Aaron1011/miri that referenced this pull request Jan 1, 2020
Fixes rust-lang#1057

I'm using my original approach from PR rust-lang#1048. Ideally, we would
distinguish between build-deps/dependencies/'final crate' via a
different approach (e.g. the target directory). However, I
haven't been able to get that to work just yet.

However, everything should be working with the approach I'm using. At a
minimum, we can use this PR to verify that everything works as expected
when we don't actually produce native build outputs.
Aaron1011 added a commit to Aaron1011/miri that referenced this pull request Jan 9, 2020
Fixes rust-lang#1057

I'm using my original approach from PR rust-lang#1048. Ideally, we would
distinguish between build-deps/dependencies/'final crate' via a
different approach (e.g. the target directory). However, I
haven't been able to get that to work just yet.

However, everything should be working with the approach I'm using. At a
minimum, we can use this PR to verify that everything works as expected
when we don't actually produce native build outputs.
bors added a commit that referenced this pull request Feb 24, 2020
Use 'cargo check' to build the sysroot and target crate

Fixes #1057

I'm using my original approach from PR #1048. Ideally, we would
distinguish between build-deps/dependencies/'final crate' via a
different approach (e.g. the target directory). However, I
haven't been able to get that to work just yet.

However, everything should be working with the approach I'm using. At a
minimum, we can use this PR to verify that everything works as expected
when we don't actually produce native build outputs.
bors added a commit that referenced this pull request Feb 24, 2020
Use 'cargo check' to build the sysroot and target crate

Fixes #1057

I'm using my original approach from PR #1048. Ideally, we would
distinguish between build-deps/dependencies/'final crate' via a
different approach (e.g. the target directory). However, I
haven't been able to get that to work just yet.

However, everything should be working with the approach I'm using. At a
minimum, we can use this PR to verify that everything works as expected
when we don't actually produce native build outputs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants