Work with MIR-libstd #171

RalfJung · 2017-05-24T22:41:54Z

This makes println!("String literal") mostly work when miri has a libstd with full MIR at its hands. It still complains about things not being deallocated when miri terminates; that may be related to thread-local dtors not running.

println!("{}", foo) still doesn't work because that code does some crazy casts which are currently rejected by miri.

Also I figured out how to use xargo to build a MIR-libstd, so I documented that. Now there's "just" some bugs to be fixed that currently break xargo's libstd support...

oli-obk · 2017-05-24T22:49:11Z

Interesting... How is travis still passing?

RalfJung · 2017-05-24T22:49:46Z

I assume travis doesn't execute the xargo stuff, nor does it even hit the other code paths I changed?

oli-obk · 2017-05-24T22:51:37Z

README.md

+
+Notice that you will have to re-run the last step of the preparations above when
+your toolchain changes (e.g., when you update the nightly).  Also, xargo doesn't
+currently work with nightlies newer than 2017-04-23.


How did you get this working then? miri won't compile on such an old nightly.

I used the old miri from back then.
I'm currently trying to fix the creation of the rust-src distribution component such that xargo works again.

oli-obk · 2017-05-24T22:58:54Z

I assume travis doesn't execute the xargo stuff, nor does it even hit the other code paths I changed?

Oh right. You didn't actually remove the "dead" parts. That's cool that it'll work on both systems for now.

@eddby I see no reason not to merge this, even if it's not very convenient to use right now. It'll continue working as intended, but make future PRs easier since there won't be any merge issues with this PR. Or do you see this differently?

RalfJung · 2017-05-24T23:04:31Z

I assume you meant @eddyb ;)

RalfJung · 2017-05-25T02:02:42Z

I submitted rust-lang/rust#42214 to rustc to fix using xargo against the latest nightlies.

eddyb · 2017-05-25T07:34:03Z

src/terminator/mod.rs

+        // Intercept some methods (even if we can find MIR for them)
+        if let ty::InstanceDef::Item(def_id) = instance.def {
+            match &self.tcx.item_path_str(def_id)[..] {
+                "std::sys::imp::fast_thread_local::register_dtor" => {


Why do it through the path instead of intercepting the actual FFI?

There is no FFI here. The trouble is that std::sys::imp::fast_thread_local::register_dtor accesses a global static variable with "extern weak" linkage. This is passed to ConstantExtractor::global_item, where it fails to find a MIR for this global. I tried to make it so that miri pretends this symbol is always NULL, but could not figure out how to make that work. Isn't global_item for looking up global variables? Why does that end up pushing a stack frame?

Because it evaluates the static the first time it is accessed. There is no miri support for weak linkage, in fact I'd be surprised if many people knew it turns the static into a pointer behind the scenes.

Because it evaluates the static the first time it is accessed.

Ah, I see, so the self.ecx.globals stores the already initialized globals. global_item just has the side-effect of making sure the global is in that table?

There is no miri support for weak linkage

There's not much miri can do, is there? Just like with C abi calls, I guess.

it turns the static into a pointer behind the scenes.

"it" = rustc or miri?

Oh I was misremembering, this is easier than I thought (it also proves that the semantics are tricky)! Turns out the static has to contain a pointer which can be observed as null (unless overridden). So you were on the right track, just need to bypass static initializer evaluation.

RalfJung · 2017-05-26T00:55:33Z

All right, I added thread-local storage to the memory subsystem and hooked the pthread calls into it. This is enough to make println!("String") work.

Just like in my original PR, when the program is done, miri complains about memory leaks.
I tried also handling the destructors that are attached to the thread-local allocations. However, I didn't even manage to load the function pointer passed to pthread_key_create... I tried

                let dtor_ty = self.operand_ty(&arg_operands[1]);
                let dtor = self.value_to_primval(args[1], dtor_ty)?;
                trace!("TLS dtor: {:?}", dtor);

which fails with "primitive read failed for type: std::option::Option<unsafe extern "C" fn(*mut libc::c_void)>".

RalfJung · 2017-05-26T05:39:43Z

I implemented support for destructors. Now things fail with

DEBUG:miri::memory: reading fn ptr: 22
TRACE:miri::terminator:  eval_fn_call: Instance {
    def: Item(
        DefId { krate: CrateNum(1), node: DefIndex(4935) => std/c49d12e8e6f2573b3232a9791cb110b3::sys[0]::imp[0]::fast_thread_local[0]::destroy_value[0] }
    ),
    substs: Slice(
        [
            std::cell::RefCell<std::option::Option<std::boxed::Box<std::io::Write + std::marker::Send>>>
        ]
    )
}
TRACE:miri::eval_context:  _26: Ptr(Pointer { alloc_id: AllocId(21), offset: 0 })
TRACE:miri::memory:  Alloc 21:   00 00 00 00 00 00 00 00 __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ __ 01 00 __ __ __ __ __ __ (40 bytes) (static mut)
error: can't call C ABI function: destroy_value
   |
note: inside call to std::sys::imp::fast_thread_local::register_dtor_fallback::run_dtors

Also, there are some FIXME in the latest commit that should be fixed before this gets merged.

eddyb · 2017-05-26T05:50:59Z

I'm pretty sure miri gives up on calling C ABI functions defined in Rust right now - it should be easy to enable as long as the call ABI and the definition ABI are the same (compatibility across ABIs would require an interpretation of those ABIs).

oli-obk · 2017-05-26T09:11:20Z

src/memory.rs

+
+    pub fn is_null_ptr(&self) -> bool {
+        // FIXME: Is this the right way?
+        return *self == Pointer::from_int(0)


Yes, that's correct. It's basically what happens when miri interprets the Mir of is_null

I'd do a match so that it works with Bytes(0) as well.

This is a Pointer, not a Primval ;)

So, no match then? All right, I'll remove this FIXME.

oli-obk · 2017-05-26T09:14:02Z

src/step.rs

+                trace!("Running TLS dtor {:?} on {:?}", instance, ptr);
+                // TODO: Potientiually, this has to support all the other possible instances? See eval_fn_call in terminator/mod.rs
+                let mir = self.load_mir(instance.def)?;
+                // FIXME: Are these the right dummy values?


We could let the span be the span of the global, other than that it's fine

Yeah there is no need for any values to be dummy.

Well I kind of considered the return pointer also a dummy value (as the function should not return anything), and I have no idea what the StackPopCleanup thing does, so I referred to them as dummys as well.

Are you saying that I should get the span from the instance, and then nothing is actually a dummy value? Not sure how I would get that span though.

mir.span should suffice for now

Ah, thanks :) I've been digging in rustc to find a way to go from a DefId to a Span, and found nothing. This works :)

def_id -> span is tcx.def_span(def_id)

Ah, good to know, thanks :)

oli-obk · 2017-05-26T09:28:06Z

src/terminator/mod.rs

+
+                // Extract the function type out of the signature (that seems easier than constructing it ourselves...)
+                // FIXME: Or should we instead construct the type we expect it to have?
+                let dtor_fn_ty = match self.operand_ty(&arg_operands[1]) {


nit: match on .sty to reduce boilerplate

Fixed.

What about the FIXME? Right now I am extracting the type from the fn that was called. That's slightly odd though: The type is Option<fn ...>, but if I try to load from memory at that type, it fails. I have to load at the fn type. I suppose this is related to the representation optimization on Option<Pointer> -- I thought that would be done "below" MIR level, but it seems like that's not the case and MIR can already observe this optimization?

Anyway, so right now I am extracting the fn out of the substitution attached to the Option. That doesn't seem very nice, but mk_fn_ptr looks rather ugly to call ('d have to somehow create a PolyFnSig).

I think this is fine, we do similar hackery in other locations. This is a c-function, so we don't have any "compile-time" type info available. You can read an Option<Pointer>, but it would require quite some boilerplate.

oli-obk · 2017-05-26T09:28:30Z

src/terminator/mod.rs

+
+                // Figure out how large a pthread TLS key actually is. This is libc::pthread_key_t.
+                let key_size = match self.operand_ty(&arg_operands[0]) {
+                    &TyS { sty: TypeVariants::TyRawPtr(TypeAndMut { ty, .. }), .. } => {


RalfJung · 2017-05-26T17:19:07Z

As a matter of general process, do you want me to just keep adding new commits to this PR (like it is handled in the rustc repo), or should I squash commits like the one I am writing now (applying your feedback) into previous commits to have a cleaner history?

oli-obk · 2017-05-26T17:20:20Z

As a matter of general process, do you want me to just keep adding new commits to this PR

Yea, that's great. No need to squash

RalfJung · 2017-05-26T17:25:32Z

All right, fixed the things that came up. Next I will look at what goes wrong with this destroy_value.

RalfJung · 2017-05-26T19:42:01Z

Pushed things to make the call do destroy_value work. Unfortunately, there are still memory leaks left. It may just be that this check is a little too restrictive for some of the more "interesting" libstd code -- maybe these things also leak in real executions, but that's fine because they are needed until the process terminates.

RalfJung · 2017-05-27T00:43:01Z

I changed the initialization process so that if there is a "start" lang item with MIR, that one is executed instead of directly running main. As a consequence of this, the at_exit handlers are actually run, which fixes all memory leaks in a simple "Hello World" :-)

There are some open questions:

How should tests be started? For now, I did not touch this path. This will probably mean that tests leak when run with MIR-libstd.
The "start" lang item has a return value. Currently, I am allocating a memory location to pass as return_lvalue to this function. @eddyb suggested I could allocate a fake stack frame, but I could not get that to work. Also, the return location is currently allocated even if we don't run the "start" lang item, to keep the code somewhat simpler. This is why I had to change the OOM test. Let me know if you want me to change this.
I left some FIXMEs in the code which are mostly questions that should be answered before this is merged.

I would never have gotten so far without the patient support of @eddyb who kept answering my questions about weird internals of miri and/or the compiler -- thanks a lot :)

RalfJung · 2017-05-27T03:44:36Z

With my latest commits, the entire test suite now passes with MIR-libstd :-) (of course, it also still works with the default, non-MIR-libstd)
Once the rust-src component in rustup is fixed, we could probably add this to travis CI if you want.

I will now stop adding new things to this PR, and instead await your review and your feedback regarding my open questions.

bjorn3

Noticed some more spell errors.

bjorn3 · 2017-05-27T08:38:51Z

src/terminator/mod.rs

+                let f_instance = self.memory.get_fn(f.alloc_id)?;
+                self.write_primval(dest, PrimVal::Bytes(0), dest_ty)?;
+
+                // Now we make a functon call.  TODO: Consider making this re-usable?  EvalContext::step does sth. similar for the TLS dtors,


bjorn3 · 2017-05-27T08:39:11Z

src/terminator/mod.rs

+                self.write_primval(dest, PrimVal::Bytes(0), dest_ty)?;
+
+                // Now we make a functon call.  TODO: Consider making this re-usable?  EvalContext::step does sth. similar for the TLS dtors,
+                // and of coruse eval_main.


eddyb · 2017-05-27T08:42:12Z

src/bin/miri.rs

@@ -95,7 +95,9 @@ fn after_analysis<'a, 'tcx>(state: &mut CompileState<'a, 'tcx>) {
        state.hir_crate.unwrap().visit_all_item_likes(&mut Visitor(limits, tcx, state));
    } else if let Some((entry_node_id, _)) = *state.session.entry_fn.borrow() {
        let entry_def_id = tcx.hir.local_def_id(entry_node_id);
-        miri::eval_main(tcx, entry_def_id, limits);
+        let start_wrapper = tcx.lang_items.start_fn()
+                                      .and_then(|start_fn| if tcx.is_mir_available(start_fn) { Some(start_fn) } else { None });


This can be indented better, probably by putting the if on its own line.

oli-obk · 2017-05-27T08:43:39Z

src/eval_context.rs

+        ecx.frame_mut().return_lvalue = Lvalue::from_ptr(ret_ptr);
+
+        loop {
+            if !ecx.step()? {


This is essentially a while loop

eddyb · 2017-05-27T08:46:11Z

src/terminator/mod.rs

@@ -138,6 +131,69 @@ impl<'a, 'tcx> EvalContext<'a, 'tcx> {
        Ok(())
    }

+    /// Decides whether it is okay to call the method with signature `real_sig` using signature `sig`


Please add FIXME to use a proper platform-specific ABI description.

I don't know what exactly this means -- why should this be platform-dependent? Rust permits using non-capturing Fn as fn on all platforms, doesn't it?

You're doing things like argument compatibility (e.g. between all pointers) which can be done properly by getting the actual ABI descriptions (once they're lifted from trans).

Well, ideally we should also have platform-independent rules for this, or at least a set of rules guaranteed to be safe on all platforms. It should eventually be possible to use miri to run a piece of unsafe code and get a guarantee that the code behaves like this on all platforms (or, at least, on all platforms where the types have the same size and alignment).

That's almost literally impossible. The only thing we can do is track down all publicly documented ABIs, encode their rules, and apply the rules of all ABIs at once, comparing the result.
You could try to approximate a common core, but it only makes sense as the intersection of all supported ABIs. It's also a bit risky to specify due to forwards compatibility.

I feel like there should be some common core that Rust pretty much guarantees to be available... unsafe code that doesn't do low-level data representation stuff shouldn't really have to care about the platform, I feel.
But this is a whole new topic, so for now, I added your FIXME :)

eddyb · 2017-05-27T08:46:58Z

src/terminator/mod.rs

                let mut arg_locals = self.frame().mir.args_iter();
                trace!("ABI: {:?}", sig.abi);
                trace!("arg_locals: {:?}", self.frame().mir.args_iter().collect::<Vec<_>>());
                trace!("arg_operands: {:?}", arg_operands);
                match sig.abi {
-                    Abi::Rust => {
+                    Abi::Rust | Abi::C => {


This whole match is actually about RustCall (the tuple hack) vs everything else.

So do you want me to change it accordingly? Everything not mentioned is currently unimplemented!, so I left it that way.

Yeah but they all behave the same way (as long as the call and definition have the same ABI), so there is no reason to leave all of those ABIs "unimplemented".

All right, fixed that.

oli-obk · 2017-05-27T08:49:47Z

src/eval_context.rs

+        }
+
+        // Allocate memory for the return value.  We have to do this when a stack frame was already pushed as the type code below
+        // calls EvalContext::substs, which needs a frame to be allocated (?!?)


You can use type_layout_with_substs to prevent the stack frame from being relevant

But I don't have a subst, do I?

Right. but just use an empty subst. That's what you're doing with the first frame anyway

oli-obk · 2017-05-27T08:49:47Z

src/eval_context.rs

+        }
+
+        // Allocate memory for the return value.  We have to do this when a stack frame was already pushed as the type code below
+        // calls EvalContext::substs, which needs a frame to be allocated (?!?)


You can use the 'with substs' variant of mehods to skip the frame lookup

…t C ABI calls if MIR is missing

…ing main directly This fixes the memory leaks when running a simple "Hello World" with MIR-libstd

… permitted Also properly check the "non-capturing Fn to fn" case

Also enable some tests that were disabled for no apparant reason. (The comment in zst.rs was wrong, the test was disabled also for miri execution.) Delete env_args test as the args can actually be queried with MIR-libstd (currently, they are always empty)

RalfJung · 2017-05-30T17:44:40Z

Rebased branch pushed. Also, I added a test for thread-local key without dtor.

RalfJung · 2017-05-30T20:21:10Z

I don't understand this travis failure. The same test is working perfectly fine here.

RalfJung · 2017-05-30T21:04:19Z

I added the promised test for string formatting. I would like to use something more like the following:

use std::io::{self, Write};

fn main() {
    let s = format!("{}", 13);
    let _ = io::stdout().write(s.as_bytes());
}

However, then the test suite fails with non-MIR libstd (failing to load MIR for std::fmt::format). So until we have a way to exclude tests there, I just went with println!("Hello {}", 13).

oli-obk · 2017-05-30T22:13:04Z

It only fails on I686-windows-gnu. Just ignore the test for that platform for now

…lang item

RalfJung · 2017-05-30T22:45:18Z

It only fails on I686-windows-gnu. Just ignore the test for that platform for now

Okay, will do.

I think this leaves the error shown when std::env::args called as the only open item.

RalfJung · 2017-05-30T22:46:26Z

(Oh, I also changed the README reflecting that latest rust-src does have all the files it takes for xargo to build libstd with MIR. Right now we still need to fix some permissions, which is silly; I am working on fixing that in rustup. Also, a PR for miri which lets travis run the test suite against a MIR-libstd is in the works, but it obviously depends o this PR so I won't submit it before this one gets merged.)

oli-obk · 2017-05-31T06:23:53Z

Lgtm. I'm fine with the Travis failure for now.

RalfJung · 2017-05-31T18:05:26Z

Cool, and thanks again to @eddyb for all his help :)

oli-obk reviewed May 24, 2017

View reviewed changes

RalfJung mentioned this pull request May 25, 2017

Support setting rust flags in Xargo.toml japaric/xargo#142

Closed

RalfJung force-pushed the println branch from 9c2b653 to 442968b Compare May 25, 2017 04:02

eddyb reviewed May 25, 2017

View reviewed changes

oli-obk reviewed May 26, 2017

View reviewed changes

RalfJung changed the title ~~Document how to get a MIR-libstd with xargo; make println! work~~ Work with MIR-libstd May 27, 2017

bjorn3 requested changes May 27, 2017

View reviewed changes

eddyb reviewed May 27, 2017

View reviewed changes

oli-obk reviewed May 27, 2017

View reviewed changes

eddyb reviewed May 27, 2017

View reviewed changes

oli-obk reviewed May 27, 2017

View reviewed changes

RalfJung added 13 commits May 30, 2017 10:41

also support writing to stderr

33d42f4

handle statics with linkage: initialize them with NULL

452cc9b

implement TLS

238211e

support TLS destructors

a66f359

use proper span for TLS dtors; fix some nits

14b16dc

unify the way we intercept missing MIR and C ABI calls; only intercep…

55438fe

…t C ABI calls if MIR is missing

implement __rust_maybe_catch_panic

720c5f8

If a "start" lang item incl. MIR is present, run that instead of runn…

cd6e3e6

…ing main directly This fixes the memory leaks when running a simple "Hello World" with MIR-libstd

improve fn pointer signature check to allow some casts that should be…

99433a1

… permitted Also properly check the "non-capturing Fn to fn" case

re-disable aux_test -- it passes here, but not on Travis

633a34d

fix various small nits

24a9a14

test thread-local key with no dtor

dad9547

RalfJung force-pushed the println branch from 6b047ff to dad9547 Compare May 30, 2017 17:41

latest rust nightly contains all the bits needed to re-compile libstd

cdf7a05

RalfJung force-pushed the println branch from 3b55519 to cdf7a05 Compare May 30, 2017 20:25

add a test for output string formatting

b8e0b79

allocate return pointer only when we start the program via the start …

b44babf

…lang item

RalfJung added 2 commits May 30, 2017 17:03

simplify determining size and alignment of a pointer

2b37d50

simplify xargo instructions

d06c165

oli-obk merged commit c817e6e into rust-lang:master May 31, 2017

RalfJung deleted the println branch June 23, 2017 05:53

Work with MIR-libstd #171

Work with MIR-libstd #171

Conversation

RalfJung commented May 24, 2017

oli-obk commented May 24, 2017

RalfJung commented May 24, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

oli-obk commented May 24, 2017

RalfJung commented May 24, 2017

RalfJung commented May 25, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

RalfJung commented May 26, 2017

RalfJung commented May 26, 2017

eddyb commented May 26, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

RalfJung commented May 26, 2017

oli-obk commented May 26, 2017

RalfJung commented May 26, 2017

RalfJung commented May 26, 2017

RalfJung commented May 27, 2017 • edited Loading

RalfJung commented May 27, 2017

bjorn3 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

RalfJung May 30, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

RalfJung commented May 30, 2017

RalfJung commented May 30, 2017

RalfJung commented May 30, 2017 • edited Loading

oli-obk commented May 30, 2017

RalfJung commented May 30, 2017

RalfJung commented May 30, 2017

oli-obk commented May 31, 2017

RalfJung commented May 31, 2017

RalfJung commented May 27, 2017 •

edited

Loading

RalfJung May 30, 2017 •

edited

Loading

RalfJung commented May 30, 2017 •

edited

Loading