construct MIR for all crates up to and including rustdoc #27893

nikomatsakis · 2015-08-18T22:32:24Z

This PR contains a new crate, rustc_mir, which implements the MIR as specified in the RFC (more or less). There are no targeted unit tests at the moment, as I didn't decide what kind of infrastructure would be best and didn't take the time to implement it.

NB: In packaging up this PR, I realized that MIR construction code is not triggering for methods right now, I think it's only for fixed fns. I'll push a fix for this soon. Hopefully it doesn't stop any crates from building. :) Fixed. Everything still seems to work.

However, the MIR construction code (librustc_mir/build) is intentionally quite distinct from the code which munges the compiler's data structures (librustc_mir/tcx). The interface between the two is the HIR trait (librustc_mir/hir). To avoid confusion with @nrc's work, perhaps a better name for this trait is warranted, although ultimately this trait will be connected to the HIR, I imagine, so in a way the name is perfect. Anyway, I'm open to suggestions. The initial motivation for this split was to allow for the MIR construction code to be unit-tested. But while I didn't end up writing unit tests (yet), I did find the split made the code immensely easier to think about, since the messiness of our existing system, with its myriad hashtables, punning, and so forth, is confined to one part, which simply transforms to a more fully explicit AST-like form. I tried to separate out the commits somewhat, but since this mostly new code, it mostly winds up coming in one fell swoop in the MIR commit.

Quick guide to the MIR crate:

repr.rs defines the MIR itself; each MIR instance is parameterized by some HIR H
build/ is the MIR construction code, parameterized by a particular HIR
hir/ is the definition of the HIR interface
tcx/ is the impl of the HIR interface for the tcx
dump.rs is the minimal compiler pass that invokes the HIR

One open question:

In the HIR trait, I used exclusively struct-like variants. I found I like this more, since things have names. Should I convert the repr code?

nikomatsakis · 2015-08-18T22:32:35Z

cc @rust-lang/compiler

rust-highfive · 2015-08-18T22:32:37Z

r? @huonw

(rust_highfive has picked a reviewer for you, use r? to override)

steveklabnik · 2015-08-18T23:08:03Z

src/libgraphviz/lib.rs

@@ -479,6 +494,16 @@ pub trait Labeller<'a,N,E> {
    }
 }

+/// Escape tags in such a way that it is suitable for inclusion in a
+/// Graphviz HTML label.
+pub fn escape_html(s: &str) -> String {


... this isn't a great escape function, but I guess it's not really untrusted input.

heh, yes. :)

arielb1 · 2015-08-18T23:15:25Z

@nikomatsakis

The HIR in this patch is fairly distant from @nrc's HIR: his HIR is very close to the (expanded) syntax, and intended to separate type-checking from the details of expansion/resolution, while the HIR in this patch is more like an elaborated, desugared version of Rust.

arielb1 · 2015-08-18T23:18:11Z

src/librustc_mir/hir.rs

+    Vec { fields: Vec<ExprRef<H>> },
+    Tuple { fields: Vec<ExprRef<H>> },
+    Adt { adt_def: H::AdtDef,
+          variant_index: usize,


why don't you store the variant itself?

@arielb1

why don't you store the variant itself?

Because it was inconvenient to get from the variant to the containing enum and recompute its index, I think. Maybe I should just add those two bits of information to the VariantDef though.

nikomatsakis · 2015-08-19T00:08:53Z

@arielb1

The HIR in this patch is fairly distant from @nrc's HIR: his HIR is very close to the (expanded) syntax, and intended to separate type-checking from the details of expansion/resolution, while the HIR in this patch is more like an elaborated, desugared version of Rust.

Yes, sure. It may be that we evolve @nrc's HIR downwards (I hope we do), but in any case I agree they are not identical, though in my head they play pretty much the same role: a relatively high-level, AST-oriented view on Rust. Anyway I didn't see a suggestion for a better name in your comment. :P

glaebhoerl · 2015-08-19T00:40:38Z

MHIR or HMIR? :p

EDIT: Perhaps AIR, for Abstract Intermediate Representation. Or HAIR.

Diggsey · 2015-08-19T00:44:44Z

EIR: Elevated Intermediate Representation...

nrc · 2015-08-19T03:58:12Z

@glaebhoerl Her Majesty's Intermediate Representation? /British

Ms2ger · 2015-08-19T07:20:00Z

src/libgraphviz/lib.rs

+pub fn escape_html(s: &str) -> String {
+    s
+        .replace("\"", "&quot;")
+        .replace("&", "&amp;")


Always replace & first, to avoid &quot;.

@Ms2ger

Always replace & first, to avoid ".

heh, yes.

nikomatsakis · 2015-08-19T13:12:07Z

@glaebhoerl hmm, I do like "HAIR" :)

oli-obk · 2015-08-20T17:17:25Z

src/librustc_mir/build/expr.rs

+    }
+
+    /// Compile `expr`, yielding a compile-time constant. Assumes that
+    /// `expr` is a valid compile-time constant!


is the idea that after check_const runs, everything marked as const will get turned into a const-rvalue?

I'd have thought the const-evaluator would be like an llvm-optimization-pass, simply running on the MIR without much more information (global consts + extern consts + const-fn are needed of course).

@oli-obk I started writing a response and found I had more to say than seemed appropriate for a GitHub comment. See this thread: https://internals.rust-lang.org/t/how-to-handle-constants-in-the-compiler/2543

nikomatsakis · 2015-08-21T14:01:12Z

r? @pnkfelix

after discussing with @huonw, decided to change official reviewer to pnkfelix, but I figure everyone will kibbitz anyhow.

oli-obk · 2015-08-21T21:03:18Z

src/librustc_mir/repr.rs

+    Use(Lvalue<H>),
+
+    // [x; 32]
+    Repeat(Lvalue<H>, Lvalue<H>),


so the second value should be a Constant<H>?

nikomatsakis · 2015-08-21T21:15:31Z

@arielb1

Also, no, it's not only primitive operators, but all operators except for explicit lvalue projections (i.e. addr-of, deref, field access, match, the LHS of array indexing) have their arguments copied to temporaries (up to optimizations) rather than accessed directly.

Hmm, yes, I see, I see. Of course I copied this approach from how trans does things, but I agree it's not right. I'm modifying the MIR construction now, but it seems to push me more in favor of introducing an Operand category, so that "by-value" Rvalue things take Operand rather than Lvalue. This has a few advantages: we can inline constants more easily (since Operand can include constants) and also we force ourselves to through as_operand (or something similar) where appropriate, which will ensure we don't accidentally wind up referencing some data in place that was (logically) moved. And then, as you say, we can later strip out intermediates that are not needed to achieve better re-use. I'll do some experiments and see how this works out.

nikomatsakis · 2015-08-24T13:22:25Z

OK, I just did a major rebase that folded in all of the WIP comments. In particular I introduced a new concept into the MIR, Operand, that corresponds (at construction time) to a value. It can be either an lvalue (as before) or a constant. This means that when you have an expression like x + y that references lvalues, you'll have temporaries introduced:

tmp0 = x
tmp1 = y
result = tmp0 + tmp1

It also means we do NOT introduce temporaries for constants in expressions like x + 1, those can now be directly folded in.

It seems like these properties are worthy of unit testing, so I may turn some attention to how we can build up a unit test framework. That said, introducing the notion of Operand makes it clear in the type system where something is being used "by value" versus "by reference" (what we have sometimes called an "lvalue context"). This means that the construction code is, at least, known to invoke as_operand for each "by value" use, so so long as we can convince ourselves that it introduces a temporary for lvalues, we're all set.

Also, it should be no problem to purge unnecessary temporaries later, once we have ascertained for sure that there is no mutation to the source of the temporary. As @arielb1 said, I'd rather do that in a separate, dedicated pass (this can also cover the optimizations around pattern matching we now do in trans, I imagine.)

This may be worth updating the RFC for as well, since it marks the first departure, but I'll hold off on that. (And the RFC was clearly intended to evolve anyhow.)

bors · 2015-08-25T01:03:39Z

☔ The latest upstream changes (presumably #27943) made this pull request unmergeable. Please resolve the merge conflicts.

eddyb · 2015-08-25T10:47:26Z

src/librustc_mir/repr.rs

+
+#[derive(Clone, PartialEq)]
+pub enum Operand<H:HIR> {
+    Lvalue(Lvalue<H>),


I'd almost want to name this variant Move or Consume.

@eddyb

I'd almost want to name this variant Move or Consume.

Seems good.

arielb1 · 2015-08-25T15:30:38Z

@nikomatsakis

this can also cover the optimizations around pattern matching we now do in trans, I imagine.

Of course this won't do the decision-tree optimizations, but the temporary-elimination optimizations we do in trans are basically the reverse of RVO.

We translate code like

    match foo {
        Some(foo_inner) => return Some(f(foo_inner));
        None => return None
    }

into MIR (sugared-assmebly-format, rather than graph-format, temporaries are %name, ordinary variables var)

    %discr: Option::Discr = foo.@discr ; match takes an lvalue
    switch %discr, None=>.none, Some=>.some
.some:
    foo_inner = foo.inner ; this isn't even a temporary - borrowck ensures no trouble
.some_body:
    %f_arg : T = foo_inner
    %f_ret : U = call f, %f_arg, onerr <snip> 
    %ret_val_2 : Option<U> = Some(%f_ret)
    *%ret = %ret_val_2
    jmp exit
.none:
.none_body:
    %ret_val : Option<U> = None
    *%ret = %ret_val
exit:
    ret

Standard RVO would replaces temporaries that are written somewhere with the place they are written to:

    %discr: Option::Discr = foo.@discr ; match takes an lvalue
    switch %discr, None=>.none, Some=>.some
.some:
    foo_inner = foo.inner
.some_body:
    %f_arg : Option<T> = foo_inner
    ; %ret_val_2.as<Some>.0 = %f_ret
    ; (*%ret) = %ret_val_2
    (*%ret).as<Some>.0 = call f, %f_arg, onerr <snip> 
    *%ret = Some((*%ret).as<Some>.0) ; lowering ignores the self-assignment
    jmp exit
.none:
.none_body:
    *%ret = None
exit:
    ret

Match optimization substitutes temporaries with their values

    %discr: Option::Discr = foo.@discr ; match takes an lvalue
    switch %discr, None=>.none, Some=>.some
.some:
.some_body:
    ; foo_inner = foo.inner - not really a temporary, but good enough
    ; %f_arg = foo_inner
    (*%ret).as<Some>.0 = call f, foo.inner, onerr <snip> 
    *%ret = Some((*%ret).as<Some>.0) ; lowering ignores the self-assignment
    jmp exit
.none:
.none_body:
    *%ret = None
exit:
    ret

arielb1 · 2015-08-26T17:26:08Z

Maybe DVIR (desugared value-level IR) for the "HIR"?

nikomatsakis · 2015-08-31T20:47:01Z

@arielb1

Of course this won't do the decision-tree optimizations, but the temporary-elimination optimizations...

Yes, this is what I meant.

helper

this serves as a poor man's unit test infrastructure until MIR is more built up

works around a stage0 bug that has since been fixed.

nikomatsakis · 2015-09-06T11:28:12Z

@bors try

bors · 2015-09-06T11:28:15Z

⌛ Trying commit 2f00086 with merge ea3892f...

@nrc

This PR contains a new crate, `rustc_mir`, which implements the MIR as specified in the RFC (more or less). There are no targeted unit tests at the moment, as I didn't decide what kind of infrastructure would be best and didn't take the time to implement it. ~~NB: In packaging up this PR, I realized that MIR construction code is not triggering for methods right now, I think it's only for fixed fns. I'll push a fix for this soon. Hopefully it doesn't stop any crates from building. :)~~ Fixed. Everything still seems to work. However, the MIR construction code (`librustc_mir/build`) is intentionally quite distinct from the code which munges the compiler's data structures (`librustc_mir/tcx`). The interface between the two is the `HIR` trait (`librustc_mir/hir`). To avoid confusion with @nrc's work, perhaps a better name for this trait is warranted, although ultimately this trait *will* be connected to the HIR, I imagine, so in a way the name is perfect. Anyway, I'm open to suggestions. The initial motivation for this split was to allow for the MIR construction code to be unit-tested. But while I didn't end up writing unit tests (yet), I did find the split made the code immensely easier to think about, since the messiness of our existing system, with its myriad hashtables, punning, and so forth, is confined to one part, which simply transforms to a more fully explicit AST-like form. I tried to separate out the commits somewhat, but since this mostly new code, it mostly winds up coming in one fell swoop in the MIR commit. Quick guide to the MIR crate: - `repr.rs` defines the MIR itself; each MIR instance is parameterized by some HIR `H` - `build/` is the MIR construction code, parameterized by a particular HIR - `hir/` is the definition of the HIR interface - `tcx/` is the impl of the HIR interface for the tcx - `dump.rs` is the minimal compiler pass that invokes the HIR One open question: - In the HIR trait, I used exclusively struct-like variants. I found I like this more, since things have names. Should I convert the repr code?

nikomatsakis · 2015-09-06T20:49:17Z

@bors r=nrc

(@nrc gave thumbs up over IRC)

bors · 2015-09-06T20:49:18Z

📌 Commit 2f00086 has been approved by nrc

nikomatsakis · 2015-09-06T20:49:48Z

@bors r=nrc

bors · 2015-09-06T20:49:49Z

📌 Commit c8a6618 has been approved by nrc

bors · 2015-09-06T21:31:27Z

⌛ Testing commit c8a6618 with merge 3dd1a48...

@nrc

This PR contains a new crate, `rustc_mir`, which implements the MIR as specified in the RFC (more or less). There are no targeted unit tests at the moment, as I didn't decide what kind of infrastructure would be best and didn't take the time to implement it. ~~NB: In packaging up this PR, I realized that MIR construction code is not triggering for methods right now, I think it's only for fixed fns. I'll push a fix for this soon. Hopefully it doesn't stop any crates from building. :)~~ Fixed. Everything still seems to work. However, the MIR construction code (`librustc_mir/build`) is intentionally quite distinct from the code which munges the compiler's data structures (`librustc_mir/tcx`). The interface between the two is the `HIR` trait (`librustc_mir/hir`). To avoid confusion with @nrc's work, perhaps a better name for this trait is warranted, although ultimately this trait *will* be connected to the HIR, I imagine, so in a way the name is perfect. Anyway, I'm open to suggestions. The initial motivation for this split was to allow for the MIR construction code to be unit-tested. But while I didn't end up writing unit tests (yet), I did find the split made the code immensely easier to think about, since the messiness of our existing system, with its myriad hashtables, punning, and so forth, is confined to one part, which simply transforms to a more fully explicit AST-like form. I tried to separate out the commits somewhat, but since this mostly new code, it mostly winds up coming in one fell swoop in the MIR commit. Quick guide to the MIR crate: - `repr.rs` defines the MIR itself; each MIR instance is parameterized by some HIR `H` - `build/` is the MIR construction code, parameterized by a particular HIR - `hir/` is the definition of the HIR interface - `tcx/` is the impl of the HIR interface for the tcx - `dump.rs` is the minimal compiler pass that invokes the HIR One open question: - In the HIR trait, I used exclusively struct-like variants. I found I like this more, since things have names. Should I convert the repr code?

bors · 2015-09-06T23:24:10Z

☀️ Test successful - auto-linux-32-nopt-t, auto-linux-32-opt, auto-linux-64-nopt-t, auto-linux-64-opt, auto-linux-64-x-android-t, auto-mac-32-opt, auto-mac-64-nopt-t, auto-mac-64-opt, auto-win-gnu-32-nopt-t, auto-win-gnu-32-opt, auto-win-gnu-64-nopt-t, auto-win-gnu-64-opt, auto-win-msvc-32-opt, auto-win-msvc-64-opt

Remove unnecessary FIXME Found this while browsing rustc, I traced it back to rust-lang#27893 when MIR first introduced, some time passed since then and I think this FIXME is no longer necessary.

rust-highfive assigned huonw Aug 18, 2015

nikomatsakis force-pushed the mir branch from b32e41a to e2ba889 Compare August 18, 2015 22:44

steveklabnik mentioned this pull request Aug 18, 2015

Tracking issue for MIR (RFC #1211) #27840

Closed

16 tasks

steveklabnik reviewed Aug 18, 2015
View reviewed changes

arielb1 reviewed Aug 18, 2015
View reviewed changes

Ms2ger reviewed Aug 19, 2015
View reviewed changes

oli-obk reviewed Aug 20, 2015
View reviewed changes

rust-highfive assigned pnkfelix and unassigned huonw Aug 21, 2015

oli-obk reviewed Aug 21, 2015
View reviewed changes

nikomatsakis force-pushed the mir branch 2 times, most recently from fe02dd8 to 107f822 Compare August 24, 2015 13:17

eddyb reviewed Aug 25, 2015
View reviewed changes

nikomatsakis force-pushed the mir branch 2 times, most recently from 2cee71c to d44473b Compare September 1, 2015 15:33

nikomatsakis added 8 commits September 6, 2015 07:27

move fnv hashing support into librustc_data_structures

714f2a8

extract autoderef type adjustment code into a reusable

0e764ec

helper

add MIR code (unused thus far)

9bd35c0

add -Z always-build-mir option

6a2ca37

add MIR crate and link it into the driver

faa9ec8

plumbing to automatically run MIR for crates where it works;

9b45874

this serves as a poor man's unit test infrastructure until MIR is more built up

fix EUV delegate to take advantage of defaults. This also

013c37a

works around a stage0 bug that has since been fixed.

convert TODOs to FIXMEs

99f3bfc

nikomatsakis force-pushed the mir branch from af818c0 to 2f00086 Compare September 6, 2015 11:28

enable slice patterns and enable building rustdoc

c8a6618

nikomatsakis force-pushed the mir branch from 2f00086 to c8a6618 Compare September 6, 2015 20:49

bors merged commit c8a6618 into rust-lang:master Sep 6, 2015

This was referenced Sep 6, 2015

rustfmt libgraphviz #28255

Merged

Begin splitting middle::ty #28274

Merged

Initial import of the Aster syntax::ast builder library #28006

Closed

nikomatsakis deleted the mir branch March 30, 2016 16:16

ouz-a mentioned this pull request Aug 14, 2023

Remove unnecessary FIXME #114817

Merged

Zalathar mentioned this pull request Jun 25, 2024

Tweak a confusing comment in create_match_candidates #126926

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

construct MIR for all crates up to and including rustdoc #27893

construct MIR for all crates up to and including rustdoc #27893

nikomatsakis commented Aug 18, 2015

nikomatsakis commented Aug 18, 2015

rust-highfive commented Aug 18, 2015

steveklabnik Aug 18, 2015

nikomatsakis Aug 19, 2015

arielb1 commented Aug 18, 2015

arielb1 Aug 18, 2015

nikomatsakis Aug 19, 2015

nikomatsakis commented Aug 19, 2015

glaebhoerl commented Aug 19, 2015

Diggsey commented Aug 19, 2015

nrc commented Aug 19, 2015

Ms2ger Aug 19, 2015

nikomatsakis Aug 19, 2015

nikomatsakis commented Aug 19, 2015

oli-obk Aug 20, 2015

nikomatsakis Aug 21, 2015

nikomatsakis commented Aug 21, 2015

oli-obk Aug 21, 2015

nikomatsakis commented Aug 21, 2015

nikomatsakis commented Aug 24, 2015

bors commented Aug 25, 2015

eddyb Aug 25, 2015

nikomatsakis Aug 31, 2015

arielb1 commented Aug 25, 2015

arielb1 commented Aug 26, 2015

nikomatsakis commented Aug 31, 2015

nikomatsakis commented Sep 6, 2015

bors commented Sep 6, 2015

nikomatsakis commented Sep 6, 2015

bors commented Sep 6, 2015

nikomatsakis commented Sep 6, 2015

bors commented Sep 6, 2015

bors commented Sep 6, 2015

bors commented Sep 6, 2015

construct MIR for all crates up to and including rustdoc #27893

construct MIR for all crates up to and including rustdoc #27893

Conversation

nikomatsakis commented Aug 18, 2015

nikomatsakis commented Aug 18, 2015

rust-highfive commented Aug 18, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

arielb1 commented Aug 18, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nikomatsakis commented Aug 19, 2015

glaebhoerl commented Aug 19, 2015

Diggsey commented Aug 19, 2015

nrc commented Aug 19, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nikomatsakis commented Aug 19, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nikomatsakis commented Aug 21, 2015

Choose a reason for hiding this comment

nikomatsakis commented Aug 21, 2015

nikomatsakis commented Aug 24, 2015

bors commented Aug 25, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

arielb1 commented Aug 25, 2015

arielb1 commented Aug 26, 2015

nikomatsakis commented Aug 31, 2015

nikomatsakis commented Sep 6, 2015

bors commented Sep 6, 2015

nikomatsakis commented Sep 6, 2015

bors commented Sep 6, 2015

nikomatsakis commented Sep 6, 2015

bors commented Sep 6, 2015

bors commented Sep 6, 2015

bors commented Sep 6, 2015