
turn on linear IR #24113

Merged: 4 commits merged into master from jb/linear-ir-1 on Dec 11, 2017
Conversation

JeffBezanson
Member

This should be a fairly agreeable version of #24027. All calls are pulled out of argument position, but are still allowed as any assignment RHS and as arguments to return. The Expr .typ field is still there. I updated codevalidation.jl with the rules implemented here, and got it passing on everything. I hacked in a solution for cglobal by pre-evaluating constant tuples in jl_resolve_globals.

This only increases the sysimg by about 15%, and with a few more things like #24109 I think we'll be fine. I think we should merge this soon and work on optimizations later.
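For illustration, the linearization described above can be sketched as a toy pass (a hypothetical Python illustration of the idea, not the actual Julia front-end code): nested calls are pulled out of argument position into SSA temporaries, so a call appears only as the right-hand side of an assignment or as the argument to `return`.

```python
# Toy sketch (hypothetical, not the Julia front-end): pull every call out of
# argument position into an SSA temporary. Expressions are tuples of the form
# ("call", fname, args...); leaves are plain variable names.

def linearize(expr, stmts, counter):
    """Linearize expr, appending statements to stmts; return the value's name."""
    if isinstance(expr, str):              # leaf: already a simple value
        return expr
    assert expr[0] == "call"
    # Linearize arguments first, so nested calls become temporaries.
    args = [linearize(a, stmts, counter) for a in expr[2:]]
    counter[0] += 1
    tmp = f"%{counter[0]}"                 # fresh SSA name
    stmts.append(f"{tmp} = {expr[1]}({', '.join(args)})")
    return tmp

stmts, counter = [], [0]
result = linearize(("call", "f", ("call", "g", "x"), ("call", "h", "y")),
                   stmts, counter)
stmts.append(f"return {result}")
print("\n".join(stmts))
# %1 = g(x)
# %2 = h(y)
# %3 = f(%1, %2)
# return %3
```

Note how the call to `f` survives as an assignment RHS while its nested arguments become separate statements, matching the rule that calls are allowed as any assignment RHS and as the argument to `return`.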

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Collaborator

Something went wrong when running your job:

NanosoldierError: failed to run benchmarks against primary commit: stored type BenchmarkTools.ParametersPreV006 does not match currently loaded type

Logs and partial data can be found here
cc @ararslan

@ararslan
Member

Nanosoldier won't be functional again until JuliaIO/JLD.jl#196 is fixed.

@JeffBezanson
Member Author

JeffBezanson commented Oct 13, 2017

Ah, llvmcall has the same problem as cglobal, but more complex. It will probably have to be a macro that expands to a foreigncall (edit: or an llvmcall expr head).

@vtjnash
Member

vtjnash commented Nov 7, 2017

If you rebase this, we can now run nanosoldier against it

@JeffBezanson
Member Author

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson
Member Author

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson
Member Author

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@StefanKarpinski
Member

Oy vey. Some of those regressions!

@quinnj
Member

quinnj commented Nov 15, 2017

Looks like it's almost entirely SubArray benchmarks w/ significant regressions though, so at least it's targeted.

@JeffBezanson
Member Author

Yep, looks scary but not a huge deal. I've been picking these off one by one.

@vtjnash
Member

vtjnash commented Nov 15, 2017

How is the performance of this for building the system image and running tests? On my machine, it seems to be about 10% slower at building and generated about 25% larger AST data. But it also seems to have emitted about 10% more functions (I'm not sure whether that's good or bad).

sysimg size breakdown (PR):
     sys data: 52669172
  isbits data: 28600958
      symbols:   260314
    tags list:  2679720
   reloc list: 10264512
    gvar list:    76048
    fptr list:   238240

sysimg size breakdown (master):
     sys data: 52659044
  isbits data: 22744512
      symbols:   266667
    tags list:  2589636
   reloc list:  9858840
    gvar list:    76232
    fptr list:   212744

EDIT: fixed build time – I realized I used the wrong reference initially

@JeffBezanson
Member Author

Yes, I see the same numbers. I think we'll be able to simplify some of the optimization passes (probably including merging #23240) and add some more compact encodings of common patterns like ssavalue assignment. There are also lots of sequences like this:

        # meta: location subarray.jl reindex 179
        SSAValue(309) = SSAValue(103)
        # meta: pop location

that we can peephole optimize away. We'll see how far that gets us.
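A peephole pass for these redundant SSA-to-SSA copies could look roughly like the following (a hypothetical Python sketch on a string representation of the IR, not the actual Julia optimizer): when a statement is a pure copy between SSA values, record the alias and drop the statement, rewriting later uses.

```python
# Hypothetical sketch: drop pure SSA-to-SSA copies like
# `SSAValue(309) = SSAValue(103)` by recording an alias and rewriting later uses.
import re

COPY = re.compile(r"^SSAValue\((\d+)\) = SSAValue\((\d+)\)$")

def peephole(code):
    alias = {}          # SSA id -> the id it is a copy of
    out = []
    for stmt in code:
        # Rewrite uses of aliased SSA values first, so copy chains collapse.
        stmt = re.sub(r"SSAValue\((\d+)\)",
                      lambda m: f"SSAValue({alias.get(m.group(1), m.group(1))})",
                      stmt)
        m = COPY.match(stmt)
        if m:                     # pure copy: remember it, emit nothing
            alias[m.group(1)] = m.group(2)
        else:
            out.append(stmt)
    return out

code = ["SSAValue(309) = SSAValue(103)",
        "call(SSAValue(309))"]
print(peephole(code))   # ['call(SSAValue(103))']
```

Because each statement's uses are rewritten before checking whether it is itself a copy, chains of copies resolve to the original value in a single pass.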

@JeffBezanson
Member Author

This now includes #23240, but using front-end linearization instead of its own linearize pass. Seems to help clean up some of the extra allocations in the benchmarks here. Let's see.

@nanosoldier runbenchmarks(ALL, vs=":master")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson JeffBezanson force-pushed the jb/linear-ir-1 branch 2 times, most recently from b6a8088 to ec20c0b Compare November 16, 2017 22:39
@JeffBezanson
Member Author

Ok, it appears that adding some @yuyichao magic has indeed fixed the remaining performance regressions. I worked through a couple more bugs and I think this is working now.

@StefanKarpinski
Member

Everything’s coming up Milhouse!

@JeffBezanson
Member Author

This recent test is failing on this branch:

@test contains(get_llvm_noopt(foo24632, (Bool,), false), "!dereferenceable_or_null")

The reason appears to be that the better optimizations here are able to eliminate the x variable. This:

        x = #temp#@_5
        #= line 319 =#
        unless (x isa Main.Int)::Bool goto 25

became this:

        SSAValue(2) = #temp#@_5
        #= line 319 =#
        SSAValue(4) = (SSAValue(2) isa Main.Int)::Bool
        unless SSAValue(4) goto 27

I'm not sure why x gets this annotation but the temp variable it's initialized from doesn't (in the first version of the code).

@JeffBezanson
Member Author

Ok, I believe I've fixed that by avoiding replacing TypedSlots. @yuyichao, it would be good if you could take a look at this.

@JeffBezanson
Member Author

... In particular, in my latest commit I wasn't fully sure whether to return true or false.

@JeffBezanson JeffBezanson force-pushed the jb/linear-ir-1 branch 2 times, most recently from 292f959 to 3a9ed80 Compare December 2, 2017 22:31
@quinnj
Member

quinnj commented Dec 6, 2017

Looks like this needs a rebase; but the last CI run had quite a bit of green.

JeffBezanson and others added 4 commits December 8, 2017 23:55
These objects make it really hard to mutate the AST correctly,
since a mutation can accidentally be applied at a place where it is invalid.
The hardest part of running non-local optimization passes
(i.e. transformations that do not rely on only one or a few neighboring expressions)
is avoiding re-analyzing the code. Our current IR, though easy for linear scanning,
interpreting, codegen and, to a certain degree, storage, is not very easy to make random
updates to. Work around this issue in two ways:

1. Never resize the code array when doing updates.
   Instead, insert nested arrays that we'll later splice back in for code additions, and
   use `nothing` for code deletions. This way, the array indices we cache for other metadata
   about the code stay valid.
2. Based on the previous approach, pre-scan the use-def info for all variables before starting
   the optimization and run the optimization recursively.
   Code changes will also update this use-def data so that it's always valid for the user.
   Changes that can affect the use or def of another value will re-trigger the optimization
   so that we can take advantage of new optimization opportunities.
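The index-stable mutation scheme in item 1 can be sketched as follows (a hypothetical Python illustration of the technique, not the actual Julia pass): deletions leave a `nothing`-style placeholder (`None` here), additions nest an array in place, and a final pass splices everything back together, so any cached statement index stays valid throughout.

```python
# Hypothetical sketch of index-stable code mutation: delete by storing None,
# insert by nesting a list at the existing index, splice at the very end.

class Code:
    def __init__(self, stmts):
        self.stmts = list(stmts)

    def delete(self, i):
        self.stmts[i] = None               # keep the slot; indices unchanged

    def insert_after(self, i, new_stmts):
        cur = self.stmts[i]
        base = cur if isinstance(cur, list) else [cur]
        self.stmts[i] = base + list(new_stmts)

    def finish(self):
        """Splice nested arrays and drop deletions, producing flat code."""
        out = []
        for s in self.stmts:
            if s is None:
                continue
            if isinstance(s, list):
                out.extend(s)
            else:
                out.append(s)
        return out

c = Code(["a = f()", "b = g(a)", "return b"])
c.delete(1)                   # remove statement 1; statement 2's index intact
c.insert_after(0, ["b = a"])  # add code without shifting later indices
print(c.finish())             # ['a = f()', 'b = a', 'return b']
```

The point of the design is that metadata keyed by statement index (such as the pre-scanned use-def info in item 2) never needs to be renumbered mid-pass.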

This optimization pass should now handle most of the control-flow-insensitive optimizations.
Code patterns that are handled only partially by this pass but would benefit greatly from a
control-flow-sensitive version include:

1. Split slots (based on control flow)

   This way we can completely eliminate the surprising cost of variable name conflicts,
   even when one of the defs or uses is not type stable.
   (This pass currently handles the case where all the defs/uses are type stable.)

2. Delay allocations

   There are cases where an allocation escapes, but only in some branches.
   This matters especially for error paths: currently we cannot eliminate some `SubArray`
   allocations only because we need to keep them around for a potential bounds error. This is
   wasteful; we should be able to perform the allocation only when we actually throw the
   error, leaving the performance-critical non-error path allocation-free.

3. Reordering assignments

   It is in general illegal to move an assignment when the slot assigned to is not SSA.
   However, there are many cases where doing so is actually legal
   (e.g. when there is no other use or def in between). This shows up a lot in code like

   ```
   SSA = alloc
   slot = SSA
   ```

   which we currently can't optimize, since through the slot we can't see that the assignment
   is actually an allocation and not a generic black box. We should be able to merge these and
   eliminate the SSA value based on control-flow info. For this case, def info that looks
   through SSA values could also help.
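The legality check in item 3 can be sketched like this (a hypothetical Python illustration, with statements modeled as (defs, uses) pairs of slot-name sets; not the actual Julia pass): an assignment to a non-SSA slot may move across a range of statements only if nothing in between defines or uses that slot.

```python
# Hypothetical sketch: is it legal to move the assignment to `slot` from
# index `src` to index `dst`? Only if no statement strictly in between
# defines or uses the same slot. Each statement is a (defs, uses) pair.

def can_move_assignment(code, src, dst, slot):
    """True if the assignment to `slot` at index `src` can move to `dst`."""
    lo, hi = sorted((src, dst))
    for defs, uses in code[lo + 1:hi]:   # statements strictly in between
        if slot in defs or slot in uses:
            return False
    return True

# slot `x` is defined at 0; statement 1 only uses `y`; `x` is used at 2
code = [({"x"}, set()),      # x = ...
        (set(), {"y"}),      # f(y)
        (set(), {"x"})]      # g(x)
print(can_move_assignment(code, 0, 2, "x"))   # True: nothing touches x between
print(can_move_assignment(code, 1, 2, "y") if False else
      can_move_assignment(code, 0, 2, "y"))  # False: y is used in between
```

A real pass would also need SSA-aware def info as noted above, so that moving `slot = SSA` can see through the SSA value to the underlying allocation.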
@Keno Keno merged commit 66b2090 into master Dec 11, 2017
@StefanKarpinski StefanKarpinski deleted the jb/linear-ir-1 branch December 11, 2017 19:52
@Sacha0
Member

Sacha0 commented Dec 11, 2017

🎉!
