WIP: Enable LLVM loop vectorizer #3929

simonster · 2013-08-03T21:08:17Z

While there's been some discussion of adding SIMD types in #2299, I thought it might be fun to see how well the LLVM loop vectorizer can do with Julia code. This PR builds (although it cannot build sysimg.jl), runs, and vectorizes simple loops (sample), but it has two very big problems.

The first issue is that this PR turns every integer add operation into add nsw to get the proof of concept working. Scalar evolution analysis requires that operations on the loop index produce undefined behavior on overflow, but making that change everywhere is probably too unsafe for a high-level language. Since we should be able to guarantee that next(::Range{Int}) and next(::Range1{Int}) don't overflow, one option is to add nsw intrinsics and use those for range iteration.
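To make the overflow argument concrete, here is a rough C analogy (not code from this PR). The names add_wrap and scale_range are hypothetical. The first function wraps on overflow, like LLVM's plain add; the second iterates a half-open range whose bound already rules out overflow of the index increment, which is the kind of guarantee that would justify add nsw for range iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wrapping add: defined for every input, like LLVM's plain `add`.
 * The unsigned-to-signed conversion is implementation-defined in C;
 * this sketch assumes the usual two's-complement wrap. */
int64_t add_wrap(int64_t a, int64_t b)
{
    return (int64_t)((uint64_t)a + (uint64_t)b);
}

/* Hypothetical stand-in for iterating the half-open range start..stop-1.
 * Inside the loop body i < stop <= INT64_MAX, so i + 1 cannot overflow,
 * and the increment could safely carry the nsw flag. */
void scale_range(double *a, int64_t start, int64_t stop, double k)
{
    assert(a != NULL && start >= 0 && start <= stop);
    for (int64_t i = start; i < stop; i++)
        a[i] *= k;
}
```

The point of the sketch is that the no-overflow guarantee comes from the loop bound, not from the add itself, so it can be stated for the range's index arithmetic without changing the semantics of integer addition in general.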

The second issue is that this PR turns jl_value_t into i8*. Not only does this seem very wrong, it also breaks building sysimg.jl, although Julia seems to run fine with a sysimg.jl built without this change. Unfortunately, I haven't been able to get the loop vectorizer to work with jl_value_t as a structure type. With this change, the IR going into the loop vectorization pass looks like this, whereas without it, the IR looks like this. Notice that, with jl_value_t as i8*, the bitcast is outside of the loop, whereas with jl_value_t as a structure type, it is inside the loop. This seems to bother the loop vectorizer, which tells me:

LV: Found a loop: if
LV: Found an induction variable.
LV: Found a runtime check ptr:  %7 = bitcast %jl_value_t* %6 to double*, !dbg !3370
LV: Found a runtime check ptr:  %7 = bitcast %jl_value_t* %6 to double*, !dbg !3370
LV: We need to compare 1 ptrs.
LV: We can perform a memory runtime check if needed.
LV: Found an unidentified write ptr:  %4 = load %jl_value_t** %3, align 8, !dbg !3369
LV: Adding Underlying value:  %4 = load %jl_value_t** %3, align 8, !dbg !3369
LV: Found an unidentified read ptr:  %4 = load %jl_value_t** %3, align 8, !dbg !3369
LV: Found a possible write-write reorder:  %4 = load %jl_value_t** %3, align 8, !dbg !3369
LV: Can't vectorize due to memory conflicts
LV: Not vectorizing.

If I move the bitcast out of the loop and compile the IR manually with opt, it seems to work, but I'm a little confused about what makes these cases different.
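As a loose C analogy for the two IR shapes (an illustration only, not the IR from this PR), the sketch below uses a hypothetical struct boxed_array with an untyped data pointer. In the first function the pointer is loaded and cast inside the loop as written; in the second, the load and cast are hoisted so the loop body only touches a plain double*. Both compute the same result; what differs is how much the optimizer has to prove before it can treat the loop as a simple strided access.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical boxed array: an untyped data pointer plus a length. */
struct boxed_array {
    void  *data;
    size_t len;
};

/* Cast inside the loop: as written, the data pointer is loaded and
 * reinterpreted as double* on every iteration. */
void add_one_cast_in_loop(struct boxed_array *arr)
{
    assert(arr != NULL);
    for (size_t i = 0; i < arr->len; i++)
        ((double *)arr->data)[i] += 1.0;
}

/* Cast hoisted: the pointer is loaded and cast once, before the loop,
 * so the body indexes a loop-invariant double*. */
void add_one_cast_hoisted(struct boxed_array *arr)
{
    assert(arr != NULL);
    double *p = (double *)arr->data;
    size_t n = arr->len;
    for (size_t i = 0; i < n; i++)
        p[i] += 1.0;
}
```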

With these changes, Julia can now vectorize a simple loop. Unfortunately, these changes are unacceptable. I've changed addition to produce undefined behavior on signed integer overflow, and I've changed jl_value_t from an LLVM structure type to a pointer to an Int8.

ViralBShah · 2013-08-04T16:03:54Z

This is really exciting, and I am anxious to see how well this works out.

simonster · 2013-08-05T17:18:51Z

I can get this to work without changing jl_value_t to i8* if I comment out the GEP optimizations in LLVM's instruction combining pass. So the problem seems to lie in the interaction between what the instruction combining pass does with the jl_value_t bitcasts and the loop vectorizer's inability to recognize those bitcasts as no-ops.

simonster · 2014-01-13T16:31:23Z

Superseded by #5355.

simonster added 3 commits August 3, 2013 19:38

Use JULIA_LLVM_ARGS as LLVM command-line arguments in debug build

932e0fc

If you have a debug build of LLVM, this can be used to produce debug output for specific passes using JULIA_LLVM_ARGS="-debug-only=passname"

Pass DataLayout to LLVM and add target analysis passes

87ecaa3

This information is required by the loop vectorizer, but may benefit other analysis passes as well

simonster mentioned this pull request Nov 14, 2013

experiment with llvm vectorization passes #4786

Closed

ArchRobison mentioned this pull request Jan 10, 2014

Add support for @simd #5355

Merged

simonster closed this Jan 13, 2014
