propagate effect_free information out of functions #9974
The problem is that we don't store/propagate the knowledge that
This works around issue #9974 for BitArray indexing. BitArrays use inlined helper functions, `unsafe_bit(get|set)index`, to do the dirty work of picking bits out of the chunks array. Previously, these helpers took the array of chunks as an argument, but that array needs a GC root since BitArrays are mutable. This changes those helper functions to work on the whole BitArray itself, which enables an optimization to avoid that root (since the chunks array is only accessed as an argument to `arrayref`, which is special-cased). The ~25% performance gain is for `unsafe_getindex`; the difference isn't quite as big for `getindex` (only ~10%) since there's still a GC root for the `BoundsError`. That can also be avoided, but I'd rather make that change more systematically (as `checkbounds`) with #10525 or a subset thereof.
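The before/after shape of the change described above can be sketched as follows. This is a simplified illustration, not the actual `bitarray.jl` code: the `old_getindex`/`new_getindex` names are hypothetical, and `_div64`/`_mod64` are redefined locally so the sketch is self-contained (assuming the standard 64-bit chunk layout).

```julia
# Self-contained stand-ins for Base's internal chunk-index helpers.
_div64(l) = l >> 6   # which 64-bit chunk holds bit l (0-based)
_mod64(l) = l & 63   # which bit within that chunk

# Before: the helper takes the chunks array as an argument. Because a
# BitArray is mutable, the caller historically had to keep a GC root
# for the extracted `B.chunks`.
old_getindex(Bc::Vector{UInt64}, i::Int) =
    (Bc[_div64(i-1)+1] & (UInt64(1) << _mod64(i-1))) != 0

# After: the helper takes the whole BitArray, so `B.chunks` only ever
# appears directly as the array argument of the indexing call, which
# the compiler special-cases, avoiding the extra root.
new_getindex(B::BitVector, i::Int) =
    (B.chunks[_div64(i-1)+1] & (UInt64(1) << _mod64(i-1))) != 0
```

Both variants compute the same bit; the difference the PR cares about is only in what the compiler must keep rooted across the call.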
I was looking into this a bit more last night, hoping it might be an easy fix. There are several intertwined optimizations here, but it seems like:

```julia
import Base: _div64, _mod64
inner(Bc, i) = (Bc[_div64(i-1)+1] & (UInt64(1) << _mod64(i-1))) != 0
outer(B, i) = inner(B.chunks, i)
code_typed(outer, (BitVector, Int)) # No temporary assignment
code_llvm(outer, (BitVector, Int))  # No GC frame
```

Now it's easy to play around and see what is blocking the desired optimization. The goal is to eliminate the temporary assignment of the argument and allow the expression to be plugged directly into the argument of
So I'm not sure that there's anything directly actionable here, outside of a much larger project on escape analysis to ensure that the mutable field an object was assigned from isn't reassigned (and the object therefore unrooted) before that object is used… which seems terribly specific and like lots of hard work. (Note that this isn't an issue for an immutable type, which BitArrays could almost be were it not for mutable-length vectors.)
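The parenthetical about immutability can be made concrete with a minimal sketch. The `FixedBits` type and `getbit` helper below are hypothetical, not from the Julia code base: the point is only that an immutable struct's fields can never be reassigned after construction, so a chunks array reached through such a wrapper stays reachable through the wrapper itself, sidestepping the rooting question entirely.

```julia
# Hypothetical immutable bit container: `chunks` can never be rebound
# to a different array after construction, unlike BitArray's field.
struct FixedBits
    chunks::Vector{UInt64}
    len::Int
end

# Same bit-picking arithmetic as BitArray indexing, but through an
# immutable wrapper, so no reassignment of `f.chunks` is possible.
getbit(f::FixedBits, i::Int) =
    (f.chunks[((i-1) >> 6) + 1] & (UInt64(1) << ((i-1) & 63))) != 0
```

The mutable-length requirement of vectors (`push!`, `resize!`) is what keeps the real BitArray from being written this way.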
Looks like this got fixed somewhere along the way; I now see 0.29s for both examples in the original post.
Or, alternately, "when `@inline` is slower than manually inlining." Here's an example:

Looking at the results of `code_llvm(unsafe_inX, (T, Int))`, we emit a GC frame in all of the above functions except `unsafe_in4`. This can cause slowdowns of up to 35% on very simple functions like the above. This is true both before and after the SSA patch; I've done spot-checks on today's 61d3ece and a month-old 48aae1f with the same results.