Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make SubArrays immutable #9322

Merged
merged 1 commit into from
Dec 12, 2014
Merged

Make SubArrays immutable #9322

merged 1 commit into from
Dec 12, 2014

Conversation

simonster
Copy link
Member

Since #8867 this actually makes a difference in the optimizations LLVM is able to perform. On the benchmark from #8809, the effect is quite dramatic. The code is:

function f1(x,v)
    for j=1:length(x)
        @inbounds x[j] += v
    end
end
let
    n = 600
    tn = 10^7
    x = randn(n,3)
    v = 1.00001
    xslice = sub(x,:, 2)
    @time for i=1:tn
        f1(xslice,v)
    end
end

Before, this took ~9 seconds to execute. With this change, it takes ~2.5. However, LLVM 3.3 still won't vectorize it, even though it vectorizes with Arrays and ArrayViews. It looks like there's an extra mul in the inner loop.

Since #8867 this actually makes a difference in the optimizations LLVM
is able to perform.
@andreasnoack
Copy link
Member

👍

@timholy
Copy link
Member

timholy commented Dec 12, 2014

I remember thinking about this and deciding not to do it, but for the life of me I can't remember why. It might have been some intermediate testing thing, or something. Let's do it.

timholy added a commit that referenced this pull request Dec 12, 2014
@timholy timholy merged commit d6c6276 into master Dec 12, 2014
@timholy
Copy link
Member

timholy commented Dec 12, 2014

And it would be really, really great to figure out the vectorization thing. Do we have any diagnostic tools that will tell us why it didn't vectorize?

@simonster
Copy link
Member Author

I think that it's not vectorizing (at least on my system; from #9318 it sounds like other systems may vectorize non-unit strides) because of the multiplication by V.stride1 here. Otherwise the inner loop IR is identical to the IR for the non-vectorized branch for Arrays and ArrayViews. Perhaps we can avoid emitting that when we can determine that V.stride1 will be 1 based on the types?

@simonster simonster deleted the sjk/immutable-subarray branch December 12, 2014 14:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants