mutable objects, aliasing, and code patterns #9755

simonster · 2015-01-13T18:29:56Z

I noticed there's a lot of sparse matrix code that does things like:

for col = 1:n, p = A.colptr[col]:(A.colptr[col+1]-1)
    C.nzval[p] = A.nzval[p] * b[A.rowval[p]]
end

The compiler presently assumes that the contents of C.nzval, A.nzval, and A.rowval can alias the parts of C and A that contain the pointers to the array objects, and thus needs to reload those pointers on each loop iteration. The performance cost depends on how much is in cache, but for medium-sized sparse matrices the following version, with the field loads hoisted out of the loop by hand, can be ~30% faster on my system:

Anzval = A.nzval
Arowval = A.rowval
Cnzval = C.nzval
for col = 1:n, p = A.colptr[col]:(A.colptr[col+1]-1)
    Cnzval[p] = Anzval[p] * b[Arowval[p]]
end

If @inbounds is added, the performance difference is even larger, since the optimizer can hoist the loads of both the pointer to the array object and the pointer to the array data in the second case but not the first.
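For reference, the hand-hoisted pattern can be wrapped into a self-contained function. This is only a sketch: the name `scale_rows!` and the small test matrix are made up for illustration, and it assumes that C already has the same sparsity pattern as A (here obtained with `copy`). It shows the pattern itself, not the speedup, which will depend on the Julia version and on whether the matrix type is mutable.

```julia
using SparseArrays

# Hypothetical helper (not part of Base or SparseArrays): writes
# C[i, j] = A[i, j] * b[i] for every stored entry of A.
# Assumes C and A share the same colptr/rowval structure.
function scale_rows!(C::SparseMatrixCSC, A::SparseMatrixCSC, b::AbstractVector)
    # Load each field once, outside the loop, so the loop body only
    # touches local variables that refer to the arrays.
    Acolptr = A.colptr
    Arowval = A.rowval
    Anzval = A.nzval
    Cnzval = C.nzval
    n = size(A, 2)
    for col = 1:n, p = Acolptr[col]:(Acolptr[col+1]-1)
        Cnzval[p] = Anzval[p] * b[Arowval[p]]
    end
    return C
end

# Small illustrative input: a 3x3 matrix with five stored entries.
A = sparse([1, 2, 3, 1, 3], [1, 1, 2, 3, 3], [1.0, 2.0, 3.0, 4.0, 5.0], 3, 3)
b = [10.0, 20.0, 30.0]
C = copy(A)            # same sparsity pattern as A, independent storage
scale_rows!(C, A, b)
```

The result can be checked against the dense computation `b .* Matrix(A)`, which scales row i of A by b[i].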

Since #8867, I suspect the performance gap would disappear if SparseMatrixCSC were made immutable. However, there is code that relies on its mutability, and code patterns like this are probably also present outside of Base. Thus, I wonder whether we could place some restrictions on when array pointers and mutable objects can alias each other, so that extracting variables isn't necessary to achieve optimal performance.

I wonder whether it ever happens that arrays alias objects. This is possible using pointer_to_array(convert(Ptr{T}, pointer_from_objref(x)), ...) but probably isn't common. It would also be sufficient to tell TBAA that arrays never alias portions of objects that contain pointers.


ArchRobison · 2015-01-14T17:36:35Z

I also find the need for hand-hoisting irksome. I ran into something like this for my own types in the Al Zimmerman contest. The optimization good of the many should outweigh the abusive type-punning needs of the few.

I saw this issue decades ago in C++. The "2nd-level indirect" problem was common in scientific C++ codes that used user-defined array objects. KAI C++ had a rule that a pointer load never had a flow dependency on a floating-point store. We still allowed anti-dependencies and output dependencies, since C++ has unchecked union types. I don't remember a user ever reporting a problem with our scheme. We occasionally had users complain about a similar rule for int/float (even though in the reported contexts it was not standard conforming, and there was a standard-conforming way that worked.)

ArchRobison · 2015-01-14T18:59:45Z

Going down the TBAA route, we could have two disjoint subsets of tbaa_user, like this (comment indentation indicates inclusion relationship):

static MDNode* tbaa_user;                  // User data that is mutable
static MDNode* tbaa_user_ptr_free;         //     User data with no pointers
static MDNode* tbaa_user_ptr_tainted;      //     User data with pointers in it

The aliasing rule would be that objects with pointers cannot alias objects that are pointer-free.

The tainted subset could be further subdivided according to where the pointers are laid out, but that seems like a lesser gain for significantly more work.

ViralBShah · 2015-01-14T19:06:47Z

@simonster I always try to write sparse matrix code in the pattern you suggested above.

I am guessing that we can make SparseMatrixCSC immutable, since only the values in the arrays it contains will change in operations like setindex! and the various in-place operations.

ViralBShah · 2015-01-14T19:13:31Z

The real challenge would be setindex! cases that cause nzval and rowval to grow.

simonster · 2015-01-14T22:42:14Z

Growing nzval and rowval is not a problem with an immutable SparseMatrixCSC as long as it is done with push!, append!, or resize! rather than by constructing an entirely new array. However, I still think it's worthwhile to consider putting some restrictions on aliasing here, since this affects user code as well and sometimes immutability is not feasible.
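To make the distinction concrete, here is a minimal sketch using a made-up immutable type, `CSCLike`, as a stand-in for an immutable SparseMatrixCSC (the type and the helper `rebinding_fails` are hypothetical, for illustration only). The arrays held by an immutable struct can still be grown in place, because the struct's fields keep pointing at the same array objects; only rebinding a field to a new array is rejected.

```julia
# Hypothetical immutable stand-in for a CSC matrix (illustration only).
struct CSCLike
    colptr::Vector{Int}
    rowval::Vector{Int}
    nzval::Vector{Float64}
end

s = CSCLike([1, 2], [1], [1.0])
original_nzval = s.nzval      # keep a reference to check identity later

# Growing in place is fine: the fields still refer to the same arrays.
push!(s.rowval, 2)
push!(s.nzval, 2.0)
append!(s.nzval, [3.0, 4.0])
resize!(s.nzval, 3)           # shrinks back to three stored values

# Rebinding a field to an entirely new array is not allowed on an
# immutable struct; this helper reports whether the attempt throws.
function rebinding_fails(x)
    try
        setfield!(x, :nzval, Float64[])
        return false
    catch
        return true
    end
end

rebind_rejected = rebinding_fails(s)
```

Code that builds a fresh array and assigns it to a field would need to be rewritten in terms of `resize!` and `copyto!` (or similar) under this design.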

simonster added the performance Must go faster label Jan 13, 2015

simonster mentioned this issue Mar 17, 2015

fixes #8416 (and duplicated issue #9664) #10106

Merged

tkelman mentioned this issue May 3, 2015

Improve performance of length on UTF8String and UTF16String #11107

Merged

simonster mentioned this issue May 22, 2015

Speed up scalar BitArray indexing by ~25% #11403

Closed

tkelman mentioned this issue Jul 29, 2015

Improve IOBuffer read performance #12364

Merged

tkelman mentioned this issue Oct 29, 2015

Tuple type member matrix vs. simple matrix performance difference #13816

Closed

simonster mentioned this issue May 14, 2016

SparseMatrixCSC should probably be an immutable #15668

Closed

KristofferC mentioned this issue May 19, 2017

SharedArray performance overhead #21957

Open

brenhinkeller added the sparse Sparse arrays label Nov 19, 2022

willow-ahrens mentioned this issue May 5, 2023

Consider loop invariant code motion on getfields finch-tensor/Finch.jl#188

Closed
