Slow vcat for Sparse Matrices #7926

Closed
lruthotto opened this issue Aug 8, 2014 · 22 comments · Fixed by #10206
Labels: linear algebra (Linear algebra), performance (Must go faster), sparse (Sparse arrays)

@lruthotto

I found a performance issue when vertically concatenating sparse matrices. Here is a minimal example:

N = 1000000;
A = sprandn(N,N,1e-5);
Z = spzeros(N,N);
@time B = [Z;A];
elapsed time: 1.359868004 seconds (915883824 bytes allocated, 31.18% gc time)

This should actually be a trivial operation. Note that it is equivalent to:

@time Bt = copy(A); Bt.m += N; Bt.rowval += N;
elapsed time: 0.073445038 seconds (167965824 bytes allocated)

Does anyone have an elegant fix for this? I think a speedup here would be of great interest to many people working with numerical PDEs, etc.
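The equivalence above comes straight from the compressed sparse column (CSC) layout: prepending N all-zero rows leaves the column pointers and stored values untouched and only shifts each stored row index by N. Here is a pure-Python sketch of that idea (the class and field names are illustrative, not Julia's actual SparseMatrixCSC internals, and indices are 0-based):

```python
# Illustrative CSC container; colptr/rowval/nzval mirror the usual CSC fields.
class CSC:
    def __init__(self, m, n, colptr, rowval, nzval):
        self.m, self.n = m, n    # matrix dimensions
        self.colptr = colptr     # column j's entries live at rowval[colptr[j]:colptr[j+1]]
        self.rowval = rowval     # row index of each stored entry
        self.nzval = nzval       # value of each stored entry

def vcat_zeros_above(zrows, A):
    """Stack `zrows` all-zero rows on top of A: an O(nnz) row-index shift.
    colptr and nzval are copied unchanged."""
    return CSC(A.m + zrows, A.n,
               list(A.colptr),
               [r + zrows for r in A.rowval],
               list(A.nzval))

# 2x2 matrix with entries A[0,0] = 1.0 and A[1,1] = 2.0
A = CSC(2, 2, [0, 1, 2], [0, 1], [1.0, 2.0])
B = vcat_zeros_above(2, A)
print(B.m, B.rowval)  # → 4 [2, 3]
```

This is exactly what the `Bt.m += N; Bt.rowval += N` trick does in place, which is why the special case is so cheap.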

@ViralBShah (Member)

This optimization on its own would be too special-cased to add, but perhaps we can do something slightly more general that improves performance in such cases.

Cc @tanmaykm

@tkelman (Contributor) commented Aug 9, 2014

It's a little slower than your version that modifies the data fields directly, but I like blkdiag(spzeros(N,0), A) for this kind of thing (part of why adding blkdiag was just about the first thing I did with Julia).

What's even more worrying is the fact that == on sparse matrices is going to the generic implementation in abstractarray.jl that does elementwise getindex. Something like a short-circuiting version of sparse subtraction would be much faster.
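To make the `==` concern concrete, here is a hedged pure-Python sketch (not Julia's actual implementation) of what a short-circuiting sparse equality could look like: compare dimensions, then the column structure, then the stored entries directly, never calling elementwise getindex. It assumes both matrices are in canonical CSC form (sorted row indices, no stored zeros):

```python
def sparse_eq(a, b):
    """Short-circuiting equality for CSC matrices.
    a, b are (m, n, colptr, rowval, nzval) tuples; O(nnz) instead of O(m*n)."""
    am, an, acp, arv, anz = a
    bm, bn, bcp, brv, bnz = b
    if (am, an) != (bm, bn):   # different shapes: unequal, stop immediately
        return False
    if acp != bcp:             # different per-column entry counts: stop early
        return False
    return arv == brv and anz == bnz  # compare stored structure and values

A = (2, 2, [0, 1, 2], [0, 1], [1.0, 2.0])
B = (2, 2, [0, 1, 2], [0, 1], [1.0, 2.0])
C = (2, 2, [0, 1, 2], [0, 1], [1.0, 3.0])
print(sparse_eq(A, B), sparse_eq(A, C))  # → True False
```

For the million-by-million matrices in this issue, the generic abstractarray fallback would touch 10^12 elements, while a structural comparison like this touches only the ~10^7 stored entries.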

@ViralBShah (Member)

Good point, we should fix ==.

@lruthotto (Author)

Indeed, blkdiag seems to be more efficient. I will have a look at its implementation to learn how it differs from vcat.

Speaking of comparisons: element-wise operators such as .==, .<=, and .>= also fall back to abstractarray.jl and return dense matrices.

@ViralBShah (Member)

We should probably open a separate issue for all the operators that need better implementations. It is probably just a matter of adding them to the right code-generation macro.

@ViralBShah (Member)

Dot-wise comparison operations between sparse matrices and scalars will probably have to return dense matrices anyway, but we need efficient implementations rather than falling back to the abstractarray ones.

@tkelman (Contributor) commented Aug 9, 2014

Generally, elementwise comparisons whose predicate holds at zero should return dense — keep in mind the implicit zeros. (Or, as @StefanKarpinski would say, they would be a good use case for sparse-with-nonzero-default.)

blkdiag is almost identical to hcat. Since we're using compressed sparse column (CSC) storage by default (there may soon be alternatives), it's very simple to concatenate column-wise but difficult to concatenate row-wise. It might be worth looking into improvements in vcat for the case where a large number of successive columns of one input or the other are empty. Non-copying slices should also help significantly; from profiling, most of the time is spent on these 2 lines.
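The asymmetry between hcat and vcat under CSC can be sketched in a few lines of pure Python (again illustrative, not Julia's code): appending B's columns after A's means rowval and nzval are plain array concatenations, and only B's column pointers need an offset by A's nnz. vcat, by contrast, must interleave entries from both inputs within every column, which is why it is so much harder to make fast.

```python
def csc_hcat(a, b):
    """Horizontal concatenation [a b] in CSC form.
    a, b are (m, n, colptr, rowval, nzval) tuples with 0-based indices."""
    am, an, acp, arv, anz = a
    bm, bn, bcp, brv, bnz = b
    assert am == bm, "row counts must match"
    nnza = acp[-1]                               # number of stored entries in a
    colptr = acp + [p + nnza for p in bcp[1:]]   # shift b's pointers past a's entries
    return (am, an + bn, colptr, arv + brv, anz + bnz)

# 2x1 blocks: a has a[0,0] = 1.0, b has b[1,0] = 2.0
a = (2, 1, [0, 1], [0], [1.0])
b = (2, 1, [0, 1], [1], [2.0])
m, n, colptr, rowval, nzval = csc_hcat(a, b)
print(n, colptr, rowval)  # → 2 [0, 1, 2] [0, 1]
```

blkdiag works the same way, with the additional step of shifting b's row indices down by a's row count, which is why blkdiag(spzeros(N,0), A) recovers most of the fast path for this issue's example.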

@IainNZ (Member) commented Aug 9, 2014

+1 for a meta-issue that has a checklist for all operations you'd reasonably want to do on sparse matrices so we can get them all.

@tkelman (Contributor) commented Aug 9, 2014

Especially if/when we add additional sparse types (CSR, COO, block versions thereof?), we'll start wanting convenient and efficient operations between different types of sparse matrices. Doing it all cleanly might require some higher-level pondering of how best to design the system to avoid a haphazard explosion in the number of different operations we need to write. (The same goes for the whole family of dense array types too, tbh; I'm not saying I have a solution, but it's something to start thinking about in case anybody else does.)

@IainNZ (Member) commented Aug 9, 2014

Indeed. I strongly feel that work on those new sparse matrix types should start in a package before being PRed against julia; it's going to take some tire-kicking.

@StefanKarpinski (Member)

I think we should seriously consider putting all sparse support in a package so that it can get the focused attention that it deserves (and needs).

@tkelman (Contributor) commented Aug 9, 2014

The same could be said of Diagonal, Bidiagonal, Tridiagonal, Triangular, Banded, Symmetric, Hermitian, etc. Once default (and precompiled) packages exist, sure, sparse support should move out of Base along with FFTW, possibly GMP and MPFR, and maybe even PCRE if that would be remotely possible.

What Base will eventually need to provide is a higher-level framework for making it easier to create new array types and have them interoperate with one another, in a nicer more performant way than falling back to naive generic operations or writing a combinatorial number of specialized routines. That will be hard, and is yet to be designed much less implemented.

@ViralBShah (Member)

Putting all sparse support in a package would lead to a huge amount of deprecation. It already gets the attention it needs. Also @tkelman is right that if we want to do such a thing, we should do it for many other things. New sparse formats should start out as packages, but perhaps not CSR, as that needs serious code sharing with CSC. However, packages get far less visibility than PRs to base, so serious tire kicking only happens after something is on master.

@ViralBShah (Member)

Let's move much of this discussion that is not related to this issue to the mailing list or separate issues. It is guaranteed to get lost here.

@tknopp (Contributor) commented Aug 10, 2014

#1906 and #5155 are the relevant issues regarding default packages. The point of developing a more general sparse matrix library outside Base is that it will be much easier to experiment with the code structure. Further, people cannot try things out without compiling Julia, which is a pretty high barrier.

@tkelman (Contributor) commented Aug 10, 2014

Putting even the most basic support for sparse matrices out in a package is going to be a big headache for virtually every package in JuliaOpt for example. JuMP etc already take long enough to load that it hurts the Julia pitch to people in my field, for whom those packages are the unique selling point of Julia (many kudos deserved to @mlubin @IainNZ etc for making this the case).

Get package precompilation working and solid first, then worry about shrinking base when it'll be less painful to do so.

(with apologies to Viral - the closest thing to package precompilation issue would be what, #4373 ?)

@lindahua (Contributor)

+1 for first focusing on making packages load much faster, then worrying about separating things from Base.

@tknopp (Contributor) commented Aug 12, 2014

While I agree that it would be great to first have fast package loading and then shrink Base, these things are less coupled than one might think.

One simple technical solution is to pull the default packages during make and precompile the code into the system image.

The point here really is to make things more modular. If the sparse matrix functionality is developed in Base, nobody can simply test it without compiling Julia from source. Further, within a package one does not have to add deprecations, as users could simply rely on an older version if they do not want to update to a new interface.

@ViralBShah (Member)

Turns out this has more to do with type instability than special casing. Submitting a PR soon.

ViralBShah added a commit that referenced this issue Feb 15, 2015
@ViralBShah (Member)

With the PR above, we are now significantly faster and more memory-efficient.

julia> N = 1000000;
julia> A = sprandn(N,N,1e-5);
julia> Z = spzeros(N,N);
julia> @time B = [Z;A];
elapsed time: 0.15032013 seconds (160 MB allocated, 11.98% gc time in 1 pauses with 1 full sweep)

ViralBShah added commits that referenced this issue Feb 15, 2015:

allocation in sparse vcat. Fixes #7926.
Add a test for vcat of sparse matrices of different element/index types.
@lruthotto (Author)

Amazing! Thanks @ViralBShah!

I repeated the above example on an equivalent system and got an incredible (around 10x) speedup. Hope this PR gets merged soon.

elapsed time: 0.104847899 seconds (160 MB allocated, 34.00% gc time in 2 pauses with 1 full sweep)

@ViralBShah (Member)

This is merged.
