Slow vcat for Sparse Matrices #7926

Closed
lruthotto opened this issue Aug 8, 2014 · 22 comments · Fixed by #10206
Labels: linear algebra (Linear algebra), performance (Must go faster), sparse (Sparse arrays)

@lruthotto

I found a performance issue when vertically concatenating sparse matrices. Here is a minimal example:

N = 1000000;
A = sprandn(N,N,1e-5);
Z = spzeros(N,N);
@time B = [Z;A];
elapsed time: 1.359868004 seconds (915883824 bytes allocated, 31.18% gc time)

This should actually be a trivial operation. Note that it is equivalent to:

@time Bt = copy(A); Bt.m += N; Bt.rowval += N;
elapsed time: 0.073445038 seconds (167965824 bytes allocated)

Does anyone have an elegant fix for this? I think a speedup here would be of great interest to many people working with numerical PDEs, etc.
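The equivalence above comes straight from the compressed sparse column (CSC) layout: prepending N all-zero rows leaves the column pointers and stored values untouched and only shifts each stored row index by N. Here is a pure-Python sketch of that idea (the class and field names are illustrative, not Julia's actual SparseMatrixCSC internals, and indices are 0-based):

```python
# Illustrative CSC container; colptr/rowval/nzval mirror the usual CSC fields.
class CSC:
    def __init__(self, m, n, colptr, rowval, nzval):
        self.m, self.n = m, n    # matrix dimensions
        self.colptr = colptr     # column j's entries live at rowval[colptr[j]:colptr[j+1]]
        self.rowval = rowval     # row index of each stored entry
        self.nzval = nzval       # value of each stored entry

def vcat_zeros_above(zrows, A):
    """Stack `zrows` all-zero rows on top of A: an O(nnz) row-index shift.
    colptr and nzval are copied unchanged."""
    return CSC(A.m + zrows, A.n,
               list(A.colptr),
               [r + zrows for r in A.rowval],
               list(A.nzval))

# 2x2 matrix with entries A[0,0] = 1.0 and A[1,1] = 2.0
A = CSC(2, 2, [0, 1, 2], [0, 1], [1.0, 2.0])
B = vcat_zeros_above(2, A)
print(B.m, B.rowval)  # → 4 [2, 3]
```

This is exactly what the `Bt.m += N; Bt.rowval += N` trick does in place, which is why the special case is so cheap.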

@ViralBShah (Member)

This optimization on its own would be too special-cased to add, but perhaps we can do something slightly more general that improves performance in such cases.

Cc @tanmaykm

@tkelman (Contributor) commented Aug 9, 2014

It's a little slower than your version that modifies the data fields directly, but I like blkdiag(spzeros(N,0), A) for this kind of thing (part of why adding blkdiag was just about the first thing I did with Julia).

What's even more worrying is the fact that == on sparse matrices is going to the generic implementation in abstractarray.jl that does elementwise getindex. Something like a short-circuiting version of sparse subtraction would be much faster.
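To make the `==` concern concrete, here is a hedged pure-Python sketch (not Julia's actual implementation) of what a short-circuiting sparse equality could look like: compare dimensions, then the column structure, then the stored entries directly, never calling elementwise getindex. It assumes both matrices are in canonical CSC form (sorted row indices, no stored zeros):

```python
def sparse_eq(a, b):
    """Short-circuiting equality for CSC matrices.
    a, b are (m, n, colptr, rowval, nzval) tuples; O(nnz) instead of O(m*n)."""
    am, an, acp, arv, anz = a
    bm, bn, bcp, brv, bnz = b
    if (am, an) != (bm, bn):   # different shapes: unequal, stop immediately
        return False
    if acp != bcp:             # different per-column entry counts: stop early
        return False
    return arv == brv and anz == bnz  # compare stored structure and values

A = (2, 2, [0, 1, 2], [0, 1], [1.0, 2.0])
B = (2, 2, [0, 1, 2], [0, 1], [1.0, 2.0])
C = (2, 2, [0, 1, 2], [0, 1], [1.0, 3.0])
print(sparse_eq(A, B), sparse_eq(A, C))  # → True False
```

For the million-by-million matrices in this issue, the generic abstractarray fallback would touch 10^12 elements, while a structural comparison like this touches only the ~10^7 stored entries.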

@ViralBShah (Member)

Good point, we should fix ==.

@lruthotto (Author)

Indeed, blkdiag seems to be more efficient. I will have a look at its implementation to learn how it differs from vcat.

Speaking of comparisons: element-wise operators such as .==, .<=, and .>= also fall back to abstractarray.jl and return dense matrices.

@ViralBShah (Member)

We should probably open a separate issue for all the operators that need better implementations. It is probably just a matter of adding them to the right code-generation macro.

@ViralBShah (Member)

Dot-wise comparison operations between sparse matrices and scalars will probably have to return dense matrices anyway, but we need efficient implementations rather than falling back to the abstractarray ones.

@tkelman (Contributor) commented Aug 9, 2014

Generally, elementwise comparisons whose predicate holds at zero should return dense — keep in mind the implicit zeros. (Or, as @StefanKarpinski would say, they would be a good use case for sparse-with-nonzero-default.)

blkdiag is almost identical to hcat. Since we're using compressed sparse column (CSC) storage by default (there may soon be alternatives), it's very simple to concatenate column-wise but difficult to concatenate row-wise. It might be worth looking into improvements in vcat for the case where a large number of successive columns of one input or the other are empty. Non-copying slices should also help significantly; from profiling, most of the time is spent on these 2 lines.
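The asymmetry between hcat and vcat under CSC can be sketched in a few lines of pure Python (again illustrative, not Julia's code): appending B's columns after A's means rowval and nzval are plain array concatenations, and only B's column pointers need an offset by A's nnz. vcat, by contrast, must interleave entries from both inputs within every column, which is why it is so much harder to make fast.

```python
def csc_hcat(a, b):
    """Horizontal concatenation [a b] in CSC form.
    a, b are (m, n, colptr, rowval, nzval) tuples with 0-based indices."""
    am, an, acp, arv, anz = a
    bm, bn, bcp, brv, bnz = b
    assert am == bm, "row counts must match"
    nnza = acp[-1]                               # number of stored entries in a
    colptr = acp + [p + nnza for p in bcp[1:]]   # shift b's pointers past a's entries
    return (am, an + bn, colptr, arv + brv, anz + bnz)

# 2x1 blocks: a has a[0,0] = 1.0, b has b[1,0] = 2.0
a = (2, 1, [0, 1], [0], [1.0])
b = (2, 1, [0, 1], [1], [2.0])
m, n, colptr, rowval, nzval = csc_hcat(a, b)
print(n, colptr, rowval)  # → 2 [0, 1, 2] [0, 1]
```

blkdiag works the same way, with the additional step of shifting b's row indices down by a's row count, which is why blkdiag(spzeros(N,0), A) recovers most of the fast path for this issue's example.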

@IainNZ (Member) commented Aug 9, 2014

+1 for a meta-issue that has a checklist for all operations you'd reasonably want to do on sparse matrices so we can get them all.

@tkelman (Contributor) commented Aug 9, 2014

Especially if/when we add additional sparse types (CSR, COO, block versions thereof?), we'll start wanting convenient and efficient operations between different types of sparse matrices. Doing it all cleanly might require some higher-level pondering of how best to design the system to avoid a haphazard explosion in the number of different operations we need to write. (The same goes for the whole family of dense array types too, tbh; I'm not saying I have a solution, but it's something to start thinking about in case anybody else does.)

@IainNZ (Member) commented Aug 9, 2014

Indeed. I strongly feel that work on those new sparse matrix types should start in a package before being PRed against julia; it's going to take some tire-kicking.

@StefanKarpinski (Member)

I think we should seriously consider putting all sparse support in a package so that it can get the focused attention that it deserves (and needs).

@tkelman (Contributor) commented Aug 9, 2014

The same could be said of Diagonal, Bidiagonal, Tridiagonal, Triangular, Banded, Symmetric, Hermitian, etc. Once default (and precompiled) packages exist, sure, sparse support should move out of Base along with FFTW, possibly GMP and MPFR, and maybe even PCRE if that would be remotely possible.

What Base will eventually need to provide is a higher-level framework for making it easier to create new array types and have them interoperate with one another, in a nicer more performant way than falling back to naive generic operations or writing a combinatorial number of specialized routines. That will be hard, and is yet to be designed much less implemented.

@ViralBShah (Member)

Putting all sparse support in a package would lead to a huge amount of deprecation. It already gets the attention it needs. Also @tkelman is right that if we want to do such a thing, we should do it for many other things. New sparse formats should start out as packages, but perhaps not CSR, as that needs serious code sharing with CSC. However, packages get far less visibility than PRs to base, so serious tire kicking only happens after something is on master.

@ViralBShah (Member)

Let's move much of this discussion that is not related to this issue to the mailing list or separate issues. It is guaranteed to get lost here.

@tknopp (Contributor) commented Aug 10, 2014

#1906 and #5155 are the relevant issues regarding default packages. The point of developing a more general sparse matrix library outside Base is that it will be much easier to experiment with the code structure. Further, people cannot try things out without compiling Julia, which is a pretty high barrier.

@tkelman (Contributor) commented Aug 10, 2014

Putting even the most basic support for sparse matrices out in a package is going to be a big headache for virtually every package in JuliaOpt for example. JuMP etc already take long enough to load that it hurts the Julia pitch to people in my field, for whom those packages are the unique selling point of Julia (many kudos deserved to @mlubin @IainNZ etc for making this the case).

Get package precompilation working and solid first, then worry about shrinking base when it'll be less painful to do so.

(with apologies to Viral - the closest thing to package precompilation issue would be what, #4373 ?)

@lindahua (Contributor)

+1 for first focusing on making packages load much faster, then worrying about separating things from Base.

@tknopp (Contributor) commented Aug 12, 2014

While I agree that it would be great to first have fast package loading and then shrink Base, these things are less coupled than one might think.

One simple technical solution is to pull the default packages during make and precompile the code into the system image.

The point here really is to make things more modular. If the sparse matrix functionality is developed in Base, nobody can simply test it without compiling Julia from source. Further, within a package one does not have to add deprecations, as users could simply rely on an older version if they do not want to update to a new interface.

@ViralBShah (Member)

Turns out this has more to do with type instability than special casing. Submitting a PR soon.

ViralBShah added a commit that referenced this issue Feb 15, 2015
@ViralBShah (Member)

With the PR above, we are now significantly faster and more memory-efficient.

julia> N = 1000000;
julia> A = sprandn(N,N,1e-5);
julia> Z = spzeros(N,N);
julia> @time B = [Z;A];
elapsed time: 0.15032013 seconds (160 MB allocated, 11.98% gc time in 1 pauses with 1 full sweep)

ViralBShah added commits that referenced this issue Feb 15, 2015:

allocation in sparse vcat. Fixes #7926.
Add a test for vcat of sparse matrices of different element/index types.
@lruthotto (Author)

Amazing! Thanks @ViralBShah!

I repeated the above example on an equivalent system and got an incredible (around 10x) speedup. Hope this PR gets merged soon.

elapsed time: 0.104847899 seconds (160 MB allocated, 34.00% gc time in 2 pauses with 1 full sweep)

@ViralBShah (Member)

This is merged.
