Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MIT-licensed SparseMatrixCSC transposition methods #14631

Merged
merged 1 commit into from
Jan 13, 2016

Conversation

Sacha0
Copy link
Member

@Sacha0 Sacha0 commented Jan 11, 2016

Followup to #13001. This pull request replaces the LGPL-licensed SparseMatrixCSC transposition methods in base/sparse/csparse.jl ([c|f]transpose[!]) with MIT-licensed versions.

Specifically, these methods implement the HALFPERM algorithm described in F. Gustavson, "Two fast algorithms for sparse matrices: multiplication and permuted transposition," ACM TOMS 4(3), 250-269 (1978). The algorithm runs in O(A.m, A.n, nnz(A)) time and requires no space beyond that passed in. If you know of a faster-in-practice algorithm, please let me know. The new methods' performance is nearly identical to the old methods' performance; the new methods might have a slight edge.

When editing base/sparse/csparse.jl to remove the existing methods, I folded all blocks to avoid peaking at the LGPL code. So someone should check that the removal was graceful given it was blind but for method signatures.

@dmbates (or anyone interested) From what I see when I grep base/sparse/csparse.jl for long-form method signatures, I'm guessing that triu! and tril! are short children of fkeep!; is that so? Are there short-form children of fkeep! that I'm not seeing? (I've avoided grepping for short-form methods in case doing so reveals some implementation detail that I should not see.) Additionally, I'm guessing that f takes an element (and its position?) and determines whether that element should be kept; what is the implicit signature of f? What is the purpose and form of other? I might pick off the preceding set of methods next. Concerning sparse, I see multiple possible space/time complexity tradeoffs. How much storage does the existing sparse method allocate, and does it achieve O(A.m, A.n, nnz(A)) time or does it trade time to save space? Thanks, and best!

@tkelman tkelman added the sparse Sparse arrays label Jan 11, 2016
@ViralBShah
Copy link
Member

Could you provide some benchmarks just to ensure no performance regressions?

Cc @jrevels

@ViralBShah
Copy link
Member

This is great to have. Looking forward!

@Sacha0
Copy link
Member Author

Sacha0 commented Jan 11, 2016

Could you provide some benchmarks just to ensure no performance regressions?

Gladly! Something like...

using Benchmarks: @benchmark, SummaryStatistics

function prettytimes(bench)
    stats = SummaryStatistics(bench)
    timecenter = stats.elapsed_time_center
    timelower = get(stats.elapsed_time_lower)
    timeupper = get(stats.elapsed_time_upper)
    # based on Benchmarks.pretty_time_string
    timecenter < 1_000.0 ? (scalefactor = 1.0; units = "ns") :
        timecenter < 1_000_000.0 ? (scalefactor = 1_000.0; units = "μs") :
            timecenter < 1_000_000_000.0 ? (scalefactor = 1_000_000.0; units = "ms") :
                (scalefactor = 1_000_000_000.0; units = " s")
    @sprintf("%6.2f %s [%6.2f,%6.2f]", timecenter/scalefactor, units, timelower/scalefactor, timeupper/scalefactor)
end

smallN, smallerN = 600, 400;
smallsqrA = sprand(smallN, smallN, 0.01);
smallrectA = sprand(smallN, smallerN, 0.01);
smallsqrC = transpose(smallsqrA);
smallrectC = transpose(smallrectA);

largeN, largerN = 100000, 200000;
largesqrA = sprand(largeN, largeN, 0.001);
largerectA = sprand(largeN, largerN, 0.001);
largesqrC = transpose(largesqrA);
largerectC = transpose(largerectA);

wm, wt = 12, 25;
println(" $(lpad("method ", wm)) $(lpad("small square A", wt)), $(lpad("small rect A", wt)) | $(lpad("large square A", wt)), $(lpad("large rect A", wt)) ")
@printf("%s: %s , %s | %s, %s\n", lpad("transpose!", wm),
    prettytimes(@benchmark transpose!(smallsqrC, smallsqrA)),
    prettytimes(@benchmark transpose!(smallrectC, smallrectA)),
    prettytimes(@benchmark transpose!(largesqrC, largesqrA)),
    prettytimes(@benchmark transpose!(largerectC, largerectA)) )
@printf("%s: %s , %s | %s, %s\n", lpad("ctranspose!", wm),
    prettytimes(@benchmark ctranspose!(smallsqrC, smallsqrA)),
    prettytimes(@benchmark ctranspose!(smallrectC, smallrectA)),
    prettytimes(@benchmark ctranspose!(largesqrC, largesqrA)),
    prettytimes(@benchmark ctranspose!(largerectC, largerectA)) )
@printf("%s: %s , %s | %s, %s\n", lpad("transpose", wm),
    prettytimes(@benchmark transpose(smallsqrA)),
    prettytimes(@benchmark transpose(smallrectA)),
    prettytimes(@benchmark transpose(largesqrA)),
    prettytimes(@benchmark transpose(largerectA)) )
@printf("%s: %s , %s | %s, %s\n", lpad("ctranspose", wm),
    prettytimes(@benchmark ctranspose(smallsqrA)),
    prettytimes(@benchmark ctranspose(smallrectA)),
    prettytimes(@benchmark ctranspose(largesqrA)),
    prettytimes(@benchmark ctranspose(largerectA)) )

On master:

     method             small square A,              small rect A |            large square A,              large rect A
 transpose!:  24.18 μs [ 21.62, 26.75] ,  20.14 μs [ 19.75, 20.52] | 472.69 ms [466.36,479.02],   1.02  s [  1.00,  1.04]
ctranspose!:  24.26 μs [ 23.80, 24.71] ,  20.81 μs [ 20.43, 21.19] | 479.05 ms [469.00,489.11],   1.02  s [  0.99,  1.04]
  transpose:  42.98 μs [ 41.80, 44.15] ,  27.63 μs [ 26.82, 28.44] | 518.65 ms [501.76,535.54],   1.19  s [  1.17,  1.20]
 ctranspose:  42.76 μs [ 41.46, 44.07] ,  28.11 μs [ 27.27, 28.95] | 529.80 ms [505.16,554.45],   1.19  s [  1.17,  1.21]

On this PR's branch:

     method             small square A,              small rect A |            large square A,              large rect A
 transpose!:  22.42 μs [ 21.84, 23.00] ,  15.99 μs [ 15.63, 16.34] | 460.70 ms [450.73,470.68],   1.01  s [  0.98,  1.03]
ctranspose!:  22.43 μs [ 21.88, 22.98] ,  15.93 μs [ 15.61, 16.25] | 468.47 ms [462.66,474.28],   1.04  s [  1.00,  1.08]
  transpose:  41.57 μs [ 40.51, 42.63] ,  28.04 μs [ 27.33, 28.75] | 500.71 ms [492.20,509.22],   1.18  s [  1.17,  1.20]
 ctranspose:  40.19 μs [ 39.04, 41.33] ,  28.99 μs [ 28.29, 29.69] | 509.75 ms [500.48,519.03],   1.19  s [  1.16,  1.22]

? Thanks!

…se/sparse/csparse.jl ([c|f]transpose[!]) with MIT-licensed versions. See JuliaLang#13001.
@Sacha0
Copy link
Member Author

Sacha0 commented Jan 11, 2016

Thanks for the line-length feedback @tkelman! Sorry for somehow nixing the related comments. PR updated accordingly.

@jrevels jrevels added the potential benchmark Could make a good benchmark in BaseBenchmarks label Jan 11, 2016
@dmbates
Copy link
Member

dmbates commented Jan 12, 2016

@Sacha0 The signature of f in a call to fkeep! is

f{Tv,Ti}(i::Ti, j::Ti, x::Tv, other::Any)  Bool

The purpose of the other argument is to pass information from the call to fkeep! through to the call to f. As you suspect triu! and tril! are short children of fkeep!. In those cases the other argument is the diagonal below which or above which to zero the contents. It defaults to zero, indicating the main diagonal.

Another use of the other argument is in the droptol! function where it specifies the tolerance on the absolute value of the non-zeros below which they are dropped.

For details on the sparse function I think @ViralBShah is a better source than I.

@Sacha0
Copy link
Member Author

Sacha0 commented Jan 13, 2016

Thanks @dmbates! I will have a go at those methods (fkeep!/triu!/tril!/droptol!) next.

@ViralBShah
Copy link
Member

The sparse algorithm should work in O(max(m,n,nnz(A))) time, IIRC. It is worth exploring different methods that may take more space and are significantly faster than what we have now.

ViralBShah added a commit that referenced this pull request Jan 13, 2016
MIT-licensed SparseMatrixCSC transposition methods
@ViralBShah ViralBShah merged commit a4d1f99 into JuliaLang:master Jan 13, 2016
@Sacha0
Copy link
Member Author

Sacha0 commented Jan 13, 2016

Thanks for the review / merge!

@Sacha0 Sacha0 deleted the csctranspose branch January 13, 2016 23:51
@KristofferC
Copy link
Sponsor Member

Would be nice to add some of these benchmarks to BaseBenchmarks.jl

Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 17, 2016
…tril!, triu!, droptol!, and dropzeros[!] with MIT-licensed versions. See JuliaLang#13001 and JuliaLang#14631.
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 17, 2016
…tril!, triu!, droptol!, and dropzeros[!] with MIT-licensed versions. See JuliaLang#13001 and JuliaLang#14631. Also add a test for dropzeros!.
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 26, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 26, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 26, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 26, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
…tril!, triu!, droptol!, and dropzeros[!] with MIT-licensed versions. See JuliaLang#13001 and JuliaLang#14631. Also add a test for dropzeros!.
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
…tril!, triu!, droptol!, and dropzeros[!] with MIT-licensed versions. See JuliaLang#13001 and JuliaLang#14631. Also add a test for dropzeros!.
jrevels added a commit to JuliaCI/BaseBenchmarks.jl that referenced this pull request Jan 27, 2016
@jrevels jrevels removed the potential benchmark Could make a good benchmark in BaseBenchmarks label Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 27, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 28, 2016
…tril!, triu!, droptol!, and dropzeros[!] with MIT-licensed versions. See JuliaLang#13001 and JuliaLang#14631. Also add a test for dropzeros!.
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 28, 2016
…tril!, triu!, droptol!, and dropzeros[!] with MIT-licensed versions. See JuliaLang#13001 and JuliaLang#14631. Also add a test for dropzeros!.
Sacha0 added a commit to Sacha0/julia that referenced this pull request Jan 28, 2016
…tril!, triu!, droptol!, and dropzeros[!] with MIT-licensed versions. See JuliaLang#13001 and JuliaLang#14631. Also add a test for dropzeros!.
Sacha0 added a commit to Sacha0/julia that referenced this pull request Feb 24, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Feb 24, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Feb 24, 2016
…tril!, triu!, droptol!, and dropzeros[!] with MIT-licensed versions. See JuliaLang#13001 and JuliaLang#14631. Also add a test for dropzeros!.
Sacha0 added a commit to Sacha0/julia that referenced this pull request Feb 24, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Feb 24, 2016
…tril!, triu!, droptol!, and dropzeros[!] with MIT-licensed versions. See JuliaLang#13001 and JuliaLang#14631. Also add a test for dropzeros!.
Sacha0 added a commit to Sacha0/julia that referenced this pull request Feb 25, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Feb 25, 2016
Sacha0 added a commit to Sacha0/julia that referenced this pull request Feb 26, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sparse Sparse arrays
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants