Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved the dot product between two vectors and a sparse matrix #410

Merged
merged 6 commits into from
Sep 9, 2023
Merged

Improved the dot product between two vectors and a sparse matrix #410

merged 6 commits into from
Sep 9, 2023

Conversation

albertomercurio
Copy link
Contributor

The changed function is a little bit faster than the current one.

N1 = 1024*10
N2 = 1024*20
T = Float64

x = Array(sprand(T, N1, 0.1))
y = Array(sprand(T, N2, 0.1))
A = sprand(T, N1, N2, 1/N1)

@benchmark dot($x, $A, $y)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  45.800 μs  287.700 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     57.800 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   58.845 μs ±  10.107 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

               ▁█                                               
  ▁▁▂▃▄▅▅▃▃▄▃▃▄██▆▄▃▂▃▂▂▂▂▁▁▂▁▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  45.8 μs         Histogram: frequency by time         96.6 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.



@benchmark mydot($x, $A, $y)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  37.200 μs  896.900 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     50.900 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   49.762 μs ±  14.435 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▂▁▃             ▆█▃                                          
  ▆████▅▄▄▃▃▃▂▂▃▃▅▄███▇▅▃▃▂▂▂▂▂▂▂▂▁▁▁▁▁▁▂▂▂▂▁▂▂▁▁▂▂▂▁▁▁▁▁▁▁▁▁▁ ▂
  37.2 μs         Histogram: frequency by time         84.9 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

And with vectors without zeros:

x = rand(T, N1)
y = rand(T, N2)
A = sprand(T, N1, N2, 1/N1)

@benchmark dot($x, $A, $y)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  166.900 μs   1.750 ms  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     214.800 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   224.317 μs ± 37.141 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                 ▆█                                             
  ▁▁▁▁▁▁▂▂▂▂▆▅▇▅▅███▆▆▄▆█▆▄▃▂▂▃▃▂▂▂▂▂▂▂▂▂▁▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  167 μs          Histogram: frequency by time          331 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.


@benchmark mydot($x, $A, $y)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  146.700 μs  991.600 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     178.900 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   184.942 μs ±  25.418 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

       ▁   ▃▃▅██▆▃▁ ▁▂                                           
  ▂▁▂▄▅█▄▃▃███████████▇▆█▅█▄▄▃▃▂▂▂▂▂▂▂▁▁▁▁▁▂▂▂▂▂▁▁▁▁▂▂▂▁▁▁▁▁▁▁▁ ▃
  147 μs           Histogram: frequency by time          277 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

@albertomercurio
Copy link
Contributor Author

For the moment I left the check iszero for the elements of the y vector. However, this slows down a lot the calculation. I think that, if the user knows that the vector contains several zeros, he/she can convert it to a sparsevector and execute the specific dot product for this type, leaving the maximum performances for the operations with really dense vectors.

Copy link
Member

@dkarrasch dkarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. A few comments.

src/linalg.jl Outdated Show resolved Hide resolved
src/linalg.jl Outdated Show resolved Hide resolved
src/linalg.jl Outdated Show resolved Hide resolved
src/linalg.jl Outdated Show resolved Hide resolved
@dkarrasch
Copy link
Member

I think that, if the user knows that the vector contains several zeros, he/she can convert it to a sparsevector and execute the specific dot product for this type,

Good point. Let's assume that for AbstractVectors generic elements are not zero, and skip that test. The main optimization comes from the known sparsity of A.

Copy link
Member

@dkarrasch dkarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be interesting to see the timings now without the zero-test.

src/linalg.jl Outdated Show resolved Hide resolved
Co-authored-by: Daniel Karrasch <daniel.karrasch@posteo.de>
@albertomercurio
Copy link
Contributor Author

N1 = 1024*20
N2 = 1024*40
T = Float64
x = rand(T, N1)
y = rand(T, N2)


@benchmark dot($x, $A, $y)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  261.300 μs   1.293 ms  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     293.500 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   322.014 μs ± 70.194 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▂██▆▅▆▅▅▄▄▄▃▄▃▃▂▂▂▂▂▂▂▂▂▂▂▂▁▁ ▁▁▁▁▁  ▁▁                     ▂
  ██████████████████████████████████████████▇█████▇▇█▇▆▇▇▇▆▆▅▆ █
  261 μs        Histogram: log(frequency) by time       575 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.



@benchmark mydot($x, $A, $y)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  213.700 μs  743.500 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     233.200 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   245.754 μs ±  40.873 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▄▆▅███▇▇▅▅▄▃▃▂▁▂▂▁  ▁▁                                        ▂
  ████████████████████████████████▇▇████▇██▇▆▇▆▆▆▇▇▆▆▆▆▆▆▅▅▆▅▅▄ █
  214 μs        Histogram: log(frequency) by time        428 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

@codecov
Copy link

codecov bot commented Jul 12, 2023

Codecov Report

Merging #410 (3d62748) into main (54f4b39) will increase coverage by 0.78%.
Report is 3 commits behind head on main.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #410      +/-   ##
==========================================
+ Coverage   92.42%   93.20%   +0.78%     
==========================================
  Files          12       12              
  Lines        7666     7666              
==========================================
+ Hits         7085     7145      +60     
+ Misses        581      521      -60     
Files Changed Coverage Δ
src/linalg.jl 91.55% <100.00%> (+4.54%) ⬆️

... and 3 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

📢 Have feedback on the report? Share it here.

@albertomercurio
Copy link
Contributor Author

I think it is finished. The codecov is decreased because the amount of lines is less.

Copy link
Member

@dkarrasch dkarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last detail I noticed, but afterwards we should really finally merge this.

src/linalg.jl Outdated Show resolved Hide resolved
@dkarrasch dkarrasch merged commit c93065c into JuliaSparse:main Sep 9, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants