Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use better sin_sum for F32 #93

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

heltonmc
Copy link
Member

This fixes #90 where performance was fixed in #92.

# before
julia> Bessels.besselj0(328049.34f0)
-0.0013240778f0

# after
julia> Bessels.besselj0(328049.34f0)
-0.0013258625f0

# Float64 number
julia> Bessels.besselj0(Float64(328049.34f0))
-0.001325862383187567

This significantly improves accuracy. The naive version of course is faster..

# Master
julia> @benchmark besselj0(x) setup=(x=Float32(rand()*100 + 20.0))
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
 Range (min  max):  14.996 ns  29.645 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     16.055 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   15.950 ns ±  0.458 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                                               ▂      ▂█       
  ▂▂▂▂▂▂▂▂▂▂▂▂▃▄▂▂▂▃▃▂▃▃▄▃▂▁▁▁▁▁▁▁▁▁▁▁▂▂▃▃▃▃▃▃▃█▆▂▂▂▃▆██▆▃▅▇▄ ▃
  15 ns           Histogram: frequency by time        16.2 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.


# this PR
julia> @benchmark besselj0(x) setup=(x=Float32(rand()*100 + 20.0))
BenchmarkTools.Trial: 10000 samples with 997 evaluations.
 Range (min  max):  18.653 ns  28.444 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     19.585 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   19.705 ns ±  0.369 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                           ▆       █       ▁                   
  ▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▃▂▂▁▁▁▁▂▅█▂▁▁▁▁▁▇█▃▁▁▂▁▁▂█▄▂▁▁▁▁▁▆▆▂▂▁▁▁▁▄▇ ▃
  18.7 ns         Histogram: frequency by time        20.3 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

So about 20% slower but performance hit is necessary here as the previous result is inaccurate.

@codecov
Copy link

codecov bot commented Apr 21, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.40 🎉

Comparison is base (cec8fce) 96.09% compared to head (03bda4b) 96.49%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #93      +/-   ##
==========================================
+ Coverage   96.09%   96.49%   +0.40%     
==========================================
  Files          20       20              
  Lines        2228     2228              
==========================================
+ Hits         2141     2150       +9     
+ Misses         87       78       -9     
Impacted Files Coverage Δ
src/besselj.jl 98.31% <100.00%> (ø)

... and 1 file with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@oscardssmith
Copy link
Member

is it also slower in the range where the old one was accurate?

@heltonmc
Copy link
Member Author

Yep it is definitely slower. But I think it's a hard comparison because the old version I'm not for sure we could say was "accurate" in any range.

# old version
julia> x = 2.4f0
2.4f0
julia> 1 - besselj0(Float64(x)) / besselj0(x)
7.178780837779897e-6

# this version
julia> x = 2.4f0
2.4f0

julia> 1 - besselj0(Float64(x)) / besselj0(x)
-2.4906737694507797e-7

Some larger comparisons on accuracy...

# old version
julia> x = Float32.(rand(1000)*50);

julia> err = @. 1 - besselj0(x) / besselj0(Float64(x));

julia> maximum(err)
0.00024567309384593283

julia> median(err)
-2.4034297774200297e-8

julia> mean(err)
2.387311999333397e-7

# new version
julia> x = Float32.(rand(1000)*50);

julia> err = @. 1 - besselj0(x) / besselj0(Float64(x));

julia> maximum(err)
2.1666121468921773e-5

julia> median(err)
-1.5063105207602234e-8

julia> mean(err)
-2.003219752944041e-8

But ya it is a performance degradation.

# old
julia> @benchmark besselj0(x) setup=(x=Float32(rand() + 2.0))
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min  max):  9.000 ns  25.500 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     9.083 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   9.090 ns ±  0.252 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                         █                                    
  ▂▁▁▁▁▁▁▁▁▁▁▅▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▂ ▂
  9 ns           Histogram: frequency by time        9.21 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

# new
julia> @benchmark besselj0(x) setup=(x=Float32(rand() + 2.0))
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
 Range (min  max):  15.364 ns  37.074 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     15.489 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   15.500 ns ±  0.417 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                     ▃         █                               
  ▂▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▆▁▁▁▁▁▁▁▁▂▃▁▁▁▁▁▁▁▁▂ ▂
  15.4 ns         Histogram: frequency by time        15.6 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

@heltonmc
Copy link
Member Author

Side note: It would be really nice to have some CI setup for benchmarking all these functions. Maybe not have it run on every pull request but more on command using either PkgBenchmark.jl or the recent https://github.com/MilesCranmer/AirspeedVelocity.jl which looked interesting.

@heltonmc
Copy link
Member Author

heltonmc commented Apr 30, 2023

@oscardssmith How do we feel about merging this? I think I would prefer the accuracy improvements in this and can improve performance over time if needed. Needs a rebase.

@oscardssmith
Copy link
Member

should this be merged or is it outdated?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Large argument inaccuracy for Float32 in besselj0
2 participants