BLAS: Convert alpha & beta to more appropriate types. #129

Merged: maleadt merged 1 commit into master from tb/alpha_beta_types on Jul 4, 2023


maleadt (Member) commented Jul 3, 2023

The default values of alpha and beta are Bools, which causes messy conversions when they are applied to e.g. Float16 values:

┌ @ /home/tim/Julia/pkg/GemmKernels/src/blas.jl:71 within `#9`
│ %6097  = Core.getfield(%6096, :alpha)::Bool
│┌ @ bool.jl:182 within `*` @ bool.jl:180
││┌ @ number.jl:221 within `copysign`
│││┌ @ floatfuncs.jl:17 within `signbit`
││││ %6098  = Base.bitcast(Base.Int16, %6063)::Int16
││││ @ floatfuncs.jl:17 within `signbit` @ int.jl:139
││││┌ @ promotion.jl:450 within `<`
│││││┌ @ promotion.jl:381 within `promote`
││││││┌ @ promotion.jl:358 within `_promote`
│││││││┌ @ number.jl:7 within `convert`
││││││││┌ @ boot.jl:784 within `Int64`
│││││││││┌ @ boot.jl:702 within `toInt64`
││││││││││ %6099  = Core.sext_int(Core.Int64, %6098)::Int64
│││││└└└└└
│││││ @ promotion.jl:450 within `<` @ int.jl:83
│││││ %6100  = Base.slt_int(%6099, 0)::Bool
│││└└
│││┌ @ operators.jl:269 within `!=`
││││┌ @ promotion.jl:499 within `==`
│││││ %6101  = (false === %6100)::Bool
││││└
││││┌ @ bool.jl:35 within `!`
│││││ %6102  = Base.not_int(%6101)::Bool
│││└└
│││┌ @ essentials.jl:575 within `ifelse`
││││ %6103  = Core.ifelse(%6102, Float16(-0.0), Float16(0.0))::Float16
││└└
││┌ @ essentials.jl:575 within `ifelse`
│││ %6104  = Core.ifelse(%6097, %6063, %6103)::Float16
│└└

vs simply

┌ @ /home/tim/Julia/pkg/GemmKernels/src/blas.jl:71 within `#13`
│ %6097  = Core.getfield(%6096, :alpha)::Float16
│┌ @ float.jl:410 within `*`
││ %6098  = Base.mul_float(%6063, %6097)::Float16
│└
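
To make the difference between the two listings concrete, here is a minimal sketch in plain Julia. The names `scale_bool` and `scale_converted` are hypothetical and are not taken from GemmKernels; they only mimic multiplying a Float16 value by a Bool factor versus a factor converted to Float16 first.

```julia
# Sketch only: `scale_bool` and `scale_converted` are hypothetical names,
# not functions from GemmKernels.

# Multiply by a Bool factor directly (the Bool-default situation).
scale_bool(x::Float16, alpha::Bool) = x * alpha

# Convert the factor to the element type first, then multiply.
scale_converted(x::Float16, alpha::Number) = x * convert(Float16, alpha)

x = Float16(-3.0)

# With alpha = true both variants give the same value.
same_for_true = scale_bool(x, true) == scale_converted(x, true)

# With alpha = false, Bool acts as a "strong zero": the result is a zero
# carrying the sign of x, even for NaN input. This is what the
# copysign/ifelse pair in the typed IR implements.
bool_zero = scale_bool(x, false)
bool_nan  = scale_bool(Float16(NaN), false)

# A Float16 zero follows ordinary IEEE multiplication instead.
float_nan = scale_converted(Float16(NaN), false)

println((same_for_true, bool_zero, bool_nan, float_nan))
```

Note that the two variants only differ in value for edge cases such as NaN or Inf inputs combined with a `false` factor; for ordinary inputs the converted factor yields the same result with a single multiply.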

LLVM IR:

julia> @code_llvm Float16(1) * true
;  @ bool.jl:182 within `*`
define half @"julia_*_146"(half %0, i8 zeroext %1) #0 {
top:
;  @ bool.jl:182 within `*` @ bool.jl:180
; ┌ @ number.jl:221 within `copysign`
; │┌ @ essentials.jl:575 within `ifelse`
    %2 = call half @llvm.copysign.f16(half 0xH0000, half %0)
; └└
; ┌ @ essentials.jl:575 within `ifelse`
   %3 = and i8 %1, 1
   %.not = icmp eq i8 %3, 0
   %4 = select i1 %.not, half %2, half %0
; └
;  @ bool.jl:182 within `*`
  ret half %4
}

Avoiding this reduces register pressure by 15% or so.
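
The comparison above can be reproduced on the CPU with a small sketch like the following. `mul_ir` is a hypothetical helper written for this illustration; it relies only on `code_llvm` from the InteractiveUtils standard library. The exact IR depends on the Julia version and target, so the output will not necessarily match the listing in this PR, and CPU IR says nothing directly about GPU register usage.

```julia
using InteractiveUtils  # provides code_llvm

# Hypothetical helper for this sketch: return the LLVM IR of `x * alpha`
# for the given argument types as a String.
function mul_ir(::Type{X}, ::Type{A}) where {X, A}
    io = IOBuffer()
    code_llvm(io, *, Tuple{X, A}; debuginfo = :none)
    return String(take!(io))
end

ir_bool = mul_ir(Float16, Bool)      # Bool factor, as with the old defaults
ir_f16  = mul_ir(Float16, Float16)   # factor converted to Float16

println(ir_bool)
println(ir_f16)
```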

maleadt (Member, Author) commented Jul 3, 2023

Benchmark results for commit 4ffa934 (comparing to 1222d80):

| ID | before | after | change |
| --- | --- | --- | --- |
| ["BLAS", "Float16'*Float16'=Float16 (4096×4096×4096, alpha)"] | 7.389 ms ± 795.688 μs | 6.021 ms ± 554.176 μs | 20.3% ✅ |
| ["BLAS", "Float16'*Float16'=Float32 (4096×4096×4096, alpha)"] | 6.438 ms ± 690.212 μs | 6.128 ms ± 690.534 μs | 6.8% ✅ |
| ["BLAS", "Float16'*Float16=Float16 (4096×4096×4096, alpha)"] | 7.706 ms ± 22.187 μs | 5.761 ms ± 35.014 μs | 25.5% ✅ |
| ["BLAS", "Float16*Float16'=Float16 (4096×4096×4096, alpha)"] | 7.104 ms ± 569.471 μs | 5.807 ms ± 20.311 μs | -6.8% ❌ |
| ["BLAS", "Float16*Float16'=Float32 (4096×4096×4096, alpha)"] | 6.689 ms ± 1.446 ms | 6.356 ms ± 1.377 ms | 6.3% ✅ |
| ["BLAS", "Float16*Float16=Float16 (4096×4096×4096, alpha)"] | 5.419 ms ± 13.667 μs | 4.184 ms ± 12.511 μs | 22.7% ✅ |
| ["BLAS", "Float16*Float16=Float16 (4096×4096×4096, alpha, beta)"] | 5.569 ms ± 24.554 μs | 4.403 ms ± 17.014 μs | 21.2% ✅ |
| ["BLAS", "Float16*Float16=Float16 (4096×4096×4096, beta)"] | 3.865 ms ± 18.291 μs | 3.311 ms ± 13.259 μs | 13.8% ✅ |
| ["BLAS", "Float16*Float16=Float32 (4096×4096×4096, beta)"] | 5.409 ms ± 8.565 μs | 5.077 ms ± 15.586 μs | 6.3% ✅ |

codecov bot commented Jul 3, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.26 🎉

Comparison is base (1222d80) 29.97% compared to head (4ffa934) 30.23%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #129      +/-   ##
==========================================
+ Coverage   29.97%   30.23%   +0.26%     
==========================================
  Files          11       11              
  Lines         794      797       +3     
==========================================
+ Hits          238      241       +3     
  Misses        556      556              
| Impacted Files | Coverage Δ |
| --- | --- |
| src/blas.jl | 84.84% <100.00%> (+1.51%) ⬆️ |


@maleadt maleadt merged commit d347375 into master Jul 4, 2023
@maleadt maleadt deleted the tb/alpha_beta_types branch July 4, 2023 04:54