Skip to content

Commit

Permalink
Remove unnecessary and slow hypot calls in real givensAlgorithm. Add …
Browse files Browse the repository at this point in the history
…copyright statement.
  • Loading branch information
andreasnoack committed Mar 18, 2015
1 parent ed97120 commit 17abcd9
Showing 1 changed file with 15 additions and 3 deletions.
18 changes: 15 additions & 3 deletions base/linalg/givens.jl
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,12 @@ realmin2(::Type{Float32}) = reinterpret(Float32, 0x26000000)
realmin2(::Type{Float64}) = reinterpret(Float64, 0x21a0000000000000)
realmin2{T}(::Type{T}) = (twopar = 2one(T); twopar^trunc(Integer,log(realmin(T)/eps(T))/log(twopar)/twopar))

# derived from LAPACK's dlartg
# Copyright:
# Univ. of Tennessee
# Univ. of California Berkeley
# Univ. of Colorado Denver
# NAG Ltd.
function givensAlgorithm{T<:FloatingPoint}(f::T, g::T)
zeropar = zero(T)
onepar = one(T)
Expand Down Expand Up @@ -66,7 +72,7 @@ function givensAlgorithm{T<:FloatingPoint}(f::T, g::T)
scalepar = max(abs(f1), abs(g1))
if scalepar < safmx2 break end
end
r = hypot(f1, g1)
r = sqrt(f1*f1 + g1*g1)
cs = f1/r
sn = g1/r
for i = 1:count
Expand All @@ -81,14 +87,14 @@ function givensAlgorithm{T<:FloatingPoint}(f::T, g::T)
scalepar = max(abs(f1), abs(g1))
if scalepar > safmn2 break end
end
r = hypot(f1, g1)
r = sqrt(f1*f1 + g1*g1)
cs = f1/r
sn = g1/r
for i = 1:count
r *= safmn2
end
else
r = hypot(f1, g1)
r = sqrt(f1*f1 + g1*g1)
cs = f1/r
sn = g1/r
end
Expand All @@ -101,6 +107,12 @@ function givensAlgorithm{T<:FloatingPoint}(f::T, g::T)
return cs, sn, r
end

# derived from LAPACK's zlartg
# Copyright:
# Univ. of Tennessee
# Univ. of California Berkeley
# Univ. of Colorado Denver
# NAG Ltd.
function givensAlgorithm{T<:FloatingPoint}(f::Complex{T}, g::Complex{T})
twopar, onepar, zeropar = 2one(T), one(T), zero(T)
czero = zero(Complex{T})
Expand Down

7 comments on commit 17abcd9

@vtjnash
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hypot has better handling for the case where x^2 overflows. Do we care?

@andreasnoack
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The algorithm checks for overflow and scales the arguments if necessary and therefore the use of hypot is unnecessary.

@garborg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird, last week I replaced sqrt(a*a + b*b) with hypot(a, b) in my own code because my tests (Float64 only) showed it was faster as well. Maybe I have to revisit that...

@kmsquire
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found the same thing in Images.jl previously. @andreasnoack, do you have some benchmark data which shows this speed improvement?

@andreasnoack
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

julia> let
       x = randn()
       y = randn()
       @time for i = 1:10^9;hypot(x,y);end
       @time for i = 1:10^9;sqrt(x*x + y*y);end
       end
elapsed time: 8.857747141 seconds (0 bytes allocated)
elapsed time: 0.425739182 seconds (0 bytes allocated)

julia> versioninfo()
Julia Version 0.4.0-dev+3903
Commit 9dbe1ed* (2015-03-19 15:38 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin14.1.0)
  CPU: Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

so approximately 20 times faster than hypot on my machine.

@kmsquire
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Would replacing the hypot function be worthwhile (keeping @vtjnash's comment about overflow handling in mind)?

@andreasnoack
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that replacing hypot is right. It is probably the best you can do if you want to avoid under- and overflow. The problem here is that givensAlgorithm is doing the same checks as hypot so the checks are done twice.

I've looked a bit more on this and I think we could just use a simple function with hypot instead the long version borrowed from LAPACK which basically includes an implementation hypot. I'll take another look at

On computing Givens rotations reliably and efficiently

and then probably open a pull request with a new shorter and faster real givensAlgorithm.

Please sign in to comment.