
Support general eltypes in matrix division and SVD #1453

Merged: 1 commit into JuliaGPU:master on Mar 28, 2022

Conversation

@danielwe (Contributor)

Hi again,

Since Base.:\, LinearAlgebra.svd, and LinearAlgebra.svdvals need to make copies anyway, they can support general eltypes at no extra cost: just promote to a CUBLAS/CUSOLVER-compatible eltype when copying.

For now, I only implemented this for division and SVD: CUDA.CUSOLVER already specializes the non-mutating functions for these, so it was a drop-in replacement. Ideally this would be extended to QR, LU, and Cholesky as well, which currently rely on LinearAlgebra's generic non-mutating methods to make copies as necessary. That mostly works, but it eagerly promotes to Float64 rather than Float32, which is obviously not ideal. It shouldn't be too hard to add the necessary methods to the non-mutating functions; I may get around to it later.
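
To make this concrete, here's a minimal sketch of the promotion-on-copy idea. The helper name cudafloat and the wrapper svd_general are illustrative, not the actual code in this PR:

using LinearAlgebra, CUDA

# Map any numeric eltype to a CUBLAS/CUSOLVER-supported float type:
# Float16 -> Float32, ComplexF16 -> ComplexF32, Int64 -> Float64,
# Float32/Float64 stay as they are. (Illustrative name, not the PR's code.)
cudafloat(::Type{T}) where {T<:Number} = promote_type(Float32, T)

# Promote while making the copy that the non-mutating method needs anyway,
# then hand the freshly allocated buffer to the mutating CUSOLVER wrapper.
function svd_general(A::CuMatrix{T}) where {T<:Number}
    return svd!(CuMatrix{cudafloat(T)}(A))
end

This way the non-mutating method pays for exactly one copy regardless of the input eltype.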

codecov bot commented Mar 25, 2022

Codecov Report

Merging #1453 (ebbd568) into master (e1fb0f4) will decrease coverage by 0.59%.
The diff coverage is 83.33%.

@@            Coverage Diff             @@
##           master    #1453      +/-   ##
==========================================
- Coverage   77.95%   77.35%   -0.60%     
==========================================
  Files         120      120              
  Lines        9227     9230       +3     
==========================================
- Hits         7193     7140      -53     
- Misses       2034     2090      +56     
Impacted Files                  Coverage Δ
lib/cusolver/linalg.jl          88.07% <83.33%> (-0.28%) ⬇️
lib/cudnn/CUDNN.jl              40.00% <0.00%> (-35.81%) ⬇️
lib/cublas/CUBLAS.jl            50.89% <0.00%> (-25.90%) ⬇️
src/utilities.jl                68.91% <0.00%> (-4.06%) ⬇️
lib/cudadrv/CUDAdrv.jl          51.66% <0.00%> (-3.34%) ⬇️
lib/cudadrv/module/linker.jl    68.75% <0.00%> (-3.13%) ⬇️
lib/cudadrv/memory.jl           80.75% <0.00%> (-1.04%) ⬇️
src/pool.jl                     80.71% <0.00%> (+2.24%) ⬆️


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@maleadt (Member) commented Mar 25, 2022

> just promote to a CUBLAS/CUSOLVER-compatible eltype when copying.

Wouldn't it be surprising for svd etc to result in differently-typed outputs?

@danielwe (Contributor, Author)

I don't think so, for Integers at least. Factorizing an Integer matrix is a perfectly reasonable thing to do, but it requires promoting to a float eltype. Base does it all the time:

julia> A = rand(1:10, 2, 3)
2×3 Matrix{Int64}:
[...]

julia> B = rand(1:10, 2, 3)
2×3 Matrix{Int64}:
[...]

julia> A \ B
3×3 Matrix{Float64}:
[...]

julia> svd(A)
SVD{Float64, Float64, Matrix{Float64}}
[...]

However, the mutating methods don't promote. They won't make extra copies behind the scenes, so they need an eltype they can actually work with:

julia> svd!(A)
ERROR: MethodError: no method matching svd!(::Matrix{Int64})
[...]
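
The manual workaround, of course, is to do the promotion yourself before calling the mutating method:

julia> svd!(float(A))
SVD{Float64, Float64, Matrix{Float64}}
[...]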

The more open question is what to do for Float16 and ComplexF16. It would definitely be possible to convert the result back to Float16 in those cases, as, e.g., norm does. For reference, Base doesn't bother: if it has to promote in order to call a LAPACK method, it just returns the LAPACK output. However, for many functions Base also has native generic implementations that don't convert at all, so the output type depends on whether your call hit LAPACK or not:

julia> A = rand(Float16, 2, 3)
2×3 Matrix{Float16}:
[...]

julia> B = rand(Float16, 2, 3)
2×3 Matrix{Float16}:
[...]

julia> A \ B
3×3 Matrix{Float16}:
[...]

julia> svd(A)
SVD{Float32, Float32, Matrix{Float32}}
[...]

julia> lu(A)
LU{Float16, Matrix{Float16}}
[...]

I suppose the cleanest behavior would be to accept float-typed returns for Integer matrices, but convert back down for Float16/ComplexF16, just like norm.
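
For reference, norm accumulates in wider precision internally but converts the scalar result back to the input's eltype:

julia> using LinearAlgebra

julia> norm(Float16[1, 2, 2])
Float16(3.0)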

@danielwe (Contributor, Author)

On further thought, converting back to Float16 after the CUSOLVER call requires an extra array copy, which should be optional (unlike for scalar-output functions like norm, where converting back costs nothing). Having a keyword argument for that seems like a messy interface. So my preference would be to keep this PR as is and thus behave like Base (but ideally extend this to LU, QR, and Cholesky too).

@maleadt (Member) commented Mar 28, 2022

> Base doesn't bother: if it has to promote in order to call a LAPACK method, it just returns the LAPACK output.

In that case, I'm happy to keep it as is, even without converting back to FP16. Thanks for the PR!

@maleadt merged commit 2b8bd8c into JuliaGPU:master on Mar 28, 2022
@maleadt added the "enhancement" (New feature or request) and "cuda libraries" (Stuff about CUDA library wrappers) labels on Mar 28, 2022