
1.10 enablement #1946
Merged: 5 commits merged into JuliaGPU:master from dk/triangular on Jul 29, 2023

Conversation

@dkarrasch (Contributor) opened this pull request:

This is to make use of JuliaLang/julia#50058. While that PR has not been merged yet, I'm putting this up to see whether this works properly on older Julia versions. Once that PR is merged, CI should obviously also pass on nightly. The amount of code that can eventually be deleted is HUGE, with no loss of functionality!

LinearAlgebra.generic_mattrimul!(C::DenseCuMatrix{T}, uploc, isunitc, tfun::Function, A::DenseCuMatrix{T}, B::DenseCuMatrix{T}) where {T<:CublasFloat} =
trmm!('R', uploc, tfun === identity ? 'N' : tfun === transpose ? 'T' : 'C', isunitc, one(T), B, A, C)
# tri-tri-mul!
function LinearAlgebra.generic_trimatmul!(C::DenseCuMatrix{T}, uplocA, isunitcA, tfunA::Function, A::DenseCuMatrix{T}, B::LinearAlgebra.UpperOrLowerTriangular{T,<:DenseCuMatrix}) where {T<:CublasFloat}
@dkarrasch (Contributor, Author) commented on this diff:

This function is meant to handle the upper-lower triangular (and vice versa) products. It may be a bit more general in that not both factors need to be non-"unitary", only the one that is tri[u/l]ed. We need to think about what should happen if none of the conditions is met, i.e., whether to throw or to fall back to some generic matmatmul.
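
For illustration only (not the PR's actual body): a minimal sketch of the branch structure described in the comment above, written against the same context as the snippet quoted higher up (CUDA.jl's CUBLAS linalg code, with trmm!, DenseCuMatrix and CublasFloat in scope); trimatmul_sketch! is a hypothetical name.

# Pick the CUBLAS uplo/diag flags from B's wrapper type; the final branch is the
# open question from the comment: throw, or fall back to some generic matmatmul.
function trimatmul_sketch!(C::DenseCuMatrix{T}, A::DenseCuMatrix{T},
                           B::LinearAlgebra.UpperOrLowerTriangular{T,<:DenseCuMatrix}) where {T<:CublasFloat}
    if B isa UpperTriangular
        trmm!('R', 'U', 'N', 'N', one(T), parent(B), A, C)   # C = A * B, B upper triangular
    elseif B isa UnitUpperTriangular
        trmm!('R', 'U', 'N', 'U', one(T), parent(B), A, C)   # unit diagonal
    elseif B isa LowerTriangular
        trmm!('R', 'L', 'N', 'N', one(T), parent(B), A, C)
    elseif B isa UnitLowerTriangular
        trmm!('R', 'L', 'N', 'U', one(T), parent(B), A, C)
    else
        # none of the conditions met: throw, or fall back to a generic mul!
        throw(ArgumentError("unsupported triangular wrapper $(typeof(B))"))
    end
    return C
end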

@maleadt (Member) commented Jun 26, 2023:

I take it we better wait until JuliaLang/julia#50058 is merged so that we can put an accurate version bound in here?
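
For illustration, one way such a bound might look, reusing the generic_mattrimul! overload quoted in the diff above; the version value below is a placeholder, not necessarily what this PR ended up using (it would have to match the first Julia build that actually contains JuliaLang/julia#50058).

# Gate the new-style overload behind a Julia version check; the bound is a placeholder.
@static if VERSION >= v"1.10.0-beta1"
    LinearAlgebra.generic_mattrimul!(C::DenseCuMatrix{T}, uploc, isunitc, tfun::Function,
                                     A::DenseCuMatrix{T}, B::DenseCuMatrix{T}) where {T<:CublasFloat} =
        trmm!('R', uploc, tfun === identity ? 'N' : tfun === transpose ? 'T' : 'C',
              isunitc, one(T), B, A, C)
end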

@dkarrasch (Contributor, Author) commented:

Yes. The extended functions currently don't exist. Unfortunately, that PR requires some back-and-forth with SparseArrays.jl, and requires bumping the stdlib. But I marked it as a milestone, so it must be in v1.10 eventually. I'd love to see profiling results compared to some prior reasonable state, like v1.9 and CUDA.jl without my previous PR, or something like that. 😉

@dkarrasch (Contributor, Author) commented:

Let's see if it also works on nightly now that the Base PR is merged. If so, then this should perhaps get another quick look, but otherwise should be ready to go. For reference, the Base commit is already included in the next backport round: JuliaLang/julia#50508.

@dkarrasch (Contributor, Author) commented:

Hm, nightly checks have been removed?

@maleadt (Member) commented Jul 18, 2023:

> Hm, nightly checks have been removed?

It was hanging. I'll try to add it back in another PR.

@maleadt (Member) commented Jul 18, 2023:

> This is to make use of JuliaLang/julia#50058.

FYI, CUDA.jl tests failed on 1.10 without this PR, so I guess that means the change was slightly breaking?

@maleadt changed the title from "use unwrapping mechanism for triangular matrices" to "1.10 enablement" on Jul 18, 2023
@maleadt added the "enhancement" (New feature or request) label on Jul 18, 2023
@dkarrasch (Contributor, Author) commented:

That Base PR is not included in v1.10.0-alpha1; it is going to be included in v1.10.0-alpha2. However, this PR redirects the triangular methods starting from v1.10-, i.e., a bit "too early". You can wait until alpha2 is released, run the whole stack, and if it passes, merge and release. An indicator of whether it should work is current nightly, because that already has the commit included. If that fails, then we know something needs to be fixed in any case.

@maleadt (Member) commented Jul 18, 2023:

> That Base PR is not included in v1.10.0-alpha1. It is going to be included in v1.10.0-alpha2.

I know; I was testing alpha2 against CUDA.jl#master. Some of the overloads here weren't sticking anymore:

libraries/cusolver/dense: Error During Test at /home/tim/Julia/pkg/CUDA/test/libraries/cusolver/dense.jl:413
  Test threw exception
  Expression: collect(d_M \ d_B) ≈ M \ B
  ArgumentError: cannot take the CPU address of a CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
  Stacktrace:
    [1] unsafe_convert(::Type{Ptr{Float32}}, x::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
      @ CUDA ~/Julia/pkg/CUDA/src/array.jl:386
    [2] trtrs!(uplo::Char, trans::Char, diag::Char, A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, B::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
      @ LinearAlgebra.LAPACK ~/Julia/src/julia/build/dev/usr/share/julia/stdlib/v1.10/LinearAlgebra/src/lapack.jl:3557
    [3] generic_trimatdiv!(C::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, uploc::Char, isunitc::Char, tfun::Function, A::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, B::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
      @ LinearAlgebra ~/Julia/src/julia/build/dev/usr/share/julia/stdlib/v1.10/LinearAlgebra/src/triangular.jl:836
    [4] _ldiv!(C::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, A::UpperTriangular{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, B::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
      @ LinearAlgebra ~/Julia/src/julia/build/dev/usr/share/julia/stdlib/v1.10/LinearAlgebra/src/triangular.jl:758
    [5] ldiv!(C::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, A::UpperTriangular{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, B::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
      @ LinearAlgebra ~/Julia/src/julia/build/dev/usr/share/julia/stdlib/v1.10/LinearAlgebra/src/triangular.jl:751
    [6] \(A::UpperTriangular{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, B::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
      @ LinearAlgebra ~/Julia/src/julia/build/dev/usr/share/julia/stdlib/v1.10/LinearAlgebra/src/triangular.jl:1489
    [7] ldiv!(_qr::QR{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, b::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
      @ CUDA.CUSOLVER ~/Julia/pkg/CUDA/lib/cusolver/linalg.jl:146
    [8] \(F::QR{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, B::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
      @ CUDA.CUSOLVER ~/Julia/pkg/CUDA/lib/cusolver/linalg.jl:95
    [9] macro expansion
      @ ~/Julia/src/julia/build/dev/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:669 [inlined]
   [10] macro expansion
      @ ~/Julia/pkg/CUDA/test/libraries/cusolver/dense.jl:413 [inlined]
   [11] macro expansion
      @ ~/Julia/src/julia/build/dev/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [12] macro expansion
      @ ~/Julia/pkg/CUDA/test/libraries/cusolver/dense.jl:338 [inlined]
   [13] macro expansion
      @ ~/Julia/src/julia/build/dev/usr/share/julia/stdlib/v1.10/Test/src/Test.jl:1669 [inlined]
   [14] top-level scope
      @ ~/Julia/pkg/CUDA/test/libraries/cusolver/dense.jl:1690

Now, that isn't too bad for CUDA.jl; we generally don't promise forward compatibility (because of our tight integration with the Julia compiler). I just wanted to mention it for your information.

@dkarrasch (Contributor, Author) commented:

That means there is something wrong with this PR then. generic_trimatdiv! from this package should be called at step [3]; it should be defined around line 223 in linalg.jl. But the call goes the wrong way, into LinearAlgebra, and ends up in LAPACK. It must be an issue with the method signature.
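
For reference, a sketch of what such an overload looks like, based on the signature visible in the stacktrace above (this is not necessarily the exact code at linalg.jl:223). It routes triangular left-division to CUBLAS.trsm! instead of letting it fall through to LAPACK.trtrs!; a vector right-hand side, as in the failing test, would need an analogous trsv!-based method.

# Copy B into C, then solve in place on the GPU: C = tfun(A) \ B.
LinearAlgebra.generic_trimatdiv!(C::DenseCuMatrix{T}, uploc, isunitc, tfun::Function,
                                 A::DenseCuMatrix{T}, B::DenseCuMatrix{T}) where {T<:CublasFloat} =
    CUBLAS.trsm!('L', uploc,
                 tfun === identity ? 'N' : tfun === transpose ? 'T' : 'C',
                 isunitc, one(T), A, C === B ? C : copyto!(C, B))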

@dkarrasch (Contributor, Author) commented:

Wait, I guess I'm mixing up branches and versions. Let me think about it.

@maleadt (Member) commented Jul 18, 2023:

> That means there is something wrong with this PR then.

Sorry, I'm not being clear. I wasn't using this PR. I was using 1.10-alpha2 with CUDA.jl#master. Theoretically, that shouldn't have broken anything, but AFAICT it did seem to break CUDA.jl. This PR will fix that, so users won't notice a thing.

@dkarrasch (Contributor, Author) commented:

Got it. Yeah, the "breakage" is because in triangular \ (step [6] above) I moved to using 3-arg ldiv!, where we used to have a branch that, for BLAS eltypes, called 2-arg ldiv!.^1 For 2-arg ldiv! we have overloads here. But when the generic code calls 3-arg ldiv!, packages typically don't have overloads for it, so their overloads are missed.

^1 For 3-arg ldiv! we allocate just a similar matrix, and in the BLAS cases it copies B into the target and applies BLAS methods, just like what we used to get from calling 2-arg ldiv!: there the filled copy (copy_similar) was created in \ right away. So the work is the same, but the call chains are different now. That didn't pop up in PkgEval runs, so at least in the CPU world there don't seem to be any issues with it.
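
For illustration, a heavily simplified sketch of the call-chain change described in this comment and its footnote; the helper names are hypothetical and this is not the literal stdlib code.

using LinearAlgebra

# <= v1.9 (simplified): `\` made a filled copy of B and called 2-arg ldiv!,
# which is the method CUDA.jl overloads.
backslash_old(A::UpperTriangular, B::AbstractVecOrMat) = ldiv!(A, copy(B))

# v1.10 (simplified): `\` allocates the result and calls 3-arg ldiv!, which
# funnels into generic_trimatdiv!, the function packages now need to overload.
backslash_new(A::UpperTriangular, B::AbstractVecOrMat) = ldiv!(similar(B, eltype(B)), A, B)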

@dkarrasch (Contributor, Author) commented:

Ok, this PR is not completely correct. I'll fix it soon.

@dkarrasch (Contributor, Author) commented:

Hm, is CI down, or did I crash it?

@maleadt (Member) commented Jul 22, 2023:

Buildkite is down (like, all of Buildkite).

@dkarrasch closed this on Jul 29, 2023
@dkarrasch reopened this on Jul 29, 2023
@maleadt (Member) commented Jul 29, 2023:

Thanks for the help, @dkarrasch!

@maleadt merged commit a912bea into JuliaGPU:master on Jul 29, 2023
@dkarrasch deleted the dk/triangular branch on July 31, 2023 07:51
@maleadt added a commit that referenced this pull request on Aug 25, 2023:
"Use unwrapping mechanism for triangular matrices."
Co-authored-by: Tim Besard <tim.besard@gmail.com>