[CUBLAS] Interface gemm_grouped_batched #2310

amontoison · 2024-04-01T03:43:17Z

@michel2323
NVIDIA added a new routine in CUDA v12.4 that performs batched gemm on matrices of different sizes.
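
For readers unfamiliar with the grouped variant, here is a minimal CPU reference sketch of what a grouped batched gemm computes. It is not the wrapper added in this PR and does not call cuBLAS. The function name `gemm_grouped_batched_ref` and the nested-list layout are assumptions made for illustration. The idea is that problems are split into groups; every problem in a group shares its transpose flags, `alpha`, `beta`, and dimensions, while different groups may use different sizes.

```python
def _op(mat, trans):
    # 'N' keeps the matrix as is, 'T' transposes it.
    return [list(row) for row in zip(*mat)] if trans == "T" else mat


def _gemm(alpha, a, b, beta, c):
    # Plain triple-loop gemm: returns alpha * a * b + beta * c.
    m, k, n = len(a), len(b), len(b[0])
    assert len(a[0]) == k and len(c) == m and len(c[0]) == n
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(n)]
        for i in range(m)
    ]


def gemm_grouped_batched_ref(trans_a, trans_b, alpha, A, B, beta, C):
    # Hypothetical reference helper (illustration only).
    # A, B, C are lists of groups; each group is a list of matrices.
    # trans_a, trans_b, alpha, beta hold one entry per group.
    out = []
    for g in range(len(A)):
        group = []
        for a, b, c in zip(A[g], B[g], C[g]):
            group.append(
                _gemm(alpha[g], _op(a, trans_a[g]), _op(b, trans_b[g]), beta[g], c)
            )
        out.append(group)
    return out


# Group 0: one 2x2 problem. Group 1: two (1x3) * (3x1) problems.
A = [[[[1.0, 2.0], [3.0, 4.0]]],
     [[[1.0, 2.0, 3.0]], [[4.0, 5.0, 6.0]]]]
B = [[[[1.0, 0.0], [0.0, 1.0]]],
     [[[1.0], [1.0], [1.0]], [[1.0], [0.0], [0.0]]]]
C = [[[[1.0, 1.0], [1.0, 1.0]]],
     [[[0.0]], [[10.0]]]]

result = gemm_grouped_batched_ref(["N", "N"], ["N", "N"], [2.0, 1.0], A, B, [1.0, 0.5], C)
```

A plain batched gemm would require all problems to share one shape; grouping lets a single call cover several shapes while still batching the problems that match.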

res/wrap/cublas.toml

amontoison added 2 commits March 31, 2024 23:42

[CUBLAS] Interface gemm_grouped_batched

3e4dd8a

Only test gemm_grouped_batched with CUDA v12.4

9358e1a

maleadt reviewed Apr 1, 2024

res/wrap/cublas.toml (outdated, resolved)

Update cublas.toml

dcf79a6

maleadt added the enhancement and cuda libraries labels Apr 2, 2024

maleadt merged commit 7f725c0 into JuliaGPU:master Apr 2, 2024
1 check passed
