v5.1.2
CUDA v5.1.2
Merged pull requests:
- kernel docs: fix formatting, clean up awkward sentence (#2172) (@simonbyrne)
- [CUSOLVER] Don't reuse the sparse handles (#2173) (@amontoison)
- Added kronecker product support for dense matrices (#2177) (@albertomercurio)
- Fix typos and simplify wording in performance tips docs (#2179) (@Zentrik)
- provide more information on kernel compilation error (#2180) (@simonbyrne)
- [CUSPARSE] Test CUSPARSE_SPMV_COO_ALG2 (#2182) (@amontoison)
- [CUSPARSE] Use cusparseSpMM_preprocess (#2183) (@amontoison)
- [CUSPARSE] Use cusparseSDDMM_preprocess (#2184) (@amontoison)
- Add the structures ILU0Info() and IC0Info() for the preconditioners (#2187) (@amontoison)
- [CUSOLVER] Add a structure CuSolverParameters for the generic API (#2188) (@amontoison)
- Support more kwarg syntax with kernel launches (#2189) (@maleadt)
- Fix typo in docs/src/development/troubleshooting.md (#2193) (@jcsahnwaldt)
- NVML: Add support for clock queries. (#2194) (@maleadt)
- Fix Random.jl seeding for 1.11 (#2199) (@IanButterworth)
- Improvements to context handling (#2200) (@maleadt)
- Add a concurrent kwarg to profiling macros. (#2201) (@maleadt)
- Rework unique context management. (#2202) (@maleadt)
- Preserve the buffer type when broadcasting. (#2203) (@maleadt)
- Fixes for Windows (#2206) (@maleadt)
- Bump Aqua. (#2207) (@maleadt)
- CUSPARSE: Eagerly combine duplicate element on construction. (#2213) (@maleadt)
- CompatHelper: bump compat for BFloat16s to 0.5, (keep existing compat) (#2214) (@github-actions[bot])
- Bump the CUDA Runtime for CUDA 12.3.2. (#2217) (@maleadt)
- Default to testing with only a single device. (#2221) (@maleadt)
- Backports for v5.1 (#2224) (@maleadt)
Closed issues:
- More informative errors when parameter size is too big (#2119)
- Modifying struct containing CuArray fails in threads in 5.0.0 and 5.1.0 (#2171)
- Matmul of CuArray{ComplexF32} and CuArray{Float32} is slow (#2175)
- Support for combining duplicate elements in sparse matrices (#2185)
- Interactive sessions: periodically trim the memory pool (#2190)
- Broadcast does not preserve buffer type (#2191)
- CUDA doesn't precompile on Julia nightly/1.11 (#2195)
- Latest julia: UndefVarError: make_seed not defined in Random (#2198)
- CUDA installation fails on Apple Silicon/Julia 1.10 (#2211)
- Most recent package versions not supported on CUDA.jl (#2212)
- Testing of CUDA fails (#2222)
- --debug-info=2 makes NNlibCUDACUDNNExt precompilation run forever (#2225)