Releases · JuliaGPU/CUDA.jl

08 May 22:08

github-actions

v3.9.1

b7e60f5

v3.9.1

CUDA v3.9.1

Diff since v3.9.0

Closed issues:

Issue with copy_cublasfloat (#1476)
Errors when broadcasting random number generators (#1480)
CPU version of linear algebra routine is dispatched when using Zygote.gradient (#1481)
scan! fails on vectors of structs (#1482)
InexactError when getting CUDA version info (#1489)

Merged pull requests:

Allow more integer argument types for byte_perm (#1420) (@eschnett)
support CuSparseMatrix(::Diagonal) (#1470) (@Roger-luo)
Don't emit debug info until the next CUDA version. (#1473) (@maleadt)
Update manifest (#1474) (@github-actions[bot])
Update manifest (#1479) (@github-actions[bot])
fix unsafe_wrap docstring and widen signature (#1483) (@piever)
Update manifest (#1484) (@github-actions[bot])
Check whether cudaRuntimeGetVersion succeeded. (#1490) (@maleadt)
Update manifest (#1494) (@github-actions[bot])
Fix #1476: Allow any container in copy_cublasfloat (#1498) (@danielwe)

Contributors

eschnett, maleadt, and 3 other contributors


09 Apr 09:58

github-actions

v3.9.0

5c40438

v3.9.0

CUDA v3.9.0

Diff since v3.8.5

Closed issues:

Tests for showing (#35)
Support LU factorizations (#1193)
Int8 WMMA not working in 3.8.4 and 3.8.5 despite merged PR. Add more unit tests? (#1442)
Optional CPU kernel call with @cuda (#1443)
Add library/artifact management for NCCL (#1446)
permutedims returns a lowertriangular matrix (#1451)
New broadcast corrupts memory? (#1457)
norm does not dispatch on CuSparseMatrixCSC (#1460)
scalar * sparse multiplication (#1468)

Merged pull requests:

CUTENSOR: axpy! and axpby! not mutating fixed (#1416) (@yapanuwan)
Initial wrap of cuquantum (#1437) (@kshyatt)
CompatHelper: bump compat for "GPUCompiler" to "0.14" (#1441) (@github-actions[bot])
Fix return type of nrm2 for ComplexF16 (#1444) (@danielwe)
Use a build matrix. (#1445) (@maleadt)
Update manifest (#1447) (@github-actions[bot])
Rework factorizations (#1449) (@maleadt)
Add NCCL binaries. (#1450) (@maleadt)
Support general eltypes in matrix division and SVD (#1453) (@danielwe)
Update manifest (#1456) (@github-actions[bot])
Look at more environment variables to find nsys. (#1459) (@maleadt)
Fixes for 1.8 (#1463) (@maleadt)

Contributors

cuda, maleadt, and 3 other contributors


14 Mar 20:11

github-actions

v3.8.5

a57345a

v3.8.5

CUDA v3.8.5

Diff since v3.8.4

Merged pull requests:

Update manifest (#1440) (@github-actions[bot])


11 Mar 16:50

github-actions

v3.8.4

1526aad

v3.8.4

CUDA v3.8.4

Diff since v3.8.3

Closed issues:

sparse-sparse and sparse-constant multiplication lose sparsity (output dense matrix) (#1264)
LLVMExtra fails to load on Julia 1.8 and PPC (#1387)
compute-sanitizer CUDA_ERROR_INVALID_VALUE on CUDA.jl 3.0+ (#1415)
@cudnnDescriptor is not threadsafe (#1421)
Precompilation of CUDA 3.8.3 broken on 1.7.1 due to changes in Random123.jl (#1422)
OOM error should include memory status (#1427)
WMMA kernel works with Julia 1.7.2 but fails with illegal memory access for Julia 1.8.0-beta1 (#1431)
Non Int64 local memory size leads to dynamic function invocation (#1434)
"initialization" test failing (#1435)
cuda with julia 1.8 not working on windows (working fine(?) on wsl2) (#1436)

Merged pull requests:

Add Int8 WMMA Support (#1119) (@max-Hawkins)
Wrap generic sparse-sparse GEMM (#1285) (@kshyatt)
Fix sparse COO to CSR conversion. (#1412) (@maleadt)
Drop support for CUDA 10.1 and below (#1414) (@maleadt)
Update manifest (#1417) (@github-actions[bot])
Report the OOM memory status at the time of the error. (#1428) (@maleadt)
Lock CUDNN descriptor cache lookups. (#1430) (@maleadt)
Switch to new LLVM context management for 1.9 compatibility. (#1432) (@maleadt)
Update manifest (#1433) (@github-actions[bot])
Backports for 3.8.4 (#1438) (@maleadt)

Contributors

maleadt, kshyatt, and max-Hawkins


25 Feb 17:59

github-actions

v3.8.3

2319b89

v3.8.3

CUDA v3.8.3

Diff since v3.8.2

Closed issues:

Sparse matrix addition not working (#528)
Native implementation of sparse arrays (#829)
CUSPARSE: Adding a value to the diagonal (#1372)
Conversion by cu casts Float64 to Float32 but not Int64 to Int32 (#1388)
CUDA.math_mode!(...; precision) option not working (#1392)
cuIpcGetMemHandle failure resulting in CUDA-aware MPI to fail (#1398)
axpby! support for BFloat16 (#1399)
CUSPARSE does not support integer matrices, breaks printing (#1402)
sparse(I, J, V) doesn't support unsorted inputs (#1407)

Merged pull requests:

General purpose broadcast for sparse CSR matrices. (#1380) (@maleadt)
Update manifest (#1389) (@github-actions[bot])
Implement sparse operations with UniformScaling using broadcast. (#1390) (@maleadt)
Prevent toplevel compilation. (#1391) (@maleadt)
Fix and test math precision. (#1394) (@maleadt)
Bump artifacts (#1397) (@maleadt)
support BFloat16 for atomic_cas (#1400) (@bjarthur)
Implement sparse broadcasting with CSC matrices. (#1401) (@maleadt)
Always report issues with discovering CUDA. (#1404) (@maleadt)
Fix sparse 1-argument broadcast output type. (#1405) (@maleadt)
CUSPARSE BSR improvements (#1409) (@maleadt)
Support limited sparse integer arrays by bitcasting to floating point. (#1410) (@maleadt)
Support using sparse with unsorted inputs. (#1411) (@maleadt)
Backports for 3.8.3 (#1413) (@maleadt)

Contributors

maleadt and bjarthur


18 Feb 17:16

github-actions

v3.8.2

46db50d

v3.8.2

CUDA v3.8.2

Diff since v3.8.1

Closed issues:

CuSparseMatrixCSC missing lu and interactions with UniformScaling (#79)
CUSPARSE typo (#1231)
similar(A::CuSparse,eltype) returns an Array (#1316)
"errormonitor" undefined in julia1.6 (#1375)
Pool free can switch tasks (#1384)

Merged pull requests:

Define a compatibility shim for errormonitor (#1378) (@vchuravy)
Backport #1361 to 3.8 (#1379) (@vchuravy)
Backports for 3.8.2 (#1381) (@maleadt)
Remove broken errormonitor implementation, just don't use it on 1.6. (#1382) (@maleadt)
Memory pool improvements (#1383) (@maleadt)

Contributors

vchuravy and maleadt


15 Feb 17:25

github-actions

v3.8.1

9d04926

v3.8.1

CUDA v3.8.1

Diff since v3.8.0

Closed issues:

one(::CuMatrix) result on cpu (#142)
Broadcasted setindex! triggers scalar setindex! (#101)
OutOfGPUMemoryError With Available Memory (#1346)
Distributions.jl with CuArrays (#1347)
Views of Flux OneHotArrays (#1349)
synchronize(blocking = false) hangs in julia 1.7 eventually (#1350)
unsupported call through a literal pointer (call to log1pf) on Julia 1.6.5 (#1352)
SpecialFunctions ^1.8 compat entry? (#1354)
Performance deprecation using ^ on Float32 (#1358)
Method definition setindex!(LinearAlgebra.Diagonal{T, V} ... overwritten in module CUDA (#1364)
[PackageCompiler] Segmentation fault with CUDA.jl in multiversioning (#1365)
Vectors in customary structs make julia stuck (#1366)
sparseCSC-dense matrix multiplication yields unstable results (#1368)
UndefVarError: parameters not defined on Windows10 (#1371)

Merged pull requests:

Optimize memoization helpers. (#1345) (@maleadt)
Update manifest (#1348) (@github-actions[bot])
Update manifest (#1355) (@github-actions[bot])
Fastmath improvements (#1356) (@maleadt)
Make the default pool visible when doing P2P (#1357) (@maleadt)
Fix resize of empty arrays. (#1359) (@maleadt)
CUSPARSE: add COO ctors and similar with eltype. (#1360) (@maleadt)
Add device_override for SpecialFunctions.gamma (#1361) (@vchuravy)
Implement (limited) broadcast of sparse arrays (#1367) (@maleadt)
Make nonblocking synchronization robust to errors. (#1369) (@maleadt)
Update manifest (#1370) (@github-actions[bot])
Backports for 3.8.1 (#1374) (@maleadt)

Contributors

vchuravy and maleadt


28 Jan 16:32

github-actions

v3.8.0

e1507d3

v3.8.0

CUDA v3.8.0

Diff since v3.7.1

Closed issues:

Consider reserving memory (#1320)

Merged pull requests:

Slight changes to pool management (#1344) (@maleadt)

Contributors

maleadt


27 Jan 11:00

github-actions

v3.7.1

fb01adb

v3.7.1

CUDA v3.7.1

Diff since v3.7.0

Closed issues:

Moving data between devices (#1136)
Repeated has_cuda_gpu errors when CUDA_VISIBLE_DEVICES is empty (#1331)
Error when env var CUDA_VISIBLE_DEVICES is set but empty (#1336)

Merged pull requests:

Wrap and test peer to peer memory copies (#1284) (@kshyatt)
Update manifest (#1332) (@github-actions[bot])
Have libcuda() fail repeatedly if anything (e.g. init) failed. (#1333) (@maleadt)
Simplify workarounds. (#1334) (@maleadt)
Properly detect a missing driver. (#1335) (@maleadt)
Various small fixes (#1337) (@maleadt)
Move CUDA.jl global state into CUDAdrv wrapper "submodule" (#1338) (@maleadt)
Add CUDA.return_type (#1339) (@tkf)
Compute-sanitizer QOL improvements and docs (#1340) (@maleadt)
Fix regression in backwards CUFFT plans. (#1341) (@maleadt)
Don't assume host pointers are directly usable on the device. (#1342) (@maleadt)
Backports for 3.7.1 (#1343) (@maleadt)

Contributors

tkf, maleadt, and kshyatt


21 Jan 16:55

github-actions

v3.7.0

92f0dce

v3.7.0

CUDA v3.7.0

Diff since v3.6.4

Closed issues:

mul! is missing for plan_fft! (#1311)
Segfault with CUDA in a sysimage (#1314)
CuSparse does not support broadcast (#1317)
CUDA.functional(true) errors instead of printing "why" and returning false (#1318)
Interesting timings (#1323)
Synchronization how to? (#1324)

Merged pull requests:

Remove debug info hack. (#1259) (@maleadt)
Update manifest (#1312) (@github-actions[bot])
CUFFT improvements (#1313) (@maleadt)
Add additional quirks. (#1315) (@maleadt)
Use pointer to async_send directly instead of a wrapper function (#1319) (@vchuravy)
Update manifest (#1325) (@github-actions[bot])
Add support and test CUDA 11.6. (#1326) (@maleadt)
Bump CUTENSOR, expose libcutensorMg. (#1327) (@maleadt)
Bump CUDNN to v8.3.2. (#1328) (@maleadt)
Enable use of CUDA 11.6. (#1329) (@maleadt)

Contributors

vchuravy and maleadt
