Releases: JuliaGPU/CUDA.jl

v4.4.0

26 Jun 20:29
315c80e

CUDA v4.4.0

Diff since v4.3.2

Closed issues:

  • Unreachable control flow leads to illegal divergent barriers (#1746)
  • CUBLAS fails on new CUDA.jl v4 (#1852)
  • Sort fails on Lovelace (sm8.9) GPUs (#1874)
  • gesvd! crashes on Pascal and v12.0 (#1932)
  • No effect for calling "nsys launch" (#1938)
  • Basic math operations with nested adjoint and transpose (#1940)
  • CPU and GPU implementations return results at dissimilar scales, even in double precision arithmetics (#1950)
  • Failed CUDA.jl initialization breaks Flux? (#1952)
  • Recent mul! changes break multiplication with matrices that have StaticArray elements (#1953)
  • Test infrastructure: define test groups (#1961)
  • Strange rand errors when sampling large matrices (#1963)
  • Add aqua tests (#1964)
  • Support of Orin GPU from Nvidia ? (#1966)
  • Crash in LLVM (#1971)
  • Warning cuDNN Convolution (#1972)
  • Strange behaviour when installed at system level (#1973)

Merged pull requests:

v4.3.2

02 Jun 05:55
acd245e

CUDA v4.3.2

Diff since v4.3.1

Merged pull requests:

v4.3.1

31 May 19:40
b7420f8

CUDA v4.3.1

Diff since v4.3.0

Closed issues:

  • Array testsuite compiles kernel with large types (#1902)
  • CUDA.jl v4 installs CUDA runtime despite version=local (#1922)
  • Occasional "CUSOLVERError: an internal operation failed (code 7, CUSOLVER_STATUS_INTERNAL_ERROR)" (#1924)
  • Does cuDNN@v1.0.4 need CUDA@v4.3? (#1929)

Merged pull requests:

v4.3.0

23 May 18:34
d3b1363

CUDA v4.3.0

Diff since v4.2.0

Closed issues:

  • Multidimensional reverse (#1126)
  • Test errors on master (#1866)
  • Integer overflow error with svd for large matrix (#1880)
  • Erratic behaviour of CUDA.jl if used in the REPL of VSCode. (#1892)
  • QR decomposition requires scalar indexing (#1893)
  • BSOD during package tests (#1898)
  • Insufficient coverage of CuArrays in the documentation (#1901)
  • Failed to compile with Julia v1.9 on PowerPC (#1911)
  • CUDA test failed in wmma.jl (#1914)
  • Fix deprecation warnings (#1920)

Merged pull requests:

v4.2.0

02 May 13:25
af65a44

CUDA v4.2.0

Diff since v4.1.4

Closed issues:

  • NVTX: consider using Start/End for ranges (#1485)
  • Limitations of CuIterator (#1768)
  • Testing fails on unsupported devices. (#1815)
  • Local runtime discovery does not work for external libraries (CUDNN, CUTENSOR) (#1850)
  • Passing tests using Github CI workflow errors with libcuda not defined (#1867)
  • Cannot precompile GPU code with SnoopPrecompile (#1870)
  • Incorrect kernel execution with bounds checking using Julia 1.9.0-rc2 (#1875)
  • Fake CUDA library (#1879)
  • Error thrown when launching Julia with Nsight systems or compute. (#1886)
  • Cannot construct CuDeviceArray (#1887)
  • Incorrect colVal array when using CuSparseMatrixCSR command on sparse matrix (#1888)

Merged pull requests:

  • Use adapt symmetrically in CuIterator (#1769) (@mcabbott)
  • Allow but warn when testing on not fully-supported devices. (#1818) (@maleadt)
  • Support runtime discovery for non-toolkit libraries (CUTENSOR, CUDNN, CUQUANTUM) (#1858) (@mloubout)
  • Add KernelAbstractions.jl unsafe_free! (#1863) (@pxl-th)
  • Allow precompiling CUDA code. (#1865) (@maleadt)
  • Assert CUDA.jl is functional when creating the TLS. (#1868) (@maleadt)
  • Update manifest (#1871) (@github-actions[bot])
  • Don't collect AbstractQ objects in tests (#1872) (@dkarrasch)
  • Add compatibility entry for Lovelace (#1873) (@xaellison)
  • remove some type-piracy from cusparse (#1876) (@vtjnash)
  • Remove more unneeded ndims methods. (#1878) (@maleadt)
  • Guard the initialization-time CUDA driver check in a try/catch. (#1881) (@maleadt)
  • Update manifest (#1882) (@github-actions[bot])
  • Update CUDA 12.1 to 12.1.1. (#1883) (@maleadt)
  • Use atomics for allocation statistics. (#1884) (@maleadt)
  • Fix atomic increment of alloc stats. (#1885) (@maleadt)
  • Update manifest (#1889) (@github-actions[bot])

v4.1.4

13 Apr 15:31
7e86df8

CUDA v4.1.4

Diff since v4.1.3

Closed issues:

  • Buggy precompilation of init-defined symbols can break CUDA_Driver_jll initialization (#1798)
  • Calling CUDA.set_runtime_version!() with float parameter makes CUDA.jl unusable. (#1831)
  • Unexpected memory allocation when using randn! (#1856)
  • The memory copy speed seems to exceed the hardware limit (#1860)
  • PCG produces different output on GPU (via Krylov.jl) (#1864)

Merged pull requests:

  • Fix system_driver_version on platforms not supported by CUDA_Driver_jll. (#1854) (@maleadt)
  • Update manifest (#1861) (@github-actions[bot])

v4.1.3

31 Mar 16:08
4e8f45b

CUDA v4.1.3

Diff since v4.1.2

Closed issues:

  • CUDA.versioninfo() triggers download of lazy artifacts (#1844)

Merged pull requests:

  • Choose parallel tests based on CPUs, not threads. (#1842) (@maleadt)
  • Adapt to LLVM.jl 5 and GPUCompiler.jl 0.19. (#1847) (@maleadt)

v4.1.2

29 Mar 08:23
1aa3e6b

CUDA v4.1.2

Diff since v4.1.1

Closed issues:

  • Flux's gradient differentiating rfft leads to non-bit error (#1835)

Merged pull requests:

v4.1.1

26 Mar 04:08
d235d35

CUDA v4.1.1

Diff since v4.1.0

Merged pull requests:

v4.1.0

24 Mar 02:30
cf4598f

CUDA v4.1.0

Diff since v4.0.1

Closed issues:

  • ERROR: LoadError: bin\cublas64_11.dll when installing CUDA (#1750)
  • System-wide CUDA in LD_LIBRARY_PATH breaks CUBLAS (#1755)
  • CuDeviceTexture getindex breaks when executed on the CPU (#1757)
  • cuDNN.version can cause Julia to crash, missing cudnn_ops_infer64_8.dll (#1777)
  • cuDNN compile error "ERROR: LoadError: ArgumentError: invalid version string: local" (#1783)
  • "Error: No CUDA Runtime library found" for ≥v4.0.0 (#1808)
  • sqrt broken in kernels 'Format of __nvvm__reflect function not recognized' (#1817)

Merged pull requests: