Releases: JuliaGPU/CUDA.jl

v2.5.0

13 Jan 02:16
bfb5d73

v2.4.0

08 Jan 14:55
2d5700c

CUDA v2.4.0

Diff since v2.3.0

Closed issues:

  • cublasXtStrmm test failures on Windows 10 Julia 1.1 (#124)
  • CUSPARSE tests broken (#259)
  • Make @cuda return a kernel object (#341)
  • Depend on CompilerSupportLibraries (#359)
  • CUBLAS and exceptions test failures on Windows (#536)
  • argmax(::CuArray) returns nothing with NaN-values (#553)
  • Multiple @cuDynamicSharedMem in kernel causes unexpected behavior (#555)
  • Illegal memory access with atomic shared memory (#558)
  • CUDA.sqrt will not found symbol "__nv_sqrt" (#559)
  • Exception with CUDA.exp (#561)
  • Use LazyArtifacts instead of Pkg (#570)
  • Test runner: early bail out (#578)
  • memory reporting issue (#579)
  • c[3:4]=0 leads to exception (#580)
  • Add math ops (including broadcast) for half types (#581)
  • Dot product of Array and CuArray fails with CPU address error. (#586)
  • Support for CUDA-capable GPU with compute capability 4.0 like GTX 1080 (#587)
  • mapreducedim! not threadsafe (#588)
  • Allow separate directories for cuda and cudnn (#590)
  • Difficulties installing CUDA on Julia 1.6.0. (#591)
  • Bug in Initialisation Error (#603)
  • CUDA.jl initialisation fails after suspending Ubuntu 20.04 with CUDA 11.2 (#605)
  • CUDA 11.2 CUBLASError and "CUDA.jl does not yet support CUDA with nvdisasm 11.2.67" (#607)
  • This intrinsic must be compiled to be called (#611)
  • OpenGL interop (#612)
  • Add support for CuFFT callback functions (#614)
  • I can’t multiply a CSR sparse matrix anymore (#615)
  • Julia version requirement (#619)

Merged pull requests:

v2.3.0

19 Nov 01:35
e06704d

CUDA v2.3.0

Diff since v2.2.1

Closed issues:

  • Misaligned address on load from Const (#548)

Merged pull requests:

v2.2.1

13 Nov 11:43
1596a2c

v2.2.0

13 Nov 09:00

CUDA v2.2.0

Diff since v2.1.0

Closed issues:

  • cudnn missing after downloading artifact (#521)
  • Downloading artifact: CUDA110 when using DiffEqFlux (#542)

Merged pull requests:

v2.1.0

30 Oct 12:11
602c549

CUDA v2.1.0

Diff since v2.0.2

Closed issues:

  • CUDNN convolution with Float16 always returns zeros (#92)
  • axp(b)y! and mul! (scalar multiplication) with mixed argument types (#144)
  • Dispatching to generic matmul instead of CUBLAS (#164)
  • Support for Ints and Float16? (#165)
  • Subarrays/views support (#172)
  • Easy way to pick among multiple GPUs (#174)
  • More prominently document JULIA_CUDA_USE_BINARYBUILDER (#204)
  • ERROR_COOPERATIVE_LAUNCH_TOO_LARGE during tests (#247)
  • Pkg.test error for cutensor test on Windows (#422)
  • Runtime build improvements (#456)
  • Fusing Wrappers (#467)
  • Could not find nvToolsExt (libnvToolsExt.dylib.1.0 or libnvToolsExt.dylib.1) in /Users/imac/.julia/artifacts/b502baf54095dff4a69fd6aba8667124583f6929/lib (#482)
  • mapreduce assumes commutative op (#484)
  • SubArray Broadcast Bug in 2.0 (#488)
  • Nested SubArray Scalar Indexing (#490)
  • Sparse matrix * view(vector) regression in 2.0 (#493)
  • Error transforming a reshaped 0-dimentional GPU array to a CPU array (#494)
  • test cuda FAILURE (#496)
  • Reshaped CuArray is not DenseCuArray (#511)
  • assignment failure when using array slicing. (#516)

Merged pull requests:

v2.0.2

15 Oct 14:14

CUDA v2.0.2

Diff since v2.0.1

Closed issues:

  • cu() behavior for complex floating point numbers (#91)
  • Error when following example on using multiple GPUs on multiple processes (#468)
  • MacOS without nvidia GPU is trying to download CUDA111 on julia nightly (#469)
  • Drop BinaryProvider? (#474)
  • Latest version of master doesn't work on Windows (#477)
  • sum(CUDA.rand(3,3)) broken (#480)
  • copyto!() between cpu and gpu with subarrays (#491)

Merged pull requests:

v2.0.1

05 Oct 08:12

CUDA v2.0.1

Diff since v2.0.0

Closed issues:

  • Can't update (#462)

Merged pull requests:

  • Remove duplicate comment (#464) (@blegat)
  • Add functionality to precompile the runtime library. (#465) (@maleadt)
  • Update manifest (#470) (@github-actions[bot])

v2.0.0

02 Oct 07:12
70d93cc

CUDA v2.0.0

Diff since v1.3.3

Closed issues:

  • Test failure during threading tests (#15)
  • Bad allocations in memory pool after device_reset! (#16)
  • CuArrays can lose Blas on reshaped views (#78)
  • allowscalar performance (#87)
  • Indexing with a CuArrays causes a 'scalar indexing disallowed' error from checkbounds (#90)
  • 5-arg mul! for CUSPARSE (#98)
  • copyto!(Device, Host) uses scalar iteration in case of type mismatch (#105)
  • Array primitives broken for CUSPARSE arrays (#113)
  • SplittingPool: CPU allocations (#117)
  • error while concatenating to an empty CuArray (#139)
  • Showing sparse arrays goes wrong (#146)
  • Improve test coverage (#147)
  • CuArrays allocates a lot of memory on the default GPU (#153)
  • [Feature Request] Indexing CuArray with CuArray (#155)
  • Reshaping CuArray throws error during backpropagation (#162)
  • Match syntax and APIs against Julia 1.0 standard libraries (#163)
  • CURAND_STATUS_PREEXISTING_FAILURE when setting seed multiple times. (#212)
  • RFC: converts SparseMatrixCSC to CuSparseMatrixCSR via cu by default (#216)
  • Add a CuSparseMatrixCOO type (#220)
  • Test runner stumbles over path separators (#236)
  • Error: Invalid bitcode signature when loading CUDA.jl after precompilation (#293)
  • Atomic operations only work on global memory (#311)
  • Performance: cudnn algorithm selection (#318)
  • CUSPARSE is broken in CUDA.jl 1.2 (#322)
  • Device-side broadcast regression on 1.5 (#350)
  • API for fast math-like mode (#354)
  • CUDA 11.0 Update 1: cublasSetWorkspace (#365)
  • Can't precompile CUDA.jl on Kubuntu 20.04 (#396)
  • CuPtr should be Ptr in cudnnGetDropoutDescriptor (#397)
  • CUDA throws OOM error when initializing API on multiple devices (#398)
  • Cannot launch kernel with > 5 args using Dynamic Parallelism (#401)
  • Reverse performance regression (#410)
  • Tag for LLVM 3? (#412)
  • CUDA not working (#415)
  • StatsBase.transform fails on CuArray (#426)
  • Further unification of CUBLAS.axpy! and LinearAlgebra.BLAS.axpy! (#432)
  • size(range), length(range) and range[end] fail inside CUDA kernels (#434)
  • InitError: Cannot use memory pool 'binned' when CUDA.jl was precompiled for memory pool 'split'. (#446)
  • Missing dispatch for matrix multiplication with views? (#448)
  • New version not available yet? (#452)
  • using CUDA or CUArray, output: UndefVarError: AddrSpacePtr not defined (#457)
  • Unable to upgrade to the latest version (#459)

Merged pull requests:

v1.3.3

25 Aug 11:08
be21077

CUDA v1.3.3

Diff since v1.3.2

Closed issues:

  • Type changing Array conversions give error when allowscalar(false) (#344)
  • getindex(::CuArray, ::Adjoint, ::Colon) fails (#345)
  • View with array indices causes memory copy before broadcast (#384)
  • Regression with Julia 1.5 (#390)

Merged pull requests: