at-benchmark captures GPU arrays #156
The problem is that CUDAdrv.CuArray memory management is pretty simple. By contrast, CuArray from CuArrays.jl has much more sophisticated memory management and forces a Julia GC collection when the GPU runs out of memory. Switching to that type (which you ought to do anyway, since I've just deprecated CUDAdrv.CuArray) solves your problem and makes your example work 🙂
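As a rough illustration of the "collect on out-of-memory, then retry" idea mentioned above, here is a minimal sketch in plain Julia. It is not the CuArrays.jl implementation: `FakeDevice`, `Buffer`, `try_alloc`, `alloc`, and `churn` are all hypothetical names, and the "device" is just a counter so the example runs without a GPU.

```julia
# Hypothetical fixed-capacity "device"; it only tracks how many bytes are in use.
mutable struct FakeDevice
    capacity::Int
    used::Int
end

# Hypothetical buffer whose finalizer returns its bytes to the device.
mutable struct Buffer
    dev::FakeDevice
    bytes::Int
end

# Try to reserve `bytes`; return `nothing` if the device is full.
function try_alloc(dev::FakeDevice, bytes::Int)
    dev.used + bytes > dev.capacity && return nothing
    dev.used += bytes
    buf = Buffer(dev, bytes)
    finalizer(b -> (b.dev.used -= b.bytes), buf)
    return buf
end

# Allocate, and on failure force a GC so that finalizers of unreachable
# buffers can release memory before a single retry.
function alloc(dev::FakeDevice, bytes::Int)
    buf = try_alloc(dev, bytes)
    if buf === nothing
        GC.gc()
        buf = try_alloc(dev, bytes)
    end
    buf === nothing && error("out of memory")
    return buf
end

# Repeatedly allocate without keeping references, like the loop over h(m, n).
function churn(dev::FakeDevice, n::Int)
    for _ in 1:n
        alloc(dev, 30)
    end
    return nothing
end

dev = FakeDevice(100, 0)
churn(dev, 10)
```

The retry only helps when the old buffers are actually unreachable. If something else still holds a reference to them, the forced collection frees nothing and the allocation fails anyway.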
I switched to |
Does it still happen after 5+3 iterations? How much memory does your GPU have?
CUDAnative, CuArrays, CUDAdrv, GPUArrays, CUDAapi, NNlib master branches and Julia:
julia> versioninfo()
Julia Version 1.0.1
Commit 0d713926f8 (2018-09-29 19:05 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
I used 5000 this time, since I think this machine has less memory, and it still fails after the 3rd iteration.
using CuArrays, CUDAnative, BenchmarkTools

function kernel_vadd(a, b, c)
    i = (blockIdx().x-1) * blockDim().x + threadIdx().x
    c[i] = a[i] + b[i]
    return nothing
end

function h(m, n)
    # CUDAdrv functionality: generate and upload data
    a = round.(rand(Float32, (m, n)) * 100)
    b = round.(rand(Float32, (m, n)) * 100)
    d_a = CuArray(a)
    d_b = CuArray(b)
    d_c = similar(d_a) # output array
    @cuda threads=12 kernel_vadd(d_a, d_b, d_c)
end

function h_bench(m, n)
    # CUDAdrv functionality: generate and upload data
    a = round.(rand(Float32, (m, n)) * 100)
    b = round.(rand(Float32, (m, n)) * 100)
    d_a = CuArray(a)
    d_b = CuArray(b)
    d_c = similar(d_a) # output array
    @benchmark @cuda threads=12 kernel_vadd($d_a, $d_b, $d_c)
end

# Works
for i in 1:5
    @show i
    h(5_000, 5_000)
end

# Errors after i = 3
for i in 1:5
    @show i
    h_bench(5_000, 5_000)
end
Confirmed. Another bug with JuliaGPU/CuArrays.jl#169
Ah, so apart from bugs, this is
julia> mutable struct Foo
           bar::String
       end

julia> function main()
           x = Foo("whatever")
           finalizer(x) do x
               Core.println(x.bar)
           end
           nothing
       end
main (generic function with 1 method)

julia> main()

julia> GC.gc()
whatever

julia> function main()
           x = Foo("whatever")
           finalizer(x) do x
               Core.println(x.bar)
           end
           @benchmark $x
           nothing
       end
main (generic function with 1 method)

julia> main()

julia> GC.gc()

julia> GC.gc()

julia> GC.gc()

julia> GC.gc()

@jrevels, this seems unwanted, but somewhat expected? Are there workarounds?
Actually, didn't mean to close this. Fixed some bugs in JuliaGPU/CuArrays.jl#212 but the |
Oof, yup. Right now, interpolated variables get closed over in the benchmarking harness here. We should probably change this so that For Anyway, good food for thought! |
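To make the effect of that capture concrete without involving BenchmarkTools itself, here is a minimal sketch in plain Julia. `HARNESS`, `FINALIZED`, and `capture!` are hypothetical stand-ins for whatever state a benchmarking harness keeps around; they are not BenchmarkTools APIs. The sketch only assumes that a stored closure referencing an object keeps that object reachable.

```julia
# Hypothetical stand-in for a harness that stores a closure over its inputs.
const HARNESS = Ref{Any}(nothing)

# Flag set by the finalizer so we can observe whether it has run.
const FINALIZED = Ref(false)

mutable struct Foo
    bar::String
end

function capture!(x)
    # The closure references x, so x stays reachable while HARNESS holds it.
    HARNESS[] = () -> x.bar
    return nothing
end

function main()
    x = Foo("whatever")
    finalizer(_ -> (FINALIZED[] = true), x)
    capture!(x)
    return nothing
end

main()
GC.gc()
@show FINALIZED[]    # x is still reachable through HARNESS

HARNESS[] = nothing  # drop the closure, and with it the last reference to x
GC.gc()
@show FINALIZED[]    # x is now eligible for finalization
```

For GPU arrays the consequence is the same as in the report: as long as the captured reference lives, the finalizer that would free device memory cannot run, no matter how often the GC is invoked.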
Is there a link to the related BenchmarkTools issue?
thanx
Closing this as it's an issue with BenchmarkTools, really.
The following code triggers this error: