CuArrays don't seem to display correctly in VS code #875
I cannot reproduce this; same VS Code on Windows with CUDA.jl.
Interesting. It's 100% reproducible for me. I tried setting the number of threads to 1 and it's still 100%. Any way it could have to do with card type or driver version? Looks like more or less exactly the same symptom. I also sometimes see garbage values, especially when there are more values. Does this shed any more light?
```julia
julia> function fff(x)
           cx = cu(x)
           1+2
           return cx
       end
fff (generic function with 1 method)

julia> fff(xx)
3-element CuArray{Float32, 1}:
 0.0
 0.0
 0.0

julia> fff(xx)
3-element CuArray{Float32, 1}:
 0.0
 0.0
 0.0

julia> function fff(x)
           cx = cu(x)
           CUDA.device_synchronize()
           return cx
       end
fff (generic function with 1 method)

julia> fff(xx)
3-element CuArray{Float32, 1}:
  1.190589
 -0.6828699
  0.05489877

julia> fff(xx)
3-element CuArray{Float32, 1}:
  1.190589
 -0.6828699
  0.05489877

julia> xx = randn(5, 10)
5×10 Matrix{Float64}:
  1.67942    0.8585     1.27553    0.371574  -1.39436   -0.374869  -0.0694932  -1.40259    1.60282   -0.928131
  0.873455  -0.284457   0.968319   1.06297   -0.877972  -2.09631   -1.63726    -0.652214   1.78131    2.5976
  1.94723    0.762199   1.16164   -0.36343   -0.827762  -1.28091   -0.777818   -0.496266  -0.663625  -1.69308
 -0.668761   1.34078    0.295551   2.05289    1.75167    1.87313   -0.591423    1.1398    -0.638453   0.685368
 -0.44666    0.954526  -0.774744   0.114682  -0.37361    0.251102  -0.446602    1.09315   -0.642535  -0.254417

julia> cu(xx)  # Now old values show up here!
5×10 CuArray{Float32, 2}:
  1.19059    0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 -0.68287    0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
  0.0548988  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
  0.0        0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
  0.0        0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0

julia> CUDA.device_synchronize()

julia> cu(xx)
5×10 CuArray{Float32, 2}:
  1.19059    0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 -0.68287    0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
  0.0548988  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
  0.0        0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
  0.0        0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
```
I don't know much about VS Code, but I think I have read somewhere in some issue that there is task-y stuff going on. @pfitzseb or @davidanthoff?
Yes and no. We should, however, call the REPL AST transforms properly (although maybe only on master).
So is there a reason why …
Synchronization is automatic. Why does the REPL/VS Code want to display on another task? We use task-local storage for many things, including device selection, library handles, etc. I can imagine other scenarios where using different tasks to compute and display would cause problems.
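The task-local-storage point can be sketched in plain Julia, without any GPU involved. This is a hedged illustration, not CUDA.jl internals: the `:device` key is made up, standing in for per-task device selection.

```julia
# Task-local storage is per-task: state set while evaluating on one task
# is invisible to a different task doing the display.
# (:device is purely illustrative, not a real CUDA.jl key.)
t1 = @async begin
    task_local_storage(:device, 0)   # "select a device" on this task
    task_local_storage(:device)      # visible here
end
t2 = @async get(task_local_storage(), :device, :unset)

@assert fetch(t1) == 0
@assert fetch(t2) == :unset          # the other task never saw it
```

So if `show` runs on a task other than the one that evaluated the expression, any device selection (or other task-local state) made during evaluation is simply gone.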
It's not strictly necessary for us to do it like that. Currently we're evaluating user code in a separate task that's created purely for that purpose (so we can schedule interrupts on it). We could conceivably call …

Also, FWIW, I still can't repro this issue, even with CUDA.jl@3.1.
I couldn't repro the REPL-related issue on Linux either, but could on Windows (it's timing-sensitive, obviously).
And indeed, as noted by @mcabbott: here device 0 doesn't have any free memory, leading to the OOM.
OK, so in theory we can fix this by calling …
You can switch devices, and that's a task-local property too. So the returned array can contain a pointer from another device, and fail to show regardless of synchronization. Now, I'm hoping to add automatic cross-device transfers in the future (where data would know which device it is bound to), but the point remains that a package may assume task-local state remains the same between evaluation and display. Why would this break …
Right, but the normal REPL already uses two different tasks for display and evaluation, so I'm not sure how that's different? We simply add yet another task into the mix.
Basically, VSCode uses an AST transform to evaluate code in a different task. If we also use that task to call …
Oh yeah, my comment was about changing both the REPL and VSCode, which probably isn't going to happen soon. Respecting the AST transforms in VSCode would at least fix the more frequently occurring case (using a single device, but failing to synchronize).
But we already are respecting AST transforms in the REPL.
User reports here indicate otherwise. Or is this a recent change?
Not particularly recent, no. Should've been here unless I messed up while writing the changelog.
@DrChainsaw Which version of julia-vscode were you using? Can you still reproduce on v1.1.26+?
I'm not sure if one needs to do something special to keep julia-vscode up to date, but in the extensions tab it says 1.1.40 and I still get the issue (just tested).
@DrChainsaw Is this still happening?
The race condition should still be there; we didn't change anything about this code on the VSCode side.
Right, but the issue was never clear to me. If VSCode runs the AST transform registered by CUDA.jl, the computation should have been synchronized, and displaying it on another task should work (as long as it's using the same device, but that was the case here).
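The race being discussed can be mimicked in plain Julia with no GPU at all. This is a hedged analogue, not the actual CUDA.jl code path: a task stands in for the asynchronous device-to-host copy, and reading the buffer before waiting on it plays the role of displaying an unsynchronized `CuArray`.

```julia
# A "copy" runs asynchronously; whoever reads the buffer before waiting
# on the producing task sees stale zeros -- like the unsynchronized CuArray.
buf = zeros(3)
copy_task = @async (sleep(0.1); buf .= [1.0, 2.0, 3.0])

stale = copy(buf)    # "display" without synchronizing: still zeros
wait(copy_task)      # the missing synchronization step
synced = copy(buf)   # now the real values are visible

@assert stale == zeros(3)
@assert synced == [1.0, 2.0, 3.0]
```

Whether the stale read happens in practice is timing-dependent, which matches the observation above that the bug reproduces on some machines and not others.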
Sorry for the delay. Still happens for me :(

```julia
julia> xx = randn(3)
3-element Vector{Float64}:
 -1.1807334363343767
 -1.9560892767060414
  0.21241781186365263

julia> cu(xx)
3-element CuArray{Float32, 1}:
 0.0
 0.0
 0.0

(CUDAtest) pkg> status
Status `E:\swproj\CUDAtest\Project.toml`
  [052768ef] CUDA v3.3.5

julia> versioninfo()
Julia Version 1.7.0-beta3.0
Commit e76c9dad42 (2021-07-07 08:12 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.0 (ORCJIT, haswell)
Environment:
  JULIA_DEPOT_PATH = E:/Programs/julia/.julia
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 1
```

Feel free to close the issue, as it doesn't bother me at all and it doesn't seem to happen on all installations (see #875 (comment) above). If you want to investigate I'd be happy to do some tests though.
Finally, I can reproduce this. And I still think it's a VSCode issue. Let's first demo in the REPL:

```julia
julia> struct Foo
           Foo() = (println("executing in $(current_task())"); new())
       end

julia> Base.show(io::IO, ::Foo) = print(io, "displaying in $(current_task())")

julia> Foo()
executing in Task (runnable) @0x0000000009ed0010
displaying in Task (runnable) @0x0000000009ed09c0
```

Recall that the original problem was that display happens on a different task than execution does, which doesn't play nicely with how GPUs asynchronously execute things (i.e., you need to synchronize the executing task before displaying its results):

```julia
julia> push!(Base.active_repl_backend.ast_transforms, ex->quote
           try
               $(ex)
           finally
               println("synchronizing $(current_task())")
           end
       end)
2-element Vector{Any}:
 softscope (generic function with 1 method)
 #1 (generic function with 1 method)

julia> Foo()
executing in Task (runnable) @0x0000000009ed0010
synchronizing Task (runnable) @0x0000000009ed0010
displaying in Task (runnable) @0x0000000009ed09c0
```

So here we first synchronize the task on which asynchronous operations were being executed, before trying to display them. Everything works fine. Now over to VSCode:

```julia
julia> struct Foo
           Foo() = (println("executing in $(current_task())"); new())
       end

julia> Base.show(io::IO, ::Foo) = print(io, "displaying in $(current_task())")

julia> Foo()
executing in Task (runnable) @0x000000000d828bb0
displaying in Task (runnable) @0x000000000d613460
```

The same situation as in the REPL. Let's try our fix:

```julia
julia> push!(Base.active_repl_backend.ast_transforms, ex->quote
           try
               $(ex)
           finally
               println("synchronizing $(current_task())")
           end
       end)
4-element Vector{Any}:
 revise_first (generic function with 1 method)
 softscope (generic function with 1 method)
 #66 (generic function with 1 method)
 #5 (generic function with 1 method)

julia> Foo()
executing in Task (runnable) @0x000000000d828bb0
synchronizing Task (runnable) @0x0000000009f90010
displaying in Task (runnable) @0x000000000d613460
```

And our fix doesn't work, because our hooked code executes in yet another task. @pfitzseb does this ring a bell?
Yeah, that makes sense. Tricky to fix, though, because we don't have control over when REPL hooks are added and we need to be careful not to run them twice. Maybe I can try to figure out a good way to always use the REPL backend task for execution.
To work around this from the CUDA.jl side, I'll just synchronize the entire device after each REPL line. Not ideal though.
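The workaround can be sketched with the same `ast_transforms` mechanism shown earlier in the thread. The sketch below is hedged: it uses a `did_sync` flag and a stand-in `sync()` function instead of `CUDA.device_synchronize()`, and drives the transform with `eval` rather than the real REPL backend, so it can run standalone.

```julia
# An AST transform that wraps every expression so a synchronization call
# runs on the *evaluating* task before display happens. In the real REPL
# this closure would be push!-ed onto Base.active_repl_backend.ast_transforms;
# sync() stands in for CUDA.device_synchronize().
did_sync = Ref(false)
sync() = (did_sync[] = true)

sync_transform(ex) = quote
    try
        $(ex)
    finally
        $(sync)()   # always synchronize, even if evaluation throws
    end
end

result = eval(sync_transform(:(1 + 2)))
@assert result == 3
@assert did_sync[]   # sync ran after evaluation, before any "display"
```

Synchronizing the whole device this way is heavy-handed (it waits on every stream, not just the one that produced the result), which is presumably why it's described above as "not ideal".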
This also won't fix the whole issue, because IIRC the REPL transforms aren't run for inline evaluation.
Maybe those should be subject to AST transforms, too? Or how would you suggest making sure the data is ready?
Yes, they should. Just need to take care to do everything on the same task, which will require a bit of a refactor.
FYI, I am on the most recent versions and this just happened to me.

Manifest.toml: machine-generated, `julia_version = "1.7.2"` (full dependency list elided)

Julia versioninfo: Julia Version 1.7.2

GPU: Mobile RTX 2080
That's not too surprising, considering that nothing changed in how this is implemented in VS Code.
Describe the bug
Seems like CuArrays don't display correctly in the VS Code REPL on Windows, although the values seem to be OK. I don't have a Linux install to test on. It also seems to work fine when running Julia in Windows Terminal.
To reproduce
The Minimal Working Example (MWE) for this bug:
Manifest.toml
Expected behavior
Values should be represented correctly.
Version info
Details on Julia:
Details on CUDA: