Un-intended behaviour? Should Flux be able to reduce StaticArrays? #2180
Comments
This is FluxML/NNlib.jl#507, right? You could try pinging those who wrote the
I am not sure; I just ran "using Flux", which exposes NNlib to me. I suppose the GPU version might be using NNlibCUDA in the background, but I have not tested that.
To be honest, I had forgotten that I had opened the first issue. I have added an updated comment there, thank you. EDIT: I think it is fine to close this issue, since the problem does not seem to be in Flux itself but in the underlying library not supporting it?
Let's not duplicate this discussion between 2 GitHub issues and Discourse.
Hello
I am trying:
using CUDA, NNlib, StaticArrays
S = rand(SVector{3,Float32}, 5)
DST = zeros(SVector{3,Float32}, 5)
I = [3, 1, 2, 5, 4]
This works great on the CPU:
@CUDA.time NNlib.scatter!(+, DST,S,I)
0.000004 seconds
5-element Vector{SVector{3, Float32}}:
[1.3667885, 1.081403, 1.1134366]
[1.6543723, 0.50424564, 1.1448298]
[0.8695842, 1.8623418, 1.398939]
[1.0094697, 0.0052466393, 0.09184897]
[1.9389011, 0.3291297, 1.2308507]
But it fails on the GPU:
@CUDA.time NNlib.scatter!(+, CuArray(DST),CuArray(S),CuArray(I))
ERROR: InvalidIRError: compiling kernel #scatter_kernel!(typeof(+), CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceVector{Int64, 1}) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to atomic_cas!)
Does anyone know why?
This is only a minimal example; I need it to work on the GPU for a more complex case.
Kind regards
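One possible workaround, sketched here on the CPU only, is to scatter over a 3×N Float32 matrix view of the data instead of over the SVector elements themselves. The error message suggests that the GPU kernel needs an atomic update on a whole SVector{3, Float32}, which is likely unsupported; with a Float32 matrix the updates would be on plain Float32 values. The names S_mat, DST_mat, expected and result are illustrative, and whether the same reshaping resolves the GPU error is an assumption that has not been tested here.

```julia
using NNlib, StaticArrays

# Same data as in the example above, kept on the CPU.
S   = rand(SVector{3,Float32}, 5)
DST = zeros(SVector{3,Float32}, 5)
I   = [3, 1, 2, 5, 4]

# Reference result: scatter the SVector elements directly (works on the CPU).
expected = NNlib.scatter!(+, copy(DST), S, I)

# View the source as a 3×5 Float32 matrix, one column per SVector.
S_mat = reshape(reinterpret(Float32, S), 3, :)

# Plain Float32 destination with the same layout.
DST_mat = zeros(Float32, 3, length(DST))

# scatter! indexes the last dimension of the source with I,
# so column k of S_mat is accumulated into column I[k] of DST_mat.
NNlib.scatter!(+, DST_mat, S_mat, I)

# Convert the columns back into SVectors and compare with the reference.
result = [SVector{3,Float32}(c) for c in eachcol(DST_mat)]
@assert result == expected
```

If this approach carries over to the GPU, the same reinterpret and reshape calls would be applied to the CuArray versions of S and DST before calling NNlib.scatter!.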