GPU error when using Zeros() as bias in Conv layer #1332
I wonder if this has anything to do with fmap turning Zeros into something else. Have you tried using [...]? If that doesn't work, then this is a separate issue:

julia> cc = Conv(ones(1,1,1,10), Flux.Zeros())
Conv((1, 1), 1=>10)
julia> cc(ones(1,1,1,10))
julia> cc(ones(1,1,1,10)) |> size
(1, 1, 10, 10)
julia> cc32 = Flux.f32(cc)
Conv((1, 1), 1=>10)
julia> cc32(ones(1,1,1,10)) |> size
ERROR: MethodError: no method matching reshape(::Float32, ::Int64, ::Int64, ::Colon, ::Int64)
Closest candidates are:
reshape(::FillArrays.AbstractFill, ::Union{Colon, Int64}...) at C:\Users\echrska\.julia\packages\FillArrays\tE9Xq\src\FillArrays.jl:209
reshape(::OffsetArrays.OffsetArray, ::Union{Colon, Int64}...) at C:\Users\echrska\.julia\packages\OffsetArrays\sUnpU\src\OffsetArrays.jl:234
reshape(::AbstractArray, ::Union{Colon, Int64}...) at reshapedarray.jl:117
...
Stacktrace:
[1] (::Conv{2,4,typeof(identity),Array{Float32,4},Float32})(::Array{Float64,4}) at C:\Users\echrska\.julia\packages\Flux\05b38\src\layers\conv.jl:145
[2] top-level scope at REPL[19]:1
julia> cc32.bias
0.0f0

Happy to do a PR to fix this, but it would be good to have some directions. I see the following alternatives:
I think 1) is the only feasible option out of those, but I'd like some confirmation before I proceed with the PR.
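For concreteness, option 1 could look something like the following minimal sketch (the type name ShapedZeros is hypothetical; this is not Flux's actual implementation): a placeholder that records its shape and subtypes AbstractArray, so generic functions like reshape keep working even if it accidentally materializes.

# Hypothetical sketch of a shaped zero-bias placeholder.
# Subtyping AbstractArray means generic code like reshape and
# broadcasting stays valid even if the placeholder materializes.
struct ShapedZeros{T,N} <: AbstractArray{T,N}
    size::NTuple{N,Int}
end
ShapedZeros(T::Type, dims::Integer...) = ShapedZeros{T,length(dims)}(dims)
ShapedZeros(dims::Integer...) = ShapedZeros(Float32, dims...)

Base.size(z::ShapedZeros) = z.size
Base.getindex(z::ShapedZeros{T}, i::Integer...) where T = zero(T)

With something like this, the reshape call in Conv's forward pass would return a valid all-zero array instead of throwing the MethodError above.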
The question is what reshaping it would imply. For this case it seems like some function is causing Zeros to materialize; if we can avoid that, we should be fine. Further, for the OP, understanding which function causes the zeros to materialize would again be useful.
It seems possible to make Zeros behave like an array of zeros with a given shape. I thought that could be useful so that in case it materializes, it will at least be valid. Preventing it from materializing is ofc not mutually exclusive and should perhaps be done to the largest extent possible. Isn't there a risk though that it requires a lot of special treatment in packages which should not need to care about it, e.g. CUDA (and in the future ROCArrays, or whatever it will end up being called)? In my example above it is Adapt.adapt which causes it to materialize; implementing an adapt rule for Zeros could prevent that. In summary, I still think 1 is good to prevent crashes in case it accidentally materializes, but we should also prevent it from materializing, at least in the mapping functions provided by Flux.
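A sketch of such an adapt rule, assuming Flux.Zeros is the placeholder type; whether this alone is sufficient is exactly the open question here:

using Adapt, Flux

# Sketch: leave the bias placeholder untouched when adapt traverses the
# model, so fmap-based conversions such as f32 and gpu cannot materialize it.
Adapt.adapt_structure(to, x::Flux.Zeros) = x

With a method like that in place, f32 should leave the placeholder alone rather than collapsing it to the scalar 0.0f0 seen above.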
In what kinds of areas do you think it would get in the way of CUDA? Preventing it from materializing is also so we don't accidentally train on it. Adapt for the Zeros type can be defined in Flux.
Ah, how could I not think about that! Preventing them from materializing is of course the only valid option. Throwing for incorrect dims (except Zeros ofc) is probably also good to catch errors earlier. As it is now, the error is thrown far from the cause. I'll try to make a PR for this tonight unless someone else wants to do it.
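Such an early check might look roughly like this (hypothetical helper; the name and the real constructor signature are assumptions):

using Flux

# Hypothetical helper for the Conv constructor: validate the bias length
# against the number of output channels, skipping the Zeros placeholder.
function check_bias(weight::AbstractArray{<:Any,4}, bias)
    bias isa Flux.Zeros && return bias
    nout = size(weight, 4)  # for a 2-D Conv weight, the last dim is output channels
    length(bias) == nout ||
        throw(DimensionMismatch("bias has length $(length(bias)), expected $nout"))
    return bias
end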
A PR would be good. We should define the adapt rule in Flux, if it's sufficient.
When I try to use a Conv layer without bias (Julia 1.5, CUDA.jl v1.3.3, Flux v0.11.1), I get an error on the GPU (all good on the CPU).
Minimal example:
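A sketch of that kind of reproduction, assuming the bias-free construction shown earlier in the thread and CUDA.jl's gpu conversion; the weight and input sizes are made up:

using Flux, CUDA

w = randn(Float32, 3, 3, 1, 2)          # 3×3 kernel, 1 => 2 channels (illustrative)
m = Conv(w, Flux.Zeros()) |> gpu        # Zeros() stands in for "no bias"
x = randn(Float32, 8, 8, 1, 4) |> gpu   # WHCN input batch (illustrative)
m(x)                                    # errors on the GPU per this report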
Error with julia debug level set to 2: