Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cannot differentiate getindex(::Tuple, ::Int) #61

Closed
Roger-luo opened this issue May 27, 2020 · 5 comments
Closed

cannot differentiate getindex(::Tuple, ::Int) #61

Roger-luo opened this issue May 27, 2020 · 5 comments

Comments

@Roger-luo
Copy link

I tried this very simple code on current master branch

julia> grad((x, y)->sum((x, y)[1]), rand(10), rand(10))
┌ Error: Failed to find a derivative for %5 = getindex(%3, %4)::Array{Float64,1} at position 1, current state of backpropagation saved to Yota.DEBUG_STATE
└ @ Yota ~/.julia/packages/Yota/6ZZpD/src/grad.jl:164
ERROR: Can't find differentiation rule for (getindex)(var"%3", var"%4") at 1 with types DataType[Tuple{Array{Float64,1},Array{Float64,1}}, Int64])
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] deriv_exprs(::Expr, ::Array{DataType,1}, ::Int64) at /Users/roger/.julia/packages/Yota/6ZZpD/src/diffrules/diffrules.jl:265
 [3] deriv!(::Yota.Tape, ::Yota.Call, ::Int64, ::Yota.Call) at /Users/roger/.julia/packages/Yota/6ZZpD/src/grad.jl:98
 [4] step_back!(::Yota.Tape, ::Yota.Call, ::Int64) at /Users/roger/.julia/packages/Yota/6ZZpD/src/grad.jl:158

I guess it's because of a missing primitive for getindex?

@dfdx
Copy link
Owner

dfdx commented May 27, 2020

Thanks for the report! Indeed, we have a diffrule for getindex over arrays, but not over tuples. It might be the case that fixing it will be as easy as copying the rule for arrays, but given that we treat tuples quote differently in many cases (e.g. when unpacking multiple return values from a function) I need to test it thoroughly before putting to master. I will have time to look at it tomorrow approximately at the same time.

@Roger-luo
Copy link
Author

I think since Tuple is immutable, one needs to use a zero type in ungetindex somehow, e.g

dy -> (zero(T1), dy, zero(T2), zero(T3))

but to avoid unnecessary allocations I'm wondering how you represent zero here? (for reference, in Zygote, nothing is used to represent a "symbolic" zero, or in ChainRules, it's Zero)

@dfdx
Copy link
Owner

dfdx commented May 28, 2020

Currently there's no special handling of zeros since there was very few of them in my own experience. Do you have a motivating example where it is an issue?

Unfortunately I didn't have time today to work out this issue completely, but here's the starting point (for my own reference) to proceed tomorrow:

function ungetindex(x::Tuple, dy, I...)
    dx = map(1:length(x)) do i
        i in I ? dy : zero(x[i])
    end
    return dx
end


@diffrule getindex(u::Tuple, i) u ungetindex(u, dy, i)
@diffrule getindex(u::Tuple, i, j) u ungetindex(u, dy, i, j)
@diffrule getindex(u::Tuple, i, j, k) u ungetindex(u, dy, i, j, k)
@nodiff getindex(u::Tuple, i) i
@nodiff getindex(u::Tuple, i, j) i
@nodiff getindex(u::Tuple, i, j) j
@nodiff getindex(u::Tuple, i, j, k) i
@nodiff getindex(u::Tuple, i, j, k) j
@nodiff getindex(u::Tuple, i, j, k) k

(and yes, it uses exactly the trick with zero you mentioned)

@Roger-luo
Copy link
Author

Hi thanks,

yes, this will happen when I have a function returns two arrays, e.g

function foo(..)
   # some calculation
   return A, B
end

but only one of them is used. And a more concrete example is that this is exactly how Flux's recurrent layers are defined, and I believe it could happen in many multi-return functions.

using zero gives the right result, but will also result in unnecessary large allocation

@dfdx
Copy link
Owner

dfdx commented May 29, 2020

The issue is now fixed on master and new version should be automatically tagged in a few hours.

Thanks for RNN example, it's very intuitive and relevant. Indeed, it's possible to eliminate these allocations by propagating some kind of zero indicator, however I can't think out possible consequences of such a change right now, so I've created #63 as an idea for improved memory management.

Please let me know if you have any issues with getindex(::Tuple, ...) and feel free to elaborate on #63

@dfdx dfdx closed this as completed May 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants