Forward over reverse drops custom rule if corresponding function is inlined #1795

danielwe · 2024-09-05T22:44:54Z

My custom reverse rule works as expected in first-order reverse mode, but I have to mark the corresponding function @noinline for the rule to be picked up in second-order forward over reverse. In the MWE below I've deliberately introduced a bug in the rule, so both the gradient and the Hessian-vector (hv) product should differ between g and g_custom. The gradients always differ, but the hv products only differ when I add @noinline.

using Enzyme

#=@noinline=# f(x) = sum(abs2, x)
#=@noinline=# f_custom(x) = sum(abs2, x)

g(x) = cos(f(x))
g_custom(x) = cos(f_custom(x))

function dg_deferred!(dx, x)
    make_zero!(dx)
    autodiff_deferred(Reverse, g, Active, Duplicated(x, dx))
    return nothing
end

function dg_custom_deferred!(dx, x)
    make_zero!(dx)
    autodiff_deferred(Reverse, g_custom, Active, Duplicated(x, dx))
    return nothing
end

function EnzymeRules.augmented_primal(
    config::EnzymeRules.Config, f::Const{typeof(f_custom)}, ::Type{<:Active}, x::Duplicated
)
    tape = EnzymeRules.overwritten(config)[2] ? copy(x.val) : nothing
    primal = EnzymeRules.needs_primal(config) ? f.val(x.val) : nothing
    return EnzymeRules.AugmentedReturn(primal, nothing#=shadow=#, tape)
end

function EnzymeRules.reverse(
    config::EnzymeRules.Config,
    ::Const{typeof(f_custom)},
    dret::Active,
    tape,
    x::Duplicated,
)
    xval = EnzymeRules.overwritten(config)[2] ? tape : x.val
    x.dval .= (2dret.val) .* xval
    x.dval .^= 2  # Deliberate bug as signature of custom rule 🐛
    return (nothing,)
end

x = [2.0]
dx, dx_custom = make_zero(x), make_zero(x)

v = first(onehot(x))
hv, hv_custom = make_zero(v), make_zero(v)

# gradients
dg_deferred!(dx, x)
@show dx

dg_custom_deferred!(dx_custom, x)
@show dx_custom

# hvps
autodiff(Forward, dg_deferred!, Const, Duplicated(dx, hv), Duplicated(x, v))
@show hv

autodiff(
    Forward,
    dg_custom_deferred!,
    Const,
    Duplicated(dx_custom, hv_custom),
    Duplicated(x, v),
)
@show hv_custom

Output as written: different gradients, equal hv products.

dx = [3.027209981231713]
dx_custom = [9.164000270468907]
hv = [11.971902924433648]
hv_custom = [11.971902924433648]

Output with @noinline: both gradients and hv products different.

dx = [3.027209981231713]
dx_custom = [9.164000270468907]
hv = [11.971902924433648]
hv_custom = [72.48292805436535]
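
As a sanity check on which numbers are the "right" ones, here is a hand derivation for the one-element input x = [2.0]. This is a sketch that assumes the buggy rule effectively returns the square of the true gradient (since x.dval is zeroed before the reverse pass), so the custom hv product should be the derivative of that square. The symbols g_c' and hv_c are shorthand introduced here for the custom-rule results, not names from the MWE.

```latex
\begin{align*}
g(x)   &= \cos(x^2), \\
g'(x)  &= -2x\,\sin(x^2),
  & g'(2)  &= -4\sin 4 \approx 3.0272, \\
g''(x) &= -2\sin(x^2) - 4x^2\cos(x^2),
  & g''(2) &= -2\sin 4 - 16\cos 4 \approx 11.9719, \\
g_c'(x) &= \bigl(g'(x)\bigr)^2,
  & g_c'(2) &\approx 9.1640, \\
hv_c(x) &= \frac{d}{dx}\bigl(g'(x)\bigr)^2 = 2\,g'(x)\,g''(x),
  & hv_c(2) &\approx 2 \cdot 3.0272 \cdot 11.9719 \approx 72.4829.
\end{align*}
```

These values agree with the @noinline output above: hv_custom = 72.48... is what differentiating through the custom rule should give, while hv_custom = 11.97... in the inlined case is just g''(2), i.e. the result with the custom rule ignored.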


wsmoses · 2024-12-07T06:36:12Z

@danielwe with the just-landed fixes to nesting (in particular, calling regular autodiff on the inside instead of autodiff_deferred), does this still err?
