Add one-arg method Duplicated(x)
#2118
base: main
Codecov Report
All modified and coverable lines are covered by tests ✅
@@ Coverage Diff @@
## main #2118 +/- ##
==========================================
+ Coverage 67.50% 70.22% +2.71%
==========================================
Files 31 42 +11
Lines 12668 15784 +3116
==========================================
+ Hits 8552 11084 +2532
- Misses 4116 4700 +584
|
My only concern about this is that it's rather reverse-mode centric. On the other hand, since there's no universally obvious candidate for what to put in there for forward mode, I guess it's fine to make this method favour reverse mode? |
That is true. I don't think there's a downside for forward mode, just no benefit. Unless you think that for forward mode, something like
julia> DualNumbers.Dual(42.)
42.0 + 0.0ɛ
... where zero dual is a bit useless, but does construct the type you requested. And
julia> v = [DualNumbers.Dual(33.0, 1.0)];
julia> push!(v, 42)
2-element Vector{Dual128}:
33.0 + 1.0ɛ
42.0 + 0.0ɛ |
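The zero-tangent behaviour in the comment above can be reproduced without any packages. The following is only a minimal sketch in plain Julia: `ToyDual` is a hypothetical stand-in, not the DualNumbers.jl type, and its field names are assumptions made for illustration.

```julia
# Hypothetical toy dual number; NOT the DualNumbers.jl implementation.
struct ToyDual <: Number
    value::Float64    # primal part
    epsilon::Float64  # tangent (dual) part
end

# One-argument constructor: the tangent part defaults to zero.
ToyDual(x::Real) = ToyDual(Float64(x), 0.0)

# Conversion from a plain real number applies the same zero-tangent rule.
# push! calls convert on the new element, so it can accept 42 below.
Base.convert(::Type{ToyDual}, x::Real) = ToyDual(x)

v = [ToyDual(33.0, 1.0)]
push!(v, 42)  # the integer 42 becomes ToyDual(42.0, 0.0)
```

The second element carries a zero tangent, mirroring the point made above: the one-argument form builds a value of the requested type even though a zero dual part is not useful on its own.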
Yeah, I don't see any obvious use for it for enzyme so it's probably fine. |
I feel like as long as If we had different user-facing duplicated types for forward and reverse, e.g.,
|
Oh no, now I'm worried this PR will turn into a punching bag for API / documentation complaints. From a human UI perspective, it does seem a bit odd to re-use the exact same
julia> xdx = Duplicated(3.0, 100.0);
julia> autodiff(Forward, abs2, xdx) # ok, like dual numbers
(600.0,)
julia> autodiff(Reverse, abs2, Active, xdx) # maybe this should be an error?
((nothing,),)
julia> xdx # Duplicated{Float64} is never a useful container for reverse mode
Duplicated{Float64}(3.0, 100.0)
julia> y = (array = [1.0, 2.0], float = 3.0);
julia> ydy = Duplicated(y, make_zero(y));
julia> autodiff(Reverse, y -> sum(y.array .* y.float), Active, ydy)
((nothing,),)
julia> ydy.dval # this isn't useless, but does require some understanding
(array = [3.0, 3.0], float = 0.0)
You need Anyway, this PR changes nothing at all about forward mode. Please make a separate issue to discuss changing from Duplicated to some new Dual for forward mode. (Or to discuss more clearly documenting things.) The one thing that is on-topic for this PR is that we could make |
I'm very down to make/enforce the use of a separate Dual type for forward mode (which the Rust Enzyme stuff does), but yeah that's separate from here, if you want to open a different issue/PR on that. Obviously that would be quite breaking [and need corresponding checks throughout enzyme]. Part of the reason for the one duplicated for both is that from the implementation (as opposed to the user side) they are very much the same, in the sense that a separate shadow data structure is created and maintained (hence the very literal name "duplicated" as in we duplicated the data structure from primal to shadow) |
I don't think that's a fair response. Like it or not, If we did disentangle the APIs so Forward mode didn't use |
Instead of writing
this wants to make it easy to just look after one thing,
and later get
dx = xdx.dval
out when you need it. Or not... add methods which consume the gradient like update!(::Duplicated) instead of unpacking.
Should there also be DuplicatedNoNeed(x), and perhaps BatchDuplicated(x, n::Int)? Or [edit] perhaps MixedDuplicated(x) is the closest other struct here.
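To make the idea in this description concrete, here is a rough, self-contained sketch in plain Julia. It does not use Enzyme and is not the PR's implementation: `ToyDuplicated`, `zero_like`, the `lr` keyword, and this particular `update!` are all hypothetical names chosen for illustration.

```julia
# Hypothetical stand-in for Enzyme's Duplicated; NOT the real type.
struct ToyDuplicated{T}
    val::T   # primal value
    dval::T  # shadow, where a reverse pass would accumulate the gradient
end

# Hypothetical helper: build a zeroed shadow with the same structure as x.
zero_like(x::Number) = zero(x)
zero_like(x::AbstractArray) = map(zero_like, x)
zero_like(x::NamedTuple) = map(zero_like, x)

# One-argument constructor: allocate the zeroed shadow automatically.
ToyDuplicated(x) = ToyDuplicated(x, zero_like(x))

# Hypothetical consumer of the gradient: take a descent step in place,
# then reset the shadow so the next reverse pass starts from zero.
function update!(d::ToyDuplicated{<:AbstractArray}; lr = 0.5)
    d.val .-= lr .* d.dval
    fill!(d.dval, 0)
    return d
end

xdx = ToyDuplicated([1.0, 2.0])  # shadow starts as [0.0, 0.0]
xdx.dval .= [1.0, 2.0]           # pretend a reverse pass wrote this gradient
update!(xdx)                     # val is now [0.5, 1.0], shadow is zeroed again
```

Keeping the primal and shadow in one object means the caller only tracks `xdx`. Unpacking `xdx.dval` remains possible, but a method like the `update!` sketched here can consume the gradient directly.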