
Some tweaks to the Getting Started docs #2195

Open
wants to merge 1 commit into main

Conversation

MasonProtter

I'd like to get around to some more comprehensive documentation contributions, but I thought I'd start with some very low-hanging fruit / minor tweaks to the Getting Started page:

  • I removed the pattern `autodiff(mode, f, activity, args...)` and replaced it with `autodiff(mode, f, args...)`, since explicitly specifying the return activity is no longer needed (a short sketch of both forms follows this list)
  • Before delving into various things, I just made it clear that we expect rosenbrock(1.0, 2.0) == 100.0 and the derivative w.r.t. x should be -400.0 and the derivative w.r.t. y should be 200.0. This should hopefully help users orient themselves a bit quicker when they're trying to interpret the (perhaps confusing) outputs of the various autodiff calls.
  • I mentioned what we mean by 'primal' since this isn't necessarily a word everyone who took a calculus course knows
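To make the first two bullets concrete, here is a rough sketch (not part of the diff itself) using the `rosenbrock` definition from the page; the outputs are the values quoted in this PR:

```julia
using Enzyme

rosenbrock(x, y) = (1.0 - x)^2 + 100.0 * (y - x^2)^2

rosenbrock(1.0, 2.0)   # 100.0 -- the primal value

# Old pattern, with the return activity spelled out explicitly (still valid):
autodiff(Reverse, rosenbrock, Active, Active(1.0), Active(2.0))

# Simplified pattern used in the updated docs; Enzyme deduces the return activity:
autodiff(Reverse, rosenbrock, Active(1.0), Active(2.0))
# ((-400.0, 200.0),) -- derivative w.r.t. x is -400.0, w.r.t. y is 200.0
```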

@wsmoses (Member) commented Dec 12, 2024

So telling it the activity is still helpful: Enzyme does its best to deduce the return activity, but if something is type-unstable (e.g. as in #2194) it may fail. In the case of that issue, I presume explicitly marking the return activity as `Active` would allow the type-unstable case to succeed.

That said, that doesn't mean we need to introduce the more complex version at the start
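For reference, a minimal sketch of the explicit return-activity form being discussed, reusing `rosenbrock` and `using Enzyme` from the sketch above; whether this actually fixes the #2194-style case is the presumption stated here, not something verified:

```julia
# Explicitly annotating the return activity, so Enzyme does not have to deduce it;
# this can matter when the return type is not stably inferred.
autodiff(Reverse, rosenbrock, Active, Active(1.0), Active(2.0))
# ((-400.0, 200.0),)
```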


where we note for future reference that the value of this function at `x=1.0`, `y=2.0` is `100.0`, and its derivative
@mcabbott (Contributor) Dec 13, 2024

Consider showing this as code instead of prose?

julia> rosenbrock(x, y) = (1.0 - x)^2 + 100.0 * (y - x^2)^2;

julia> rosenbrock(xy) = (1.0 - xy[1])^2 + 100.0 * (xy[2] - xy[1]^2)^2;

julia> rosenbrock(1.0, 2.0) == rosenbrock([1.0, 2.0]) == 100.0
true

I also think you should not call the input of `rosenbrock_inp` the same thing; `x == [x, y]` is weird. The name `rosenbrock_inp` also seems a bit weird; maybe it can just be another method, or if that's too confusing, add a suffix more informative than "inp"? (I'm not sure what "inp" means, maybe "input", but why?)

Member

I think it was originally in-place (@michel2323, were you the one to originally author this doc, just by virtue of it being rosenbrock?)

But either way sure!

Contributor

Ah, OK. But this function isn't in-place; it's just going to be used somewhere below in a demonstration that Enzyme likes to handle functions which accept a Vector by mutating something else. The reader doesn't know that yet.

Member

very true, maybe `rosenbrock_array` or something? or even just `rosenbrock2`

Comment on lines 44 to +46
The return value of reverse mode [`autodiff`](@ref) is a tuple that contains as a first value
the derivative value of the active inputs and optionally the primal return value.
the derivative value of the active inputs and optionally the _primal_ return value (i.e. the
value of the undifferentiated function).
Contributor

Consider not using "value" to mean so many things here?

is a tuple that contains as a first element the derivatives of ..., and optionally the primal value (i.e. what the function returns).

Contributor

Maybe "optionally" also seems a bit odd to describe the output, not the input. It's not that you may omit this; it's that `ReverseWithPrimal` tells it to.

Member

yeah we definitely don't need to say "derivative value" and can just say "the derivative of"

Member

and "the value of the undifferentiated function" -> "the result of the original function without differentiation"

@mcabbott (Contributor) Dec 13, 2024

Also consider putting the ReverseWithPrimal case first, as without it, ((-400.0, 200.0),) seems like a puzzle to count the brackets & guess why.

Perhaps also write it with destructuring syntax, like:

derivs, y = autodiff(ReverseWithPrimal, rosenbrock, Active(1.0), Active(2.0))
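(With the scalar `rosenbrock` from the docs, this destructuring would presumably give the values shown later in this review:)

```julia
derivs, y = autodiff(ReverseWithPrimal, rosenbrock, Active(1.0), Active(2.0))
# derivs == (-400.0, 200.0)   -- derivatives w.r.t. the two Active arguments
# y == 100.0                  -- the primal value
```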

Member

oh yeah totally fair, if you want to put that in this PR that would be fine with me!

Contributor

I can't make suggestions across deleted lines :/ so this is going to be messy...

@@ -71,8 +73,9 @@ julia> dx
200.0
```

Both the inplace and "normal" variant return the gradient. The difference is that with
[`Active`](@ref) the gradient is returned and with [`Duplicated`](@ref) the gradient is accumulated in place.
Both the inplace and "normal" variant return the gradient. The difference is that with [`Active`](@ref)
@mcabbott (Contributor) Dec 13, 2024

This wording seems weird. The inplace version returns ((nothing,),), it's written right there. That's what return means. And inplace / "normal" are new terms here, which aren't the terms you need to learn to understand Enzyme.

The version with Active arguments (for immutable inputs like x::Float64) returns the gradient. The version with Duplicated (for mutable inputs like x::Vector{Float64}) instead writes the gradient into the Duplicated object, and returns nothing in the corresponding slot of the returned derivs. In fact it accumulates the gradient, i.e. if you run it again it will double dx. (See make_zero! perhaps.) In general a function may accept any mix of Active, Duplicated, and Const arguments.

IDK how much of the end goes here, but the reader should not get the impression that all arguments must have the same type.
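A minimal sketch of the accumulation behaviour described above, assuming the vector method from the docs (here `rosenbrock_inp`) and the expected values quoted in this PR; the `Enzyme.make_zero!` call is the possibility mentioned above, not a prescription:

```julia
using Enzyme

rosenbrock_inp(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

x  = [1.0, 2.0]
dx = [0.0, 0.0]                # shadow that receives the gradient

autodiff(Reverse, rosenbrock_inp, Duplicated(x, dx))
dx                             # [-400.0, 200.0]

autodiff(Reverse, rosenbrock_inp, Duplicated(x, dx))
dx                             # [-800.0, 400.0] -- the gradient accumulates

Enzyme.make_zero!(dx)          # reset the shadow before reusing it
```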

Member

perhaps we should say both compute the gradient.

And we can use whatever function names are here for clarity

Contributor

Suggested change
-Both the inplace and "normal" variant return the gradient. The difference is that with [`Active`](@ref)
+Both versions calculate the same derivatives. The difference is that with [`Active`](@ref) arguments

@codecov-commenter commented Dec 13, 2024

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.70%. Comparing base (037dfed) to head (d856c12).
Report is 286 commits behind head on main.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2195      +/-   ##
==========================================
+ Coverage   67.50%   70.70%   +3.20%     
==========================================
  Files          31       55      +24     
  Lines       12668    16302    +3634     
==========================================
+ Hits         8552    11527    +2975     
- Misses       4116     4775     +659     


@mcabbott (Contributor) left a comment

This is messy but I made many suggestions.

  • Should it still mention autodiff(Reverse, f, Active, Duplicated()) i.e. annotating the return type, somewhere? Maybe it's not the first thing to know, but it is surprising to encounter later.

  • Should this page also introduce gradient(Reverse, f, ...)? Should it do this before or after autodiff? (Both calls are sketched just below.)
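For what it's worth, a rough sketch of the two calls raised in these bullets, assuming the vector `rosenbrock` method suggested below; the exact return shape of `Enzyme.gradient` may differ between Enzyme versions:

```julia
using Enzyme

rosenbrock(xy::Vector) = (1 - xy[1])^2 + 100 * (xy[2] - xy[1]^2)^2

x, dx = [1.0, 2.0], [0.0, 0.0]

# autodiff with an explicit return-activity annotation (the form in the first bullet):
autodiff(Reverse, rosenbrock, Active, Duplicated(x, dx))
dx   # [-400.0, 200.0]

# Higher-level convenience wrapper (the form in the second bullet):
Enzyme.gradient(Reverse, rosenbrock, [1.0, 2.0])
```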

Comment on lines +36 to +40
julia> rosenbrock_inp(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2;
```

where we note for future reference that the value of this function at `x=1.0`, `y=2.0` is `100.0`, and its derivative
with respect to `x` at that point is `-400.0`, and its derivative with respect to `y` at that point is `200.0`.
@mcabbott (Contributor) Dec 13, 2024

Suggested change
-julia> rosenbrock_inp(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2;
-```
-where we note for future reference that the value of this function at `x=1.0`, `y=2.0` is `100.0`, and its derivative
-with respect to `x` at that point is `-400.0`, and its derivative with respect to `y` at that point is `200.0`.
+julia> rosenbrock(xy::Vector) = (1 - xy[1])^2 + 100 * (xy[2] - xy[1]^2)^2;
+julia> z = rosenbrock(1.0, 2.0)
+100.0
+julia> z == rosenbrock([1.0, 2.0]) # Vector method
+true
+```
+We note for future reference that the value of this function at `x=1.0`, `y=2.0` is `z=100.0`. Its derivative with respect to `x` at that point is `-400.0`, and its derivative with respect to `y` is `200.0`.

I've also removed 100.0 from the definition, as IMO this is idiomatic Julia -- rosenbrock can take & return Float32 without promoting.
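As a quick aside illustrating that point (a hypothetical check, not part of the suggestion), with integer literals the suggested definition no longer promotes `Float32` input:

```julia
rosenbrock(xy::Vector) = (1 - xy[1])^2 + 100 * (xy[2] - xy[1]^2)^2

rosenbrock(Float32[1, 2])                # 100.0f0
rosenbrock(Float32[1, 2]) isa Float32    # true (with the 100.0 literal this would be Float64)
```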


where we note for future reference that the value of this function at `x=1.0`, `y=2.0` is `100.0`, and its derivative
with respect to `x` at that point is `-400.0`, and its derivative with respect to `y` at that point is `200.0`.

## Reverse mode

The return value of reverse mode [`autodiff`](@ref) is a tuple that contains as a first value
Contributor

Suggested change
-The return value of reverse mode [`autodiff`](@ref) is a tuple that contains as a first value
+The return value of reverse mode [`autodiff`](@ref) is a tuple that contains as a first element


Comment on lines +45 to +46
the derivative value of the active inputs and optionally the _primal_ return value (i.e. the
value of the undifferentiated function).
Contributor

Suggested change
-the derivative value of the active inputs and optionally the _primal_ return value (i.e. the
-value of the undifferentiated function).
+the derivatives with respect to the inputs. The tuple's second element is the _primal_ value (i.e. the result of the original function without differentiation),
+but this is omitted if you use `Reverse` instead of `ReverseWithPrimal`:


```jldoctest rosenbrock
julia> autodiff(Reverse, rosenbrock, Active, Active(1.0), Active(2.0))
julia> autodiff(Reverse, rosenbrock, Active(1.0), Active(2.0))
Contributor

Suggested change
-julia> autodiff(Reverse, rosenbrock, Active(1.0), Active(2.0))
+julia> derivs, z = autodiff(ReverseWithPrimal, rosenbrock, Active(1.0), Active(2.0))
+((-400.0, 200.0), 100.0)
+julia> autodiff(Reverse, rosenbrock, Active(1.0), Active(2.0))

Comment on lines +52 to 53
julia> autodiff(ReverseWithPrimal, rosenbrock, Active(1.0), Active(2.0))
((-400.0, 200.0), 100.0)
Contributor

Suggested change
-julia> autodiff(ReverseWithPrimal, rosenbrock, Active(1.0), Active(2.0))
-((-400.0, 200.0), 100.0)

@@ -62,7 +64,7 @@ julia> dx = [0.0, 0.0]
0.0
0.0

julia> autodiff(Reverse, rosenbrock_inp, Active, Duplicated(x, dx))
julia> autodiff(Reverse, rosenbrock_inp, Duplicated(x, dx))
Contributor

Suggested change
-julia> autodiff(Reverse, rosenbrock_inp, Duplicated(x, dx))
+julia> autodiff(Reverse, rosenbrock, Duplicated(x, dx))


Comment on lines +77 to +78
the gradient is returned and with [`Duplicated`](@ref) the gradient is accumulated in-place into `dx`,
and a value of `nothing` is placed in the corresponding slot of the returned `Tuple`.
@mcabbott (Contributor) Dec 13, 2024

Suggested change
-the gradient is returned and with [`Duplicated`](@ref) the gradient is accumulated in-place into `dx`,
-and a value of `nothing` is placed in the corresponding slot of the returned `Tuple`.
+(for immutable inputs like `x::Float64`) it returns them as `derivs`, while the version with `Duplicated` (for mutable inputs like `x::Vector{Float64}`) instead writes the gradient into the `Duplicated` object, and returns `nothing` in the corresponding slot of the returned `derivs`.
+In fact it accumulates the gradient, i.e. if you run `autodiff` again it will double `dx`.
+In general, `autodiff` accepts any mix of `Active` and `Duplicated` function arguments, as well as `Const` and various other `Annotation` types.

@@ -121,7 +124,7 @@ julia> dx = [1.0, 1.0]
1.0
1.0

julia> autodiff(ForwardWithPrimal, rosenbrock_inp, Duplicated, Duplicated(x, dx))
julia> autodiff(ForwardWithPrimal, rosenbrock_inp, Duplicated(x, dx))
Contributor

Suggested change
-julia> autodiff(ForwardWithPrimal, rosenbrock_inp, Duplicated(x, dx))
+julia> autodiff(ForwardWithPrimal, rosenbrock, Duplicated(x, dx))
