EnzymeAD · MasonProtter · Dec 11, 2024 · mcabbott · Dec 13, 2024 · wsmoses
diff --git a/docs/src/index.md b/docs/src/index.md
@@ -31,23 +31,25 @@ Also see [Implementing pullbacks](@ref) on how to implement back-propagation for
 We will try a few things with the following functions:
 
 ```jldoctest rosenbrock
-julia> rosenbrock(x, y) = (1.0 - x)^2 + 100.0 * (y - x^2)^2
-rosenbrock (generic function with 1 method)
+julia> rosenbrock(x, y) = (1.0 - x)^2 + 100.0 * (y - x^2)^2;
 
-julia> rosenbrock_inp(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
-rosenbrock_inp (generic function with 1 method)
+julia> rosenbrock_inp(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2;
 ```
 
+where we note for future reference that the value of this function at `x=1.0`, `y=2.0` is `100.0`, and its derivative
+with respect to `x` at that point is `-400.0`, and its derivative with respect to `y` at that point is `200.0`. 
-julia> rosenbrock_inp(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2;
-```
-
-where we note for future reference that the value of this function at `x=1.0`, `y=2.0` is `100.0`, and its derivative
-with respect to `x` at that point is `-400.0`, and its derivative with respect to `y` at that point is `200.0`. 
+julia> rosenbrock(xy::Vector) = (1 - xy[1])^2 + 100 * (xy[2] - xy[1]^2)^2;
+
+julia> z = rosenbrock(1.0, 2.0)
+100.0
+
+julia> z == rosenbrock([1.0, 2.0])  # Vector method
+true
+```
+We note for future reference that the value of this function at `x=1.0`, `y=2.0` is `z=100.0`. Its derivative with respect to `x` at that point is `-400.0`, and its derivative with respect to `y` is `200.0`. 
+
 ## Reverse mode
 
 The return value of reverse mode [`autodiff`](@ref) is a tuple that contains as a first value
-The return value of reverse mode [`autodiff`](@ref) is a tuple that contains as a first value
+The return value of reverse mode [`autodiff`](@ref) is a tuple that contains as a first element
-the derivative value of the active inputs and optionally the primal return value.
+the derivative value of the active inputs and optionally the _primal_ return value (i.e. the 
+value of the undifferentiated function).
-the derivative value of the active inputs and optionally the _primal_ return value (i.e. the 
-value of the undifferentiated function).
+the derivatives with respect to the inputs. The tuple's second element is the _primal_ value (i.e. the result of the original function without differentiation),
+but this is omitted if you use `Reverse` instead of `ReverseWithPrimal`:
 
 ```jldoctest rosenbrock
-julia> autodiff(Reverse, rosenbrock, Active, Active(1.0), Active(2.0))
+julia> autodiff(Reverse, rosenbrock, Active(1.0), Active(2.0))
-julia> autodiff(Reverse, rosenbrock, Active(1.0), Active(2.0))
+julia> derivs, z = autodiff(ReverseWithPrimal, rosenbrock, Active(1.0), Active(2.0))
+((-400.0, 200.0), 100.0)
+
+julia> autodiff(Reverse, rosenbrock, Active(1.0), Active(2.0))
 ((-400.0, 200.0),)
 
-julia> autodiff(ReverseWithPrimal, rosenbrock, Active, Active(1.0), Active(2.0))
+julia> autodiff(ReverseWithPrimal, rosenbrock, Active(1.0), Active(2.0))
 ((-400.0, 200.0), 100.0)
-julia> autodiff(ReverseWithPrimal, rosenbrock, Active(1.0), Active(2.0))
-((-400.0, 200.0), 100.0)
 ```
 
@@ -62,7 +64,7 @@ julia> dx = [0.0, 0.0]
  0.0
  0.0
 
-julia> autodiff(Reverse, rosenbrock_inp, Active, Duplicated(x, dx))
+julia> autodiff(Reverse, rosenbrock_inp, Duplicated(x, dx))
-julia> autodiff(Reverse, rosenbrock_inp, Duplicated(x, dx))
+julia> autodiff(Reverse, rosenbrock, Duplicated(x, dx))
 ((nothing,),)
 
 julia> dx
@@ -71,8 +73,9 @@ julia> dx
   200.0
 ```
 
-Both the inplace and "normal" variant return the gradient. The difference is that with
-[`Active`](@ref) the gradient is returned and with [`Duplicated`](@ref) the gradient is accumulated in place.
+Both the inplace and "normal" variant return the gradient. The difference is that with [`Active`](@ref) 
-Both the inplace and "normal" variant return the gradient. The difference is that with [`Active`](@ref) 
+Both versions calculate the same derivatives. The difference is that with [`Active`](@ref) arguments
+the gradient is returned and with [`Duplicated`](@ref) the gradient is accumulated in-place into `dx`, 
+and a value of `nothing` is placed in the corresponding slot of the returned `Tuple`.
-the gradient is returned and with [`Duplicated`](@ref) the gradient is accumulated in-place into `dx`, 
-and a value of `nothing` is placed in the corresponding slot of the returned `Tuple`.
+(for immutable inputs like `x::Float64`) it returns them as `derivs`, while the version with `Duplicated` (for mutable inputs like `x::Vector{Float64}`) instead writes the gradient into `dx`, and returns `nothing` in the corresponding slot of `derivs`.
+Note that the gradient is accumulated, not overwritten: if you run `autodiff` again without first resetting `dx` to zero, the values in `dx` will double.
+In general, `autodiff` accepts any mix of `Active` and `Duplicated` function arguments, as well as `Const` and various other `Annotation` types.
 
 ## Forward mode
 
@@ -121,7 +124,7 @@ julia> dx = [1.0, 1.0]
  1.0
  1.0
 
-julia> autodiff(ForwardWithPrimal, rosenbrock_inp, Duplicated, Duplicated(x, dx))
+julia> autodiff(ForwardWithPrimal, rosenbrock_inp, Duplicated(x, dx))
-julia> autodiff(ForwardWithPrimal, rosenbrock_inp, Duplicated(x, dx))
+julia> autodiff(ForwardWithPrimal, rosenbrock, Duplicated(x, dx))
 (-400.0, 400.0)
 ```
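
As a sanity check on the doctest outputs in this diff, the reference values quoted for `x=1.0`, `y=2.0` can be derived by hand from the definition of `rosenbrock`. The symbol $f$ below is only shorthand for `rosenbrock(x, y)` and is not notation used in the docs themselves:

```math
\begin{aligned}
f(x, y) &= (1 - x)^2 + 100\,(y - x^2)^2 \\
\frac{\partial f}{\partial x} &= -2\,(1 - x) - 400\,x\,(y - x^2) \\
\frac{\partial f}{\partial y} &= 200\,(y - x^2) \\[4pt]
f(1, 2) &= 0 + 100 \cdot 1^2 = 100 \\
\left.\frac{\partial f}{\partial x}\right|_{(1,2)} &= 0 - 400 \cdot 1 \cdot 1 = -400 \\
\left.\frac{\partial f}{\partial y}\right|_{(1,2)} &= 200 \cdot 1 = 200
\end{aligned}
```

These agree with the `((-400.0, 200.0), 100.0)` shown for the `ReverseWithPrimal` call. Given the accumulation behaviour described in the diff, and assuming `dx` starts at zero, a second `Duplicated` call at the same point would be expected to leave `dx` at twice these values.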