Document performance issues for captured variables (#27282)
* Document performance issues for captured variables

These changes to the manual try to document the material
in issue 15276 about the performance of captured variables, since
it appears that this issue will not be "solved" before the release
of 1.0.

* Fix trailing whitespace

* fix trailing whitespace #2

* Make all changes suggested by ararslan

* Revise text following suggestions from Mauro, Jeff Bezanson and Martin Holters.

* Fix two typos in most recent revision
StephenVavasis authored and KristofferC committed Jun 26, 2018
1 parent f36d3de commit afbd699
Showing 4 changed files with 101 additions and 1 deletion.
8 changes: 8 additions & 0 deletions doc/src/manual/arrays.md
@@ -246,6 +246,14 @@ julia> map(tuple, (1/(i+j) for i=1:2, j=1:2), [1 3; 2 4])
(0.333333, 2) (0.25, 4)
```

Generators are implemented via inner functions. As in other cases of
inner functions in the language, variables from the enclosing scope can be
"captured" in the inner function. For example, `sum(p[i] - q[i] for i=1:n)`
captures the three variables `p`, `q` and `n` from the enclosing scope.
Captured variables can present performance challenges described in
[performance tips](@ref man-performance-tips).
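
As a small illustration of the capture described above, the following sketch wraps the
generator in a function. The name `sumdiff` is hypothetical and chosen only for this example.
Since none of the variables is reassigned after the generator is created, this particular
case is not expected to run into the performance problem.

```julia
# Hypothetical helper; `sumdiff` is not part of Base.
function sumdiff(p::Vector{Float64}, q::Vector{Float64})
    n = length(p)
    # The generator expression refers to `p`, `q` and `n`,
    # all of which live in the enclosing scope of `sumdiff`.
    return sum(p[i] - q[i] for i = 1:n)
end

sumdiff([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```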


Ranges in generators and comprehensions can depend on previous ranges by writing multiple `for`
keywords:

6 changes: 6 additions & 0 deletions doc/src/manual/functions.md
@@ -654,6 +654,12 @@ normally or threw an exception. (The `try/finally` construct will be described i
With the `do` block syntax, it helps to check the documentation or implementation to know how
the arguments of the user function are initialized.

A `do` block, like any other inner function, can "capture" variables from its
enclosing scope. For example, the variable `data` in the above example of
`open...do` is captured from the outer scope. Captured variables
can create performance challenges as discussed in [performance tips](@ref man-performance-tips).
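
A self-contained variant of the same pattern is sketched below. The function name
`write_greeting` and the use of a temporary file are assumptions made for illustration only.

```julia
# Hypothetical helper; writes a string to `path` and reads it back.
function write_greeting(path::AbstractString)
    data = "hello, world\n"      # local variable in the enclosing scope
    open(path, "w") do io
        write(io, data)          # `data` is captured by the `do` block
    end
    return read(path, String)
end

write_greeting(tempname())
```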


## [Dot Syntax for Vectorizing Functions](@id man-vectorized)

In technical-computing languages, it is common to have "vectorized" versions of functions, which
82 changes: 82 additions & 0 deletions doc/src/manual/performance-tips.md
@@ -1459,3 +1459,85 @@ The following examples may help you interpret expressions marked as containing n
field `data::Array{T}`. But `Array` needs the dimension `N`, too, to be a concrete type.
* Suggestion: use concrete types like `Array{T,3}` or `Array{T,N}`, where `N` is now a parameter
of `ArrayContainer`

## [Performance of captured variables](@id man-performance-captured)

Consider the following example that defines an inner function:
```julia
function abmult(r::Int)
    if r < 0
        r = -r
    end
    f = x -> x * r
    return f
end
```

Function `abmult` returns a function `f` that multiplies its argument by
the absolute value of `r`. The inner function assigned to `f` is called a
"closure". Inner functions are also used by the
language for `do`-blocks and for generator expressions.

This style of code presents performance challenges for the language.
The parser, when translating it into lower-level instructions,
substantially reorganizes the above code by extracting the
inner function to a separate code block. "Captured" variables such as `r`
that are shared by inner functions and their enclosing scope are
also extracted into a heap-allocated "box" accessible to both inner and
outer functions because the language specifies that `r` in the
inner scope must be identical to `r` in the outer scope even after the
outer scope (or another inner function) modifies `r`.
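
The reorganization can be pictured with a hand-written analogue. This is only a rough
sketch: the names `AnyBox`, `MultClosure` and `abmult_manual` are hypothetical, and the
code the compiler actually generates differs in its details.

```julia
# Hypothetical stand-in for the heap-allocated box: a mutable cell
# whose contents carry no type information.
mutable struct AnyBox
    contents::Any
end

# Hypothetical stand-in for the extracted inner function: a callable
# object that holds a reference to the shared box.
struct MultClosure
    r::AnyBox
end

# Every use of `r` in the inner function becomes a read from the box.
(c::MultClosure)(x) = x * c.r.contents

function abmult_manual(r0::Int)
    r = AnyBox(r0)                  # outer scope also goes through the box
    if r.contents < 0
        r.contents = -r.contents
    end
    return MultClosure(r)
end

f = abmult_manual(-3)
f(4)
```

Because `contents` is declared as `Any`, each multiplication has to look up the type of
the boxed value at run time, which mirrors the cost described above.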

The discussion in the preceding paragraph referred to the "parser", that is, the phase
of compilation that takes place when the module containing `abmult` is first loaded,
as opposed to the later phase when it is first invoked. The parser does not "know" that
`Int` is a fixed type, or that the statement `r = -r` transforms an `Int` to another `Int`.
The magic of type inference takes place in the later phase of compilation.

Thus, the parser does not know that `r` has a fixed type (`Int`),
nor that `r` does not change value once the inner function is created (so that
the box is unneeded). Therefore, the parser emits code for a
box that holds an object with an abstract type such as `Any`, which
requires run-time type dispatch for each occurrence of `r`. This can be
verified by applying `@code_warntype` to the above function. Both the boxing
and the run-time type dispatch can cause loss of performance.
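
Besides `@code_warntype`, the box can also be observed by inspecting the closure object
itself. The sketch below relies on `Core.Box`, an internal type that is an implementation
detail and may change between releases, so treat the reported field type as an assumption
about current behavior rather than a documented guarantee.

```julia
function abmult(r::Int)
    if r < 0
        r = -r
    end
    f = x -> x * r
    return f
end

f = abmult(-3)
f(4)

# The closure stores its captured variable in a field named after it.
# Here the field is expected to be an untyped box (internal detail).
fieldtype(typeof(f), :r)
```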

If captured variables are used in a performance-critical section of the code,
then the following tips help ensure that their use is performant. First, if
it is known that a captured variable does not change its type, then this can
be declared explicitly with a type annotation (on the variable, not the
right-hand side):
```julia
function abmult2(r0::Int)
    r::Int = r0
    if r < 0
        r = -r
    end
    f = x -> x * r
    return f
end
```
The type annotation partially recovers lost performance due to capturing because
the parser can associate a concrete type to the object in the box.
Going further, if the captured variable does not need to be boxed at all (because it
will not be reassigned after the closure is created), this can be indicated
with `let` blocks as follows.
```julia
function abmult3(r::Int)
    if r < 0
        r = -r
    end
    f = let r = r
        x -> x * r
    end
    return f
end
```
The `let` block creates a new variable `r` whose scope is only the
inner function. The second technique recovers full language performance
in the presence of captured variables. Note that this is a rapidly
evolving aspect of the compiler, and it is likely that future releases
will not require this degree of programmer annotation to attain performance.
In the meantime, some user-contributed packages like
[FastClosures](https://github.com/c42f/FastClosures.jl) automate the
insertion of `let` statements as in `abmult3`.
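
The effect of the `let` block can be checked by inspecting the closure returned by
`abmult3`. As a sketch, and assuming the closure keeps its captured variable in a field
named `r` (an implementation detail), the field is expected to have the concrete type
`Int` rather than a box:

```julia
function abmult3(r::Int)
    if r < 0
        r = -r
    end
    f = let r = r
        x -> x * r
    end
    return f
end

g = abmult3(-3)
g(4)

# The inner `r` is assigned exactly once, so it can be stored directly.
fieldtype(typeof(g), :r)
```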
6 changes: 5 additions & 1 deletion doc/src/manual/variables-and-scoping.md
@@ -262,7 +262,11 @@ julia> counter()
2
```

See also the closures in the examples in the next two sections. A variable
that the inner function inherits from the enclosing scope, such as `x` in the
first example and `state` in the second, is sometimes called a
*captured* variable. Captured variables can present performance challenges,
which are discussed in [performance tips](@ref man-performance-tips).

The distinction between inheriting global scope and nesting local scope
can lead to some slight differences between functions