Anamnesis [an-am-nee-sis]:
- the recollection or remembrance of the past
- recollection of the ideas which the soul had known in a previous existence, especially by means of reasoning
A package for doing unobtrusive memoization of computationally expensive functions in Julia.
This package is for Julia 0.7 and later.
The "standard" implementation of memoization is to simply declare a function that will automatically memoize all of its results. This can be done with
@anamnesis function SU2(ϕ::Real, n::AbstractVector{T}) where T<:Real
    println("Fire walk with me")
    σ⃗ = [Complex{T}[0 1; 1 0], Complex{T}[0 -im; im 0], Complex{T}[1 0; 0 -1]]
    exp(im*ϕ*n'*σ⃗)
end
SU2(0, [1, 0, 0]) # calls SU2, prints "Fire walk with me"
SU2(0, [1, 0, 0]) # the previous result is retrieved, does not print
This also works for single-line function declarations such as @anamnesis f(x) = x^2 + 1.
Note that Anamnesis.jl also supports keyword arguments in all use cases.
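For instance, a minimal sketch with a keyword argument (the function and keyword names here are purely illustrative, not part of the package):
@anamnesis function damp(x::Real; rate=1.0)
    println("evaluating damp")
    exp(-rate*x)
end
damp(1.0; rate=2.0)  # evaluates and memoizes, prints "evaluating damp"
damp(1.0; rate=2.0)  # same positional and keyword arguments, so the cached result should be returned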
One of the design goals of Anamnesis.jl is to allow memoization as unobtrusively as possible, preferably without the need to change any other code. To this end, we also provide the @anamnesis macro for blocks. For example
@anamnesis begin
    t = f(x) + g(x)
    z = exp(-λ*t)
    f(z) + f(x) - g(x)
end f
In this example, all calls to f are memoized, but calls to other functions are not. This was achieved by naming f after the block. This can be done with arbitrarily many functions: for example, if we also placed a g after the block above, calls to g would have been memoized as well (a sketch with two names follows the next example). It is not necessary to declare f with @anamnesis to do this; in fact, these are mutually exclusive use cases. Note that this can be done with any expression, not just a begin end block. For example, we could do
@anamnesis function h(x)
    o = f(x) + g(x) - f(x^2)
    f(o^2) + f(x^2)
end f
What distinguishes this example from the memoization of h itself is the extra f added to the end.
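For the multi-function case mentioned above, a sketch of a block memoizing both f and g might look like the following (assuming both are already defined; the body of the block is purely illustrative):
@anamnesis begin
    t = f(x) + g(x)
    exp(-λ*t)
end f g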
One of the simplest uses of Anamnesis.jl is the @mem macro. This macro memoizes only the specific function call to which it is applied, for example
@mem f(1) # the result is memoized
f(2) # this is a normal function call
@mem f(1) # f is not evaluated here, instead the previous value is retrieved
@mem f(2) # now the value for f(2) is stored
Anamnesis.jl uses objects called Scribes to perform memoization. These objects are just the function itself combined with a cache of previously seen values. If you prefer to use Anamnesis.jl more explicitly, you can simply use the scribe objects directly. For example
using SpecialFunctions  # gamma lives here in Julia 0.7 and later
f(x) = abs2(gamma(x))
sf = Scribe(f)
sf(0) # this result is memoized
f(0) # function calls work normally; results are not memoized
sf(0) # this of course retrieves the previously memoized result
The macros described above generally create scribe objects and place them in the current global scope. These can be accessed with the @scribeof macro. For example
@mem f(0)
@scribeof(f) # this is the scribe object generated by @mem
@scribeof(f)(0) # you can call these normally
Memoized functions created with @anamnesis on function declarations need to be accessed differently: in this case, whatever name you gave the function becomes the name of a new function that wraps the Scribe.
@anamnesis h(x) = exp(x)
@scribeof(h) # ERROR
@rawfunc(h) # this is the non-memoized function
@rawfunc(h)(0) # you can call it normally, it will not memoize
@scribeofrawfunc(h) # this is the scribe object you are calling with h itself
typeof(h) <: Function # Anamnesis is designed this way, instead of just handing you a Scribe, so that you are still creating a bona fide function
By default, Scribes use an IdDict{Any,Any} to cache their values; however, any AbstractDict object can be used in its place. One can even supply an AbstractDict with pre-existing values. For example
sf = Scribe(f, Dict{Any,Any}((1,)=>0))
sf(1) # returns 0 regardless of the return value of f
Explicitly specifying the value cache can also be useful for improving performance. If you know the argument or (more importantly) return types of your function, you can tell the compiler about them by supplying an appropriately typed AbstractDict. For example
sf = Scribe(f, Dict{Tuple{Real,Real},Float64}())
sf(1, 2) # the return value is now guaranteed to be type stable; if f does not return a Float64, an error will be thrown
sg = Scribe(g, Dict{Any,ComplexF64}()) # a more common use case is to specify only the return type
Note that return types can also be specified by overloading Anamnesis.returntype:
Anamnesis.returntype(::typeof(f), argtypes) = AbstractFloat
Anamnesis.returntype(::typeof(f), ::Type{Tuple{String}}) = AbstractString
# here you are promising the compiler that f returns an AbstractFloat, unless you pass it a String, in which case it returns an AbstractString
Note that even if you do not specify a return type in any way, Anamnesis.jl uses inference to attempt to determine one, so its output should be type stable in most cases (although it contains type unstable code internally).
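One way to check this for your own functions is Test.@inferred from the Test standard library; a minimal sketch using the typed cache from above (whether it passes depends on what inference can determine for your function):
using Test
sf = Scribe(f, Dict{Any,Float64}())
@inferred sf(1.5)  # throws if the inferred return type of the call does not match the type of the value actually returned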
Julia is a scientific programming language which was designed from the ground up with computational efficiency as one of its goals. As such, even ad hoc and "naive" functions written in Julia tend to be extremely efficient. Furthermore, any memoization implementation is liable to have some performance pain points: in general it contains type unstable code (even if it has type stable output), and it must include a value cache which is accessible from the same scope as the function itself, so global objects are potentially involved. Because of this, memoization comes with significant overhead, even with Julia's highly efficient AbstractDict implementations.
I find that a good rule of thumb is that you should never memoize functions which take less than a few microseconds to evaluate. Typically, memoization is best suited to functions which take on the order of milliseconds or more to evaluate.
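As a rough illustration of this trade-off, a minimal sketch (the slow function and the exact timings are purely illustrative):
slow(x) = (sleep(0.01); x^2)  # hypothetical stand-in for an expensive function
sslow = Scribe(slow)
@time sslow(2)  # the first call pays the full cost of slow(2)
@time sslow(2)  # repeated calls with the same argument pay only the cache lookup overhead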
I started Anamnesis.jl as an experiment, mainly with the goal of doing inter-process memoization, e.g. so that you could exit a program completely and then retrieve memoized values from serialized data on disk. I very frequently need this sort of thing when experimenting in scientific programming and data science, as I often have functions that take a very long time (on the order of seconds) to evaluate, and I don't want to re-evaluate them every time I change some unrelated piece of code. I initially had some inelegant code which worked, but it was not a good long-term solution. I've now decided to make sure I have a really good implementation of basic memoization and build up from there.
Hopefully I'll have a chance to work on some of these more "exotic" features soon; I already have concrete plans for how to achieve them.