This is an implementation of caching for direnv.
direnv-cache
caches the environment into a .env
file, and then loads from the
.env if it is still valid. This could be helpful when re-entering a directory
that has not changed since we last left it; we can restore the environment
without fully executing the .envrc.
The accompanying file cache.sh should be added into direnv’s library,
- by dropping into
$HOME/.config/direnv/lib
curl -sSL https://raw.githubusercontent.com/indigoviolet/direnv-cache/main/cache.sh -o $HOME/.config/direnv/lib/05-cache.sh
- or by appending to direnvrc
curl -sSL https://raw.githubusercontent.com/indigoviolet/direnv-cache/main/cache.sh >> $HOME/.config/direnv/direnvrc
- jq should be available
Add the use_cache
function to the top of your .envrc
:
cat envrc_with_cache
use_cache layout_poetry
If the cache does not exist or is invalid, use_cache
will fall through to the
rest of the .envrc, and then cache the resulting environment for future use.
Note that direnv reload
may no longer do what you expect: it will load from
cache if possible. Delete the .env
file to force a full reload.
These are mostly for debugging purposes, but may come in handy:
DIRENV_CACHE_IGNORE
- if set, the cache will always be ignored.
DIRENV_CACHE_DEBUG
- if set, some debugging output will be printed. if >1, more verbose debugging output is available.
A few additions to the core of direnv could make this better:
- Currently we are unable to figure out why the envrc is being evaluated: we
can’t distinguish between
precmd
,direnv reload
andchwpd
. It would be nice for direnv to hint this, perhaps via an environment variable. Since we only need to test cache validity in the last case, this could improve performance in the first two cases. Our current implementation has to test cache validity in all three cases. (See explanation) direnv current
tests 1 watched file fromDIRENV_WATCHES
at a time; we could have a higher-level check that tests whether the existing environment is current.- The output of
direnv export
requires processing to convert to dotenv format. We could supportdirenv export dotenv
.
See details below. The average improvement isn’t dramatic in my measurements
(with layout_poetry
), but it depends on how expensive the envrc is.
I use direnv for setting up Poetry (layout_poetry) or conda, and in these scenarios, the environment created via direnv doesn’t change all that often, and is amenable to caching.
I also use direnv in Emacs (envrc), and having fast loads would make my editor start up faster.
I could create a .env
manually but it’s convenient to define it using direnv’s
primitives.
direnv hook <shell>
sets updirenv export json
to be called on each prompt (precmd
) and directory change (chpwd
)- upon
chwpd
, if we are entering a directory containing anenvrc
, we execute it in a subshell, compare the env from the subshell to our own, and then apply that diff to our env. (loading). - Note that the env in the subshell that
direnv export json
is executed in, is carefully restored to the “pre-direnv” state, by revertingDIRENV_DIFF
. - Several direnv-specific state tracking env variables are set - ex.
DIRENV_FILE
(the envrc file),DIRENV_DIR
(the directory containing the envrc),DIRENV_WATCHES
(name, mtime, existence of all watched files),DIRENV_DIFF
(the diff that was applied). - Upon
chwpd
, if we already have these variables in our environment, and are leaving theDIRENV_DIR
, these tracking variables are unset, and the inverse ofDIRENV_DIFF
is applied. (unloading) - Upon
precmd
, ifDIRENV_WATCHES
is stale i.e, the watched files have changed, direnv loads again (direnv current
implements this for one file at a time). direnv watch
and friends add toDIRENV_WATCHES
, so they act as dependencies for the env state.
direnv hook zsh
_direnv_hook() { trap -- '' SIGINT; eval "$("/home/linuxbrew/.linuxbrew/Cellar/direnv/2.30.3/bin/direnv" export zsh)"; trap - SIGINT; } typeset -ag precmd_functions; if [[ -z "${precmd_functions[(r)_direnv_hook]+1}" ]]; then precmd_functions=( _direnv_hook ${precmd_functions[@]} ) fi typeset -ag chpwd_functions; if [[ -z "${chpwd_functions[(r)_direnv_hook]+1}" ]]; then chpwd_functions=( _direnv_hook ${chpwd_functions[@]} ) fi
Caching is only useful when re-entering a directory that hasn’t changed in the interim. In this case, we would like to restore our previous state.
use_cache
is the first statement in theenvrc
, so it can short circuit if loading from cache.Here are the scenarios when the envrc is executed:
(use_cache sees a DIRENV_WATCHES containing only the envrc & allow. files)
invocation mode DIRENV_WATCHES cache verification needed? cache action precmd set, stale no - known to be invalid rebuild direnv reload set, irrelevant no - forced reload rebuild chdir (enter) unset or from a previous RC yes - might be stale rebuild if cache is valid Unfortunately, there doesn’t appear to be any way to know which of these invocation modes we are in – since the envrc always executes in a “clean” subshell.
All we know is that direnv wants to execute the envrc; we can test whether the cache is valid (based on whether the cached DIRENV_WATCHES is stale), and rebuild if it is not, or load from cache if valid.
- building the cache: run
direnv export json
in a clean subshell, and convert that intodotenv
format into.env
(usingjq
) - if the cache is valid: load it via
dotenv_if_exists
, otherwise build it - some extra env switches are provided to help debug things:
DIRENV_CACHE_IGNORE
,DIRENV_CACHE_DEBUG
DIRENV_WATCHES is in gzenv format, ie base64-urlencoded + zlib + json
direnv show_dump $DIRENV_WATCHES
echo $DIRENV_WATCHES | python -c "import sys; import zlib; import base64; print(zlib.decompress(base64.urlsafe_b64decode(sys.stdin.read())).decode('utf-8'))" | jq '.'
{ printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" ; echo $DIRENV_WATCHES | basenc --base64url -d ; } | gzip -d | jq '.'
dotenv_if_exists
will usually watch_file
.env
, which modifies DIRENV_WATCHES
,
but then immediately the DIRENV_WATCHES
from the cache will overwrite this, so
that .env will not be watched.
Do we even want to watch the cache file? I don’t think so: users shouldn’t be
modifying it directly; if deleted, it will get recreated the next time direnv
tries to load something.
Attempting to get the cache file into DIRENV_WATCHES is tricky:
- DIRENV_WATCHES is captured in the subshell, and won’t contain .env by default. We do need to capture DIRENV_WATCHES, since the .envrc could be registering files to watch.
- the first problem is mentioned above:
dotenv_if_exists
willwatch_file
on the cache file but the resulting DIRENV_WATCHES will be lost when the cache is actually loaded. - So we need to
watch_file .env
after the cache is created and loaded; this generates a new DIRENV_WATCHES containing the current stat of .env. But if we modify .env after this to update the cached value of DIRENV_WATCHES, our cache will appear invalid (since DIRENV_WATCHES is stale), and we will rebuild the cache. - The trick could be to first update .env with a DIRENV_WATCHES value that
includes itself, and then the env, as below. Here we are appending a second
export
of DIRENV_WATCHES to .env, which will override the earlier one.
{ direnv watch json .env | jq -r '"export DIRENV_WATCHES=\(.DIRENV_WATCHES|@sh)"' >> .env; eval $(direnv watch zsh .env); }
[tool.poetry]
name = "direnv-cache-test"
version = "0.1.0"
description = "Test project for benchmarking direnv-cache."
authors = ["Venky Iyer <indigoviolet@gmail.com>"]
[tool.poetry.dependencies]
python = "^3.8"
[build-system]
requires = ["poetry>=0.12"]
build-backend = "poetry.masonry.api"
use_cache
layout_poetry
layout_poetry
python 3.8.1
brew install hyperfine
cp cache.sh ~/.config/direnv/05-cache.sh
icdiff cache.sh ~/.config/direnv/05-cache.sh
WITH_CACHE_DIR=/tmp/with_cache WITHOUT_CACHE_DIR=/tmp/without_cache
rm $WITH_CACHE_DIR $WITHOUT_CACHE_DIR -rf
mkdir $WITH_CACHE_DIR $WITHOUT_CACHE_DIR
ln -sf $(realpath pyproject.toml) $WITH_CACHE_DIR/
ln -sf $(realpath tool-versions) $WITH_CACHE_DIR/
( cd $WITH_CACHE_DIR && poetry install )
ln -sf $(realpath envrc_with_cache) $WITH_CACHE_DIR/.envrc
direnv allow $WITH_CACHE_DIR/.envrc
ln -sf $(realpath pyproject.toml) $WITHOUT_CACHE_DIR/
ln -sf $(realpath tool-versions) $WITHOUT_CACHE_DIR/
( cd $WITHOUT_CACHE_DIR && poetry install )
ln -sf $(realpath envrc_without_cache) $WITHOUT_CACHE_DIR/.envrc
direnv allow $WITHOUT_CACHE_DIR/.envrc
:
export DIRENV_CACHE_DEBUG=1
direnv exec "$WITH_CACHE_DIR" bash -c "ls $WITH_CACHE_DIR/.env -al"
:
hyperfine -w 10 -L dir "$WITH_CACHE_DIR","$WITHOUT_CACHE_DIR" 'cd {dir}'
Benchmark 1: cd /tmp/with_cache Time (mean ± σ): 0.0 ms ± 0.1 ms [User: 0.1 ms, System: 0.1 ms] Range (min … max): 0.0 ms … 1.5 ms 3353 runs Benchmark 2: cd /tmp/without_cache Time (mean ± σ): 0.1 ms ± 0.1 ms [User: 0.1 ms, System: 0.1 ms] Range (min … max): 0.0 ms … 4.6 ms 3140 runs Summary 'cd /tmp/with_cache' ran 1.13 ± 3.65 times faster than 'cd /tmp/without_cache'
# shellcheck disable=SC2155
# shellcheck disable=SC1090
use_cache() {
[[ -v DIRENV_CACHE_IGNORE ]] && {
_debug "Ignoring cache, DIRENV_CACHE_IGNORE is set"
return
}
[[ ${DIRENV_CACHE_DEBUG:-0} -gt 1 ]] && {
set_x
set -uo pipefail
}
local cache_filename=${1:-.env}
local cache_file=$(get_cache_file "$cache_filename")
# if cache exists and nonzero
if [[ -s "$cache_file" ]]; then
# Load preemptively
load_cache "$cache_file"
# Then verify (and reload if necessary)
verify_cache "$cache_file"
else
_debug "Rebuilding cache: ${cache_file} missing or zero"
build_and_load_cache "$cache_file"
fi
exit $?
}
get_cache_file() {
# Ensure the cache file is in the same directory as the RC file
local cache_filename=${1:?"Cache filename is required"}
local rcfile=$(find_up ".envrc")
builtin echo -n "${rcfile%%/.*}/$cache_filename"
}
verify_cache () {
local cache_file=${1:?"Cache file required"}
# runs direnv current for all .Path in $DIRENV_WATCHES (in parallel)
# xargs will return 0 only if the command is successful for all inputs
direnv show_dump "$DIRENV_WATCHES" | jq -r '.[]|.Path' | xargs -n1 -P0 direnv current
local status=$?
if [[ $status -gt 0 ]]; then
_debug "Cache is stale, rebuilding"
build_and_load_cache "$cache_file"
fi
}
build_cache() {
local cache_file=${1:?"Cache file required"}
if [[ -v DIRENV_CACHE_DEBUG ]]; then
local stderr_file=$(mktemp)
else
local stderr_file=/dev/null
fi
# we use json/jq because the bash export uses $'' c-strings which are not
# easy to get rid of with sed
# DIRENV_LOG_FORMAT='' will turn off direnv logging
# DIRENV_CACHE_IGNORE=1 so that we can build the cache without using it
local cache_contents=$(
set -o pipefail
env DIRENV_CACHE_IGNORE=1 DIRENV_LOG_FORMAT="" direnv export json 2>"$stderr_file" | jq -r 'to_entries | map("export \(.key)=\(.value|@sh)")[]'
)
local status=$?
if [[ -v DIRENV_CACHE_DEBUG ]]; then
local stderr_content=$(<"$stderr_file") && rm "$stderr_file"
else
local stderr_content=""
fi
if [[ $status -eq 0 ]]; then
_debug "Built cache: ${cache_file} contents: <${cache_contents}> stderr: <$stderr_content>"
builtin echo -n "$cache_contents" >"$cache_file" || _debug "Cache build failed while writing to $cache_file"
return
else
_debug "Cache build failed: $stderr_content"
return $status
fi
}
load_cache() {
local cache_file=${1:?"Cache file required"}
# we could use dotenv instead, but we don't need `watch_file`, and this is compatible?
source "$cache_file" || {
_debug "Cache load failed: $cache_file"
exit $?
}
_debug "Loaded from cache $cache_file"
}
build_and_load_cache() {
local cache_file=${1:?"Cache file required"}
build_cache "$cache_file" || {
_debug "Cache build failed"
exit $?
}
load_cache "$cache_file"
}
_debug() {
# Return status of this function is always the previous status.
#
# Prints $1 if DIRENV_CACHE_DEBUG is set. (Note that you probably have to
# ~export~ it, not just set it, since all this code runs in a subshell)
{
local status=$?
[[ -o xtrace ]] && {
shopt -uo xtrace
local xtrace_was_on=1
}
} 2>/dev/null
local msg=${1:?"Message required"}
[[ -v DIRENV_CACHE_DEBUG ]] && echo "$msg (status: $status)" >&2
{
[[ ${xtrace_was_on:-0} -eq 1 ]] && shopt -so xtrace
return $status
} 2>/dev/null
}
# Local Variables:
# sh-shell: bash
# End: