This repository has been archived by the owner on May 5, 2023. It is now read-only.

Using mtime.secs for hash is incompatible with Nix managed configurations #42

Closed
Steven0351 opened this issue Dec 19, 2021 · 7 comments

Comments

@Steven0351

I've migrated to using Nix to manage my dotfiles and system configuration, which sets all file timestamps to 0. After migrating my fledgling neovim configuration over, I noticed that none of my changes took effect after updating my configs unless I deleted ~/.cache/nvim/luacache. This was not an issue before I migrated my configuration over to Nix. After digging into the source, I see the hash function is as follows:

local function hash(modpath)
  local stat = fs_stat(modpath)
  if stat then
    return stat.mtime.sec
  end
end

To make this more robust, an actual hash of the file contents would need to be computed, though I imagine that would also slow things down. However, if the goal is for this plugin to eventually be merged into neovim core, I think this would impact every Nix neovim user if it were merged with the hash function as-is.
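
The failure mode can be reproduced outside Neovim. What follows is a minimal Python sketch, not the plugin's code: `mtime_key` is a hypothetical stand-in for the Lua `hash` function above, and `write_pinned` uses `os.utime` to pin the modification time to 0, imitating what this report says Nix does to its files.

```python
import os
import tempfile


def mtime_key(path):
    """Hypothetical stand-in for the plugin's hash(): whole seconds of mtime."""
    try:
        return int(os.stat(path).st_mtime)
    except FileNotFoundError:
        return None


def write_pinned(path, text, mtime=0):
    """Write text, then pin atime/mtime to a fixed value (assumed Nix-like behaviour)."""
    with open(path, "w") as f:
        f.write(text)
    os.utime(path, (mtime, mtime))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "init.lua")
    write_pinned(path, "vim.o.number = true\n")
    before = mtime_key(path)
    write_pinned(path, "vim.o.number = false\n")  # content changed, mtime did not
    after = mtime_key(path)

# The key is identical although the file changed, so a cache keyed on it is never invalidated.
stale_hit = before == after
```

With the timestamp pinned, `stale_hit` is true: the key carries no information about the edit.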

@lewis6991
Owner

lewis6991 commented Dec 20, 2021

Sorry I don't understand why using the modified time for cache invalidation is wrong or why nix has issues with it? Can you elaborate more?

Using a full file hash would probably nullify any benefits this plugin gives, since it would require loading all the files it is caching in order to create a hash. There would be little point.

Is the problem that nix has issues with libuv or something? And if so shouldn't that be fixed?

@lewis6991
Owner

lewis6991 commented Dec 20, 2021

Python has had (and solved) this problem already in PEP 552:

The current Python pyc format is the marshaled code object of the module prefixed by a magic number, the source timestamp, and the source file size. The presence of a source timestamp means that a pyc is not a deterministic function of the input file’s contents—it also depends on volatile metadata, the mtime of the source. Thus, pycs are a barrier to proper reproducibility.

Distributors of Python code are currently stuck with the options of

  1. not distributing pycs and losing the caching advantages
  2. distributing pycs and losing reproducibility
  3. carefully giving all Python source files a deterministic timestamp (see, for example, bpo-29708: support SOURCE_DATE_EPOCH env var in py_compile (allow for reproducible builds of python packages) python/cpython#296)
  4. doing a complicated mixture of 1. and 2. like generating pycs at installation time

None of these options are very attractive. This PEP proposes allowing the timestamp to be replaced with a deterministic hash. The current timestamp invalidation method will remain the default, though. Despite its nondeterminism, timestamp invalidation works well for many workflows and usecases. The hash-based pyc format can impose the cost of reading and hashing every source file, which is more expensive than simply checking timestamps. Thus, for now, we expect it to be used mainly by distributors and power use cases.
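
For reference, the two invalidation modes the PEP describes are exposed in Python 3.7+ through `py_compile.PycInvalidationMode`. The sketch below compiles one source file both ways and reads the flags word that follows the magic number in the pyc header. The file names and the `pyc_flags` helper are made up for illustration.

```python
import os
import py_compile
import struct
import tempfile


def pyc_flags(pyc_path):
    """Return the 32-bit little-endian flags word that follows the 4-byte magic number."""
    with open(pyc_path, "rb") as f:
        header = f.read(8)
    return struct.unpack("<I", header[4:8])[0]


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "mod.py")
    with open(src, "w") as f:
        f.write("x = 1\n")

    # Default-style pyc: validated against the source's mtime and size.
    ts_pyc = py_compile.compile(
        src,
        cfile=os.path.join(tmp, "ts.pyc"),
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    # Hash-based pyc: validated against a hash of the source contents.
    hash_pyc = py_compile.compile(
        src,
        cfile=os.path.join(tmp, "hash.pyc"),
        invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH,
    )

    ts_flags = pyc_flags(ts_pyc)      # bit 0 clear: timestamp-based
    hash_flags = pyc_flags(hash_pyc)  # bit 0 set: hash-based; bit 1 set: check the source
```

The trade-off is the one quoted above: the hash-based mode stays correct when timestamps are meaningless, at the cost of reading every source file on validation.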

@Steven0351
Author

Steven0351 commented Dec 20, 2021

> Is the problem that nix has issues with libuv or something? And if so shouldn't that be fixed?

It's not really anything to do with libuv specifically. It's just that any timestamp-based approach to caching files managed by Nix will always produce a cache hit, because the timestamps never change.

Here's an example from my system:

❯ ll ~/.config/nvim/init.lua
lrwxr-xr-x 84 stevensherry 19 Dec 14:54 /Users/stevensherry/.config/nvim/init.lua -> /nix/store/j0c14n9i00df7cf815lpmxspjxf4qv07-home-manager-files/.config/nvim/init.lua

That looks fine, but that timestamp is for the symlink. Here is the timestamp for the final file (in my case this is symlink -> symlink -> concrete file):

❯ ll /nix/store/053hwqs15iz3qsvs8pdw0pgngciq3h1k-hm_nvim/init.lua
.r--r--r-- 523 root 31 Dec  1969 /nix/store/053hwqs15iz3qsvs8pdw0pgngciq3h1k-hm_nvim/init.lua
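
The difference between the two listings can be illustrated with a small Python sketch. It assumes the plugin's `fs_stat` is libuv's, which (like POSIX `stat`) follows symlinks. The file names are placeholders, and the pinned mtime of 0 mirrors what is described in this thread rather than being taken from a real store path.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "store-init.lua")  # stands in for the /nix/store file
    link = os.path.join(tmp, "init.lua")          # stands in for ~/.config/nvim/init.lua
    with open(target, "w") as f:
        f.write("-- config\n")
    os.utime(target, (0, 0))   # pin the target's mtime (assumption from the thread)
    os.symlink(target, link)   # the link itself gets a fresh timestamp

    link_mtime = os.lstat(link).st_mtime    # the symlink's own mtime, as `ll` shows it
    target_mtime = os.stat(link).st_mtime   # a symlink-following stat sees the target
```

A stat that follows the link reports the pinned value, so the recent timestamp on the symlink never reaches the cache key.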

The SHA256 hash is the same:

❯ openssl dgst -sha256 ~/.config/nvim/init.lua
SHA256(/Users/stevensherry/.config/nvim/init.lua)= c341e0a9a52d2a4802a5e179b93241c56bb913d80bcbcc3f5c4d5dc8db14cab8

❯ openssl dgst -sha256 /nix/store/053hwqs15iz3qsvs8pdw0pgngciq3h1k-hm_nvim/init.lua
SHA256(/nix/store/053hwqs15iz3qsvs8pdw0pgngciq3h1k-hm_nvim/init.lua)= c341e0a9a52d2a4802a5e179b93241c56bb913d80bcbcc3f5c4d5dc8db14cab8

The problem arises when I do an update: the impatient cache is not invalidated, because Nix sets all timestamps in /nix/store to 0 (this is a feature, not a bug).

I guess this is a long-winded way of trying to figure out if this is just meant to be good enough for the majority use-case. If so, I respect that decision and feel free to close the issue if you see no value in this 👍.

@lewis6991
Owner

We can take the same approach python has, where the current solution of using mtime will work well for 95% of users and thus should be the default.

If you want to put forward a PR that allows an alternative way of validating the cache then I'm happy to review it. However, note that Neovim currently doesn't have any hashing functions available, so one would need to be added somehow; either implemented from scratch or imported from a third party library.
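
As a rough illustration of what content-based validation would involve, here is a Python sketch rather than anything Neovim provides. `content_key` is a hypothetical name, SHA-256 is chosen only because it appears earlier in this thread, and the mtime is pinned to 0 to show that the key no longer depends on it.

```python
import hashlib
import os
import tempfile


def content_key(path, chunk_size=65536):
    """Hypothetical cache key: SHA-256 of the file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "init.lua")
    keys = []
    for text in ("vim.o.number = true\n", "vim.o.number = false\n"):
        with open(path, "w") as f:
            f.write(text)
        os.utime(path, (0, 0))  # same pinned mtime for both versions
        keys.append(content_key(path))

# The two versions get different keys even though their timestamps are equal.
changed = keys[0] != keys[1]
```

The cost is visible in `content_key` itself: every cached file has to be opened and read in full before the cache can be trusted, which is the overhead raised earlier in the thread.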

@lewis6991 lewis6991 changed the title from "Using mtime.secs for hash causing issues with Nix managed configurations" to "Using mtime.secs for hash is incompatible with Nix managed configurations" on Jan 12, 2022
@azuwis

azuwis commented Mar 1, 2022

I hit the same problem, and use:

  home.activation.neovim = lib.hm.dag.entryAfter [ "writeBoundary" ] ''
    rm -f ~/.cache/nvim/luacache_chunks ~/.cache/nvim/luacache_modpaths
  '';

as a workaround.

@lewis6991
Owner

I've opened #50, so the hash uses the file size too. It doesn't completely fix this issue but it should mitigate it a bit since the cache will work, though with a much higher chance of a false-positive cache hit.
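
The residual risk can be sketched as follows. This is an illustrative Python approximation, not the code from #50: `stat_key` is a hypothetical key built from mtime seconds and file size, and the mtime is pinned to 0 as in the Nix case described above.

```python
import os
import tempfile


def stat_key(path):
    """Hypothetical key in the spirit of the change above: (mtime seconds, size)."""
    st = os.stat(path)
    return (int(st.st_mtime), st.st_size)


def key_for(tmp, text):
    """Write one version of the file with a pinned mtime and return its key."""
    path = os.path.join(tmp, "init.lua")
    with open(path, "w") as f:
        f.write(text)
    os.utime(path, (0, 0))  # pinned mtime (assumption mirroring the report)
    return stat_key(path)


with tempfile.TemporaryDirectory() as tmp:
    base = key_for(tmp, "local width = 80\n")
    longer = key_for(tmp, "local width = 120\n")    # size differs: change is detected
    same_size = key_for(tmp, "local width = 72\n")  # size equal: still a false cache hit
```

An edit that changes the file's length now invalidates the cache, while an edit that keeps the length the same is still missed when the timestamp is fixed.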

@lewis6991
Owner

Closing this.

The hashing is now pretty similar to what pycache does; if that isn't good enough for Nix, then that's Nix's problem.
