From ae6d7e3c7cf5c9a66ae4126b7c752556de8edb00 Mon Sep 17 00:00:00 2001 From: Wout Mertens Date: Fri, 11 Aug 2017 23:51:27 +0200 Subject: [PATCH 01/10] Intensional Store --- rfcs/0000-intensional-store.md | 110 +++++++++++++++++++++++++++++++++ 1 file changed, 110 insertions(+) create mode 100644 rfcs/0000-intensional-store.md diff --git a/rfcs/0000-intensional-store.md b/rfcs/0000-intensional-store.md new file mode 100644 index 000000000..b11656d20 --- /dev/null +++ b/rfcs/0000-intensional-store.md @@ -0,0 +1,110 @@ +--- +feature: intensional_store +start-date: 2017-08-11 +author: Wout.Mertens@gmail.com +co-authors: (find a buddy later to help our with the RFC) +related-issues: (will contain links to implementation PRs) +--- + +# Summary +[summary]: #summary + +One paragraph explanation of the feature. + +# Motivation +[motivation]: #motivation + +* Better re-use of inputs between compiles +* Faster updates via nix-channel +* Less compiling +* More benefit from reproducible compiles, so more reason to work on that + +# Detailed design +[design]: #detailed-design + +Terms used: +* derivation: a `nix-build` output product depending on some inputs and resulting in a file or directory under `/nix/store` +* dependent derivation: a derivation built using the currently considered derivation +* `$out`: name of the location where a derivation is built, e.g., `zyb6qaasr5yhh2r4484x01cy87xzddn7-unit-script-1.12` + * calculated based on the hashes of all the inputs, including build tools +* `$cas`: output hash, the total hash of all the files under $out, with the derivation name appended, e.g., `qqzyb6bsr5yhh2r5624x01cy87xzn7aa-unit-script-1.12` + +## Concept + +The basic concept is aliasing equivalent input derivations in such a way that dependent derivations won't need to change if only `$out` changes but not the input derivation contents. + +After building a derivation, `$cas` is calculated, and `$out` is renamed to `$cas`. 
Then, if another build requires the input `$out`, it gets `$cas` instead, and all references to that build input will be `$cas` instead of `$out`. That dependent derivation will also have its input hash calculated with the `$cas` instead of the `$out`. + +This means that if 2 different derivations of the same input have a different `$out` but the same `$cas`, any dependent builds will not need to rebuild due to the inputs being different. For example, the 12MB input `poppler-data` is often the same across multiple different input derivations, so many `$out`s for `poppler-data` all result in the same `$cas`. Similarly, a compiler flag change might leave most derivations unchanged. + +In order to know which `$out`s refer to a particular `$cas`, symlinks can be used (`$out` pointing to `$cas`), or that data can be stored in the store database. The database can help with doing reverse lookups from `$cas` to all the `$out`s. + +## Calculating `$cas` + +There is one important corner case that needs special handling: if a derivation refers to itself, it will be referring to `$out`, because `$cas` is not known at the time of the build. This means that each `$out` of a furthermore equivalent build would have a different hash, due to the different `$out`s. + +To fix this, the `$cas` calculation has to replace all occurrences of `$out` with an equal-length string of (for example) NULL bytes. After that, `$out` is renamed to `$cas` and all occurrences of `$out` are replaced with `$cas`. + +This also means that `$out` and `$cas` should have the same length. The easiest way to achieve that is to use the same hash function for the output hash as used for the input hashes. + +To calculate `$cas` we need to include all the data that uniquely defines a derivation: the file contents, case-sensitive names, and the permission bits, traversed in a fixed order, no matter what the filesystem or platform. Not to be included are the owning `uid` of the store and timestamps. 
+ +## Distributing derivations + +Since `$cas` is only known when `$out` is built, binary caches would need to retain that information. When you look up `$out` to see if it was built already, the response should be _"Yes, this is available as `$cas`"_. + +## Maintaining the Nix store + +When garbage collecting, the Nix store should also remove `$out` references (be they symlinks or db entries) when removing a `$cas`. + +## Micro-optimizations not worth considering + +* By stripping the version from `$cas`, it could be the same for multiple versions of the same derivation. + * However, increased version numbers mean the derivation actually changed, so there is no point in doing that. +* By stripping the name and the version from `$cas`, it could be the same for multiple different derivations. + * However, this makes it hard to find out what derivation a certain `$cas` is + * Furthermore, different inputs with the same contents are very unlikely, and there is no reduction in builds that need to be done. + +Finally, `nix-store` supports hardlinking duplicate files, so the above optimizations are useless. + +# Drawbacks +[drawbacks]: #drawbacks + +* Extra code to maintain +* Slightly more processing after a build + +# Alternatives +[alternatives]: #alternatives + +* No change: This is only an optimization, it won't change the fundamental working of Nix in any way + +# Unresolved questions +[unresolved]: #unresolved-questions + +* Whether to store mappings as symlinks or db entries +* Exactly how the Hydra protocol needs to be changed + +# Future work +[future]: #future-work + +## Input-agnostic derivations + +If a derivation with a new input is the same except that it has a changed reference to that input (e.g., a script referring to its interpreter, or a binary using a new library version), we call this an input-agnostic derivation for those two input versions (old and new input). 
+ + * To detect this, calculate the hash over the derivation, replacing *all* input references with NULL bytes. If that resulting hash is the same as a previous derivation, it is input-agnostic for those versions. + * This means that instead of downloading for installing it, it could be patched together from the previous version, by patching the old input `$cas`s with the new `$cas`s. + * This could keep storage and network traffic for Hydra down, by storing the previous `$cas` and the strings that need to be patched. + +### …and beyond: + +Knowing this also could enable a building shortcut: If a dependent derivation needs rebuilding, and a previous version is available depending on an input-agnostic derivation, it could be generated by patching in the new `$cas`. + +This will not always work, i.e., when the input-agnostic derivation is used to copy data from the input it is agnostic over, it results in a change besides the input reference. + +Therefore, this optimization should be optional, defaulting to off. + +## Reproducible builds + +If two derivations are the same except for some irrelevant build-environment changes, they won't get the same `$cas`. Since this impacts rebuilds, there is more incentive to have fully reproducible builds. + +Hopefully this means we'll have it at some point, so we can crowd-source `$out` to `$cas` mappings by trusting many systems that get the same result. 
From a7b3772ba484ddd4bc6abd8294ae93ae7d71c0e5 Mon Sep 17 00:00:00 2001 From: Wout Mertens Date: Fri, 11 Aug 2017 23:55:01 +0200 Subject: [PATCH 02/10] Rename to PR # --- rfcs/{0000-intensional-store.md => 0017-intensional-store.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename rfcs/{0000-intensional-store.md => 0017-intensional-store.md} (100%) diff --git a/rfcs/0000-intensional-store.md b/rfcs/0017-intensional-store.md similarity index 100% rename from rfcs/0000-intensional-store.md rename to rfcs/0017-intensional-store.md From 520e3e2c99b8a4876dfbcf1bf2c110568383efd2 Mon Sep 17 00:00:00 2001 From: Dmitry Kalinkin Date: Fri, 11 Aug 2017 18:55:17 -0400 Subject: [PATCH 03/10] self-references may not be discoverable by grep --- rfcs/0017-intensional-store.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/rfcs/0017-intensional-store.md b/rfcs/0017-intensional-store.md index b11656d20..b2ce9222b 100644 --- a/rfcs/0017-intensional-store.md +++ b/rfcs/0017-intensional-store.md @@ -25,7 +25,7 @@ One paragraph explanation of the feature. 
Terms used: * derivation: a `nix-build` output product depending on some inputs and resulting in a file or directory under `/nix/store` * dependent derivation: a derivation built using the currently considered derivation -* `$out`: name of the location where a derivation is built, e.g., `zyb6qaasr5yhh2r4484x01cy87xzddn7-unit-script-1.12` +* `$out`: name of the location where a derivation is installed first, e.g., `zyb6qaasr5yhh2r4484x01cy87xzddn7-unit-script-1.12` * calculated based on the hashes of all the inputs, including build tools * `$cas`: output hash, the total hash of all the files under $out, with the derivation name appended, e.g., `qqzyb6bsr5yhh2r5624x01cy87xzn7aa-unit-script-1.12` @@ -37,7 +37,7 @@ After building a derivation, `$cas` is calculated, and `$out` is renamed to `$ca This means that if 2 different derivations of the same input have a different `$out` but the same `$cas`, any dependent builds will not need to rebuild due to the inputs being different. For example, the 12MB input `poppler-data` is often the same across multiple different input derivations, so many `$out`s for `poppler-data` all result in the same `$cas`. Similarly, a compiler flag change might leave most derivations unchanged. -In order to know which `$out`s refer to a particular `$cas`, symlinks can be used (`$out` pointing to `$cas`), or that data can be stored in the store database. The database can help with doing reverse lookups from `$cas` to all the `$out`s. +In order to know which `$out`s refer to a particular `$cas`, symlinks can be used (`$out` pointing to `$cas`), or that data can be stored in the store database. The database can help with doing reverse lookups from `$cas` to all the `$out`s. Using symlinks will have a benefit of handling the case when self-references that are not discoverable via grep (e.g. filtered by ```xxd(1)```). 
## Calculating `$cas` From ee3ed3a1b99ebe4c5e387b29565a0a5e35b2d6b3 Mon Sep 17 00:00:00 2001 From: Wout Mertens Date: Thu, 18 Jul 2019 16:13:43 +0200 Subject: [PATCH 04/10] add shepherd team MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-Authored-By: Domen Kožar --- rfcs/0017-intensional-store.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/rfcs/0017-intensional-store.md b/rfcs/0017-intensional-store.md index b2ce9222b..a63727e85 100644 --- a/rfcs/0017-intensional-store.md +++ b/rfcs/0017-intensional-store.md @@ -3,6 +3,8 @@ feature: intensional_store start-date: 2017-08-11 author: Wout.Mertens@gmail.com co-authors: (find a buddy later to help our with the RFC) +shepherd-team: Shea Levy, Vladimír Čunát, Eelco Dolstra, Nicolas B. Pierron +shepherd-leader: Shea Levy related-issues: (will contain links to implementation PRs) --- From 3fda171f9676f35f4bf62a52461d70f9da739537 Mon Sep 17 00:00:00 2001 From: Wout Mertens Date: Thu, 12 Dec 2019 06:48:57 +0100 Subject: [PATCH 05/10] Update rfcs/0017-intensional-store.md Co-Authored-By: Benjamin Staffin --- rfcs/0017-intensional-store.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rfcs/0017-intensional-store.md b/rfcs/0017-intensional-store.md index a63727e85..d5bf3f6e2 100644 --- a/rfcs/0017-intensional-store.md +++ b/rfcs/0017-intensional-store.md @@ -67,7 +67,7 @@ When garbage collecting, the Nix store should also remove `$out` references (be * However, this makes it hard to find out what derivation a certain `$cas` is * Furthermore, different inputs with the same contents are very unlikely, and there is no reduction in builds that need to be done. -Finally, `nix-store` supports hardlinking duplicate files, so the above optimizations are useless. +Finally, `nix-store` supports hardlinking duplicate files, so the above optimizations are superfluous. 
# Drawbacks [drawbacks]: #drawbacks From 1c7f74926e55a5e8302a89c201fab1672124cc96 Mon Sep 17 00:00:00 2001 From: Wout Mertens Date: Thu, 5 Nov 2020 13:55:47 +0100 Subject: [PATCH 06/10] 17: complete rework --- rfcs/0017-intensional-store.md | 291 ++++++++++++++++++++++++++------- 1 file changed, 228 insertions(+), 63 deletions(-) diff --git a/rfcs/0017-intensional-store.md b/rfcs/0017-intensional-store.md index d5bf3f6e2..ad0c3a075 100644 --- a/rfcs/0017-intensional-store.md +++ b/rfcs/0017-intensional-store.md @@ -8,105 +8,270 @@ shepherd-leader: Shea Levy related-issues: (will contain links to implementation PRs) --- -# Summary -[summary]: #summary +# Intensional Store -One paragraph explanation of the feature. +## TODO / to explain -# Motivation -[motivation]: #motivation +- flesh out Trust DB locations and updating/querying/merging multiple +- query service +- efficient distribution of mappings +- explain the benefits of late binding and how it improves installs on low-power systems +- script that uses nix make-content-addressable in /var/lib/nix to generate a store +- how binary caches provide \$cas and mappings -* Better re-use of inputs between compiles -* Faster updates via nix-channel -* Less compiling -* More benefit from reproducible compiles, so more reason to work on that +## Overview -# Detailed design -[design]: #detailed-design +This RFC builds on the implementation of RFC 62 to maximize its benefits. The store becomes decoupled from the metadata. 
-Terms used: -* derivation: a `nix-build` output product depending on some inputs and resulting in a file or directory under `/nix/store` -* dependent derivation: a derivation built using the currently considered derivation -* `$out`: name of the location where a derivation is installed first, e.g., `zyb6qaasr5yhh2r4484x01cy87xzddn7-unit-script-1.12` - * calculated based on the hashes of all the inputs, including build tools -* `$cas`: output hash, the total hash of all the files under $out, with the derivation name appended, e.g., `qqzyb6bsr5yhh2r5624x01cy87xzn7aa-unit-script-1.12` +### Benefits -## Concept +By using content hashes instead of output hashes, we can: -The basic concept is aliasing equivalent input derivations in such a way that dependent derivations won't need to change if only `$out` changes but not the input derivation contents. +- optimize resource usage +- reduce binary substitution trust to a single lookup +- make the Nix store network-writeable and world-shareable +- predefine mappings from output to content hash without building +- store paths can be verified without access to the Nix Store DB -After building a derivation, `$cas` is calculated, and `$out` is renamed to `$cas`. Then, if another build requires the input `$out`, it gets `$cas` instead, and all references to that build input will be `$cas` instead of `$out`. That dependent derivation will also have its input hash calculated with the `$cas` instead of the `$out`. +Additionally, this is an opportunity to move the Nix store to a filesystem location supported by most non-NixOS systems, namely `/var/lib/nix`. -This means that if 2 different derivations of the same input have a different `$out` but the same `$cas`, any dependent builds will not need to rebuild due to the inputs being different. For example, the 12MB input `poppler-data` is often the same across multiple different input derivations, so many `$out`s for `poppler-data` all result in the same `$cas`. 
Similarly, a compiler flag change might leave most derivations unchanged. +By "cleaning up" the filesystem state of Nix, a host of possibilities emerge: -In order to know which `$out`s refer to a particular `$cas`, symlinks can be used (`$out` pointing to `$cas`), or that data can be stored in the store database. The database can help with doing reverse lookups from `$cas` to all the `$out`s. Using symlinks will have a benefit of handling the case when self-references that are not discoverable via grep (e.g. filtered by ```xxd(1)```). +- Boot a cloud VM to a specific system by passing a `$cas` name for stage2. The stage1 will auto-download the stage2 if it's missing and switch to it. +- Cross-compiling can generate `$cas` entries that are reused for native compiles via `$out` mapping. This is useful on low-resource platforms. +- The Nix store doesn't require any support or metadata. On embedded systems, all management of the store can be performed outside the system. +- References to `$cas` entries, such as profiles, are no longer tied to a single system. +- A FUSE filesystem could auto-install `$cas` entries as they are referenced, hanging the I/O until the entry is downloaded and verified. +- You can copy a store from some other install, and immediately use profiles without having their metadata. +- Different Nix tooling and metadata implementations can use the same store -## Calculating `$cas` +…and so on. Decoupling systems brings exponential possibilities. -There is one important corner case that needs special handling: if a derivation refers to itself, it will be referring to `$out`, because `$cas` is not known at the time of the build. This means that each `$out` of a furthermore equivalent build would have a different hash, due to the different `$out`s. +### Drawbacks -To fix this, the `$cas` calculation has to replace all occurrences of `$out` with an equal-length string of (for example) NULL bytes. 
After that, `$out` is renamed to `$cas` and all occurrences of `$out` are replaced with `$cas`. +There are some small drawbacks: -This also means that `$out` and `$cas` should have the same length. The easiest way to achieve that is to use the same hash function for the output hash as used for the input hashes. +- we have to assume that a `$cas` collision is impossible in practice +- by removing the derivation name from the store paths, the store becomes more opaque and requires good tooling for manual management +- garbage collection is more complex when the store is shared between hosts +- `$cas` entries without metadata are opaque, and might contain malware or illegal content. If nothing references it, there is no problem with the content. Garbage collection takes care of unused entries. +- A hash collision would allow inserting malware into a widely used `$cas`. This is already possible without `$cas`, but trusting the hashes may lead to wider cache use. Remedies include using secure hashes, scanning for malware, using multiple hashes and comparing between binary caches, … +- An attacker can provide malicious `$out` => `$cas` mappings. This is already somewhat possible via binary caches. To remedy, trust mappings might be signed by a trusted key. -To calculate `$cas` we need to include all the data that uniquely defines a derivation: the file contents, case-sensitive names, and the permission bits, traversed in a fixed order, no matter what the filesystem or platform. Not to be included are the owning `uid` of the store and timestamps. +### Terminology -## Distributing derivations +We assume the following process when wanting to install a given package attribute `$attr`: -Since `$cas` is only known when `$out` is built, binary caches would need to retain that information. When you look up `$out` to see if it was built already, the response should be _"Yes, this is available as `$cas`"_. 
+- Nix evaluates the desired expressions and determines that a certain output hash `$out` is required +- `$out` is looked up in the Trust DB, to possibly yield `$cas`, its Content Addressable Storage hash +- if `$cas` is known: + - if `$cas` is present in the store, `$attr` is already installed; Done. + - if `$cas` is present on a binary cache, it is downloaded to the store; Done. +- `$out` is built using the normal mechanisms (see RFC 62 for more details) +- its `$cas` is calculated (see RFC 62 for more details) +- the generated metadata is stored in the Trust DB; Done. -## Maintaining the Nix store +### A note on reproducibility -When garbage collecting, the Nix store should also remove `$out` references (be they symlinks or db entries) when removing a `$cas`. +There is no need for a given `$out` to always generate the same `$cas`. It allows better resource use, but doesn't change anything about this RFC. There is no obligation that a single `$out` only stores a single `$cas` entry. -## Micro-optimizations not worth considering +## Nix Store -* By stripping the version from `$cas`, it could be the same for multiple versions of the same derivation. - * However, increased version numbers mean the derivation actually changed, so there is no point in doing that. -* By stripping the name and the version from `$cas`, it could be the same for multiple different derivations. - * However, this makes it hard to find out what derivation a certain `$cas` is - * Furthermore, different inputs with the same contents are very unlikely, and there is no reduction in builds that need to be done. +## FHS compatibility -Finally, `nix-store` supports hardlinking duplicate files, so the above optimizations are superfluous. +Since we're working on the store layer, we have the opportunity to split up the current `/nix` directory and make it FHS compliant. This makes it easier to use Nixpkgs where creating `/nix` is not possible. 
-# Drawbacks -[drawbacks]: #drawbacks +An [informal discussion](https://discourse.nixos.org/t/nix-var-nix-opt-nix-usr-local-nix/7101) concluded that the Store should be located at `/var/lib/nix` for maximum compatibility. -* Extra code to maintain -* Slightly more processing after a build +As for the contents of `/nix/var`: -# Alternatives -[alternatives]: #alternatives +- `/nix/var/log` should go under central or per-user log. +- `/nix/var/nix`: + - `db`: Store database. Its contents will be spread across Trust DBs. + - `daemon-socket`: Builder service. Should move to appropriate location for service sockets, like `/run` + - `gc`: See the Garbage Collection section + - `profiles`: Are now maintained per user/system + - User profiles, channels etc go under `$NIX_CONF_DIR/{profiles,channels,auto_roots}` per user + - System profiles, default user profile, channels etc go under `/nix/var/nix-profiles/{system,user,channels,auto_roots}` + - `temproots`, `userpool`: Builder service. Should move to appropriate locations for services, like `/var/tmp` and `/var/lib` -* No change: This is only an optimization, it won't change the fundamental working of Nix in any way +### Contents -# Unresolved questions -[unresolved]: #unresolved-questions +The Store should be verifiable, and only contain verifiable paths. However, to allow atomic installation over the network, there should be a directory for staging an installation. Some other operations also need supporting directories. -* Whether to store mappings as symlinks or db entries -* Exactly how the Hydra protocol needs to be changed +For cosmetics and wildcard expansion, we hide supporting directories from regular view. -# Future work -[future]: #future-work +Therefore, these are the Store contents, all part of the same mount point to ensure atomic semantics: -## Input-agnostic derivations +- `$cas`: a self-validating derivation. 
Any path matching the proper hash length is subject to verification at any time, and is moved to `.quarantaine` if verification fails +- `.prepare`: this directory can be used by anyone to prepare a derivation before adding it to the Store, by picking a non-conflicting subpath +- `.stage`: after preparing, the derivation is moved here +- `.daemon`: if there is a store daemon, it might use this path to prepare installation +- `.quarantaine`: whenever a non-compliant path is encountered, it is moved here +- `.links`: used to hard-link identical store files +- `.gc`: used to communicate about garbage collection +- anything else doesn't belong in the Store and should be removed -If a derivation with a new input is the same except that it has a changed reference to that input (e.g., a script referring to its interpreter, or a binary using a new library version), we call this an input-agnostic derivation for those two input versions (old and new input). +The timestamps of files/directories are kept at 0, and the user and group ownership are recommended to be a single user, for example `root:root` or `store:store`. +Note that for a shared store, two systems might see different ownership values; this is acceptable. - * To detect this, calculate the hash over the derivation, replacing *all* input references with NULL bytes. If that resulting hash is the same as a previous derivation, it is input-agnostic for those versions. - * This means that instead of downloading for installing it, it could be patched together from the previous version, by patching the old input `$cas`s with the new `$cas`s. - * This could keep storage and network traffic for Hydra down, by storing the previous `$cas` and the strings that need to be patched. +### Metadata -### …and beyond: +For a given store path, there is no objective metadata other than the path itself. Required dependencies, name, output hash and so on are trusted data, and as such should be stored in the Trust DB. 
-Knowing this also could enable a building shortcut: If a dependent derivation needs rebuilding, and a previous version is available depending on an input-agnostic derivation, it could be generated by patching in the new `$cas`. +It could be said that the required dependencies are somewhat objective, but one build may decide that a certain runtime dependency is necessary while another may not. Other than that, it's also inconvenient to have to pass metadata out-of-band. -This will not always work, i.e., when the input-agnostic derivation is used to copy data from the input it is agnostic over, it results in a change besides the input reference. +However, having the list of dependencies can be useful, and therefore we make it optional. If a `$cas` entry is a directory and contains the file `nix-dependencies` as a direct child, this file can contain `$cas` names (without path), separated by newlines. Whenever processing dependencies, these entries are considered. Out-of-band metadata can note extra dependencies, but can't strike a dependency. -Therefore, this optimization should be optional, defaulting to off. +One use case for in-band dependencies is a self-installing application, where all you need is a `$cas` to get the entire application. -## Reproducible builds +It is up to the build to decide when to include `nix-dependencies` and if it should include transitive dependencies. -If two derivations are the same except for some irrelevant build-environment changes, they won't get the same `$cas`. Since this impacts rebuilds, there is more incentive to have fully reproducible builds. +The name `nix-dependencies` is chosen because it's unlikely to clash with package files. Furthermore, it is only valid if it contains nothing but `$cas` names separated by single newlines. -Hopefully this means we'll have it at some point, so we can crowd-source `$out` to `$cas` mappings by trusting many systems that get the same result. 
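
The validity rule for `nix-dependencies` can be illustrated with a small parser. This is only a sketch under stated assumptions: the pattern for a `$cas` name (32 lowercase alphanumeric characters), the tolerance for one trailing newline, and the treatment of an empty file as "no dependencies" are all guesses made for illustration, and `parse_nix_dependencies` is a hypothetical helper name.

```python
import re
from typing import List, Optional

# Assumption: a `$cas` name is 32 lowercase alphanumeric characters.
# The real length and alphabet depend on the hash chosen in RFC 62.
CAS_NAME = re.compile(r"[0-9a-z]{32}")


def parse_nix_dependencies(text: str) -> Optional[List[str]]:
    """Return the listed `$cas` names, or None if the file is not valid.

    The file is valid only if it contains nothing but `$cas` names
    (without path) separated by single newlines.
    """
    # Assumption: one optional trailing newline is tolerated.
    body = text[:-1] if text.endswith("\n") else text
    if body == "":
        # Assumption: an empty file means "no dependencies".
        return []
    names = body.split("\n")
    # A blank line (double newline) or a path prefix fails the full match.
    if all(CAS_NAME.fullmatch(name) for name in names):
        return names
    return None
```

A consumer would ignore the file entirely when the function returns `None`, since an invalid `nix-dependencies` is just an ordinary package file.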
+### Trust DB + +Here is a selection of metadata generated when building a given output: + +- `$name` (can include version number and output name) +- inputs / dependencies (including runtime) +- size +- build time and duration +- description (from derivation) + +For installation and searching, these are relevant for each store path `$cas`: + +- list of `$out` that are known to result in `$cas` +- `$cas` of all dependencies +- `$name` +- description +- size + +These have to be consulted for any installation request, and therefore they should be easy to retrieve. Build hosts can provide a log of new and changed entries, enabling differential updates. Binary caches could provide a query service (both for "what does `$out` map to" and "what is `$cas`"). + +### Sharing the Nix Store + +Since the Nix Store (minus supporting directories) contains only self-validating paths, it can be shared "infinitely", only limited by: + +- disk space +- network performance +- confidence around hash collision attacks +- confidence around writers corrupting paths without detection + +The installation step only involves moving a proposed path from `.prepare` to `.stage`, so no further communication is necessary with the Store daemon. + +For single-user installs, the Store can trivially be maintained by the Nix tools, and converting to multi-user is only a matter of changing the permissions. + +It would even be possible to use FUSE to automatically download any paths that are referenced in the Store, hanging the I/O request while they are being downloaded. + +### Store Daemon + +Optionally, a daemon can maintain the Store. In this case, it is recommended to be the only user with write access. It performs installations and verifications, described below. + +### Preparing + +`$cas` entries are prepared by building them in a path that has the same total length as its final `$cas` entry. 
This means they can be built practically anywhere, in `/tmp`, in a home directory, in the `.prepare` directory in the Store, etc. It is recommended to make a part of the path a unique-per-build string. +To make sure only the build's own references need rewriting, it is recommended to build using only `$cas` entries as dependencies, instead of relying on rewriting paths. + +After the build, its `$cas` is calculated and any occurrences of the build path are replaced with `/var/lib/nix/$cas`. + +If there are undetected build path references, they might cause the finished entry to work incorrectly, and they will cause `$cas` to differ on every build of `$out`. This must be handled on a case-by-case basis. + +The build can be performed by a sandboxing build daemon like `nix-build`, but that is not a requirement. + +After preparing, the metadata for the build is added to the user's Trust DB. + +### Installation + +We use rename semantics to provide atomic installations. Prepared `$cas` entries are moved to their final location with a `rename` call, which is atomic but requires the path to be on the same filesystem. + +Atomicity is important to ensure that `$cas` entries are always valid. If they are copied instead, they don't self-validate during the copy. + +#### with Store Daemon + +Any user with write access to `/var/lib/nix/.prepare` and `/var/lib/nix/.stage` can ask for entries to be installed. To do so: + +1. They prepare entries to be stored in `/var/lib/nix/.prepare`, renaming each entry to its `$cas`. +1. They atomically move prepared paths to `/var/lib/nix/.stage`, in reverse dependency order, meaning dependencies of an entry are moved first. + +When the Store daemon discovers a new `$cas` entry under `.stage`: + +1. If the Store already contains this `$cas` entry, it removes this new one, perhaps first verifying the Store copy. +1. 
It recursively changes ownership of the path to itself and timestamps to 0, making sure that write permission is removed for everybody, and read permission is added for everybody. + If it has no permissions to do this, it instead copies the path into `/var/lib/nix/.daemon`, and another process will need to keep `.stage` clean. +1. The daemon verifies the hash. If the hash doesn't match, it removes the path. +1. If the path is a directory and there is a `nix-dependencies` file as a direct child, it checks that all dependencies are already present in the Store. If not, the path is held for a while and deleted if the dependencies don't appear in time (configurable). +1. It atomically moves the path into `/var/lib/nix`. + +Note that to ensure atomicity, `.prepare` and `.stage` need to be on the same filesystem, and either `.stage` or `.daemon` needs to be on the same filesystem as the Store. + +#### without Store Daemon + +Any user with write access to `/var/lib/nix/.stage` and `/var/lib/nix` can install entries. To do so: + +1. They prepare entries to be stored in `/var/lib/nix/.stage`, renaming each entry to its `$cas`. +1. They atomically move prepared entries to `/var/lib/nix`, in reverse dependency order, meaning dependencies of an entry are moved first. + +Note that to ensure atomicity, `.stage` needs to be on the same filesystem as the Store. + +Note that when two writers are trying to install the same path, one of them might get an error, but the end result will be the same (as long as the `$cas` is self-valid). + +### Verification + +A path in the Store is verified by calculating its `$cas`. If the `$cas` doesn't match, the path is moved to `/var/lib/nix/.quarantaine`, where a sysadmin has to investigate. + +Any process with write access to `/var/lib/nix` and `/var/lib/nix/.quarantaine` can do this, for example the Store daemon. 
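
The verification step can be sketched as follows. This is an illustration, not the RFC's implementation: the real `$cas` calculation is defined by RFC 62, so `toy_cas` below is a deliberately simplified stand-in (a truncated SHA-256 over relative file names and contents), and `verify_entry` is a hypothetical helper name. The only behaviour taken from the text is "recompute the hash, and move the path to `.quarantaine` on mismatch".

```python
import hashlib
import os
from pathlib import Path
from typing import Callable


def toy_cas(path: Path) -> str:
    """Stand-in content hash, NOT the RFC 62 algorithm.

    Hashes relative file names and file contents in a fixed (sorted) order,
    so the result does not depend on where the entry is located.
    """
    digest = hashlib.sha256()
    if path.is_file():
        files = [path]
    else:
        files = sorted(p for p in path.rglob("*") if p.is_file())
    for f in files:
        digest.update(f.relative_to(path).as_posix().encode())
        digest.update(b"\0")
        digest.update(f.read_bytes())
    return digest.hexdigest()[:32]


def verify_entry(
    store: Path,
    name: str,
    compute_cas: Callable[[Path], str] = toy_cas,
) -> bool:
    """Return True if `store/name` hashes to `name`; otherwise quarantine it."""
    entry = store / name
    if compute_cas(entry) == name:
        return True
    quarantine = store / ".quarantaine"
    quarantine.mkdir(exist_ok=True)
    # rename is atomic when both paths are on the same filesystem.
    os.rename(entry, quarantine / name)
    return False
```

Passing the hash function as a parameter keeps the quarantine logic independent of whichever hashing scheme the Store ends up using.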
+ +### Garbage collection + +Garbage collection needs to identify store paths that are not used by anything on any of the systems sharing the same store. Here we propose a simple mechanism for coordination, but any mechanism is acceptable. + +- a host with store write access decides to run garbage collection +- it checks that `.gc/running_gc` does not exist or contains a very old timestamp, and writes a unique number to `.gc/will_gc` +- after waiting long enough to prevent collisions (for example 10 seconds), it reads `.gc/will_gc` and verifies it contains the unique number it wrote +- then it clears out `.gc` except for the files `.gc/will_gc` and adds the file `.gc/running_gc` containing the current timestamp +- each host's store daemon monitors `.gc/running_gc` at some interval, for example 1 minute +- while this file exists, the daemon must record its required `$cas` entries, by creating a 0-length file, named `.gc/$cas` + - entries that are referenced in `nix-dependencies` don't have to be marked +- the writer waits long enough for all the hosts to record their GC roots, for example 10 minutes +- after the wait period expired, the writer host scans for store paths that are not marked and not part of a marked entry's `nix-dependencies`. Each path is atomically moved to `.gc` and deleted +- finally, the writer host empties the `.gc` directory, leaving the `running_gc` file for last + +For a single-user installation or a non-shared Nix store, none of this is necessary, and the GC process remains unchanged, except for the new locations to search for GC roots. See next section. + +## Profiles + +A profile is a named reference to a `$cas` store path, to be used in arbitrary ways. To allow GC, the Store must know about all profiles, so they should be available at predictable paths. 
+ +The current versioning system is reused: A profile with the name `$profile` is a symlink which points to `$profile-$v-link`, which is a symlink that points to the absolute path of the `$cas` entry. Tools like `nix-env` and `home-manager` can maintain the set of links for a profile. + +Since a profile points to an immutable `$cas` path, it is the same across systems and can therefore be part of a network-mounted home directory. + +However, a profile link itself is trusted information, and should be shared between users and systems only when they trust each other. + +Known paths that can contain profiles: + +- `$NIX_CONF_DIR/profiles`: per-user profiles and auto roots +- `/var/lib/nix-profiles`: NixOS system and shared profiles, and auto roots, maintained by the `root` user + +Other paths can of course be used, but the GC won't know about them, so a link should be maintained in one of the directories above. + +Example: This creates or updates the user-specific profile "vscode", adding Python 3: + +```sh +nix-env -p ~/.config/nix/profiles/vscode -i python3 +``` + +## Migration + +There is no real need for migrating stores, since `/nix` and `/var/lib/nix` can coexist and the tooling either uses one or the other. However, it is convenient to migrate built artifacts for implementing this RFC. + +To migrate an existing output from `/nix/store/$out-$name` to `/var/lib/nix/$cas`, the following approach will work most of the time: + +- migrate all its dependencies using the below steps +- for all files of `$out-$name`, replace all strings of the form `/nix/store/$out-$name` with `/var/lib/nix/$filler/$cas`. `$cas` has the same length as `$out` and the minimum length of `$name` is 1, so there is always room for the full `$cas`. The `$filler` is a string with length `l = length($name) - 1` of the form `./././/`, that is, repeat `./` `floor(l/2)` times and append `/` if `l` is odd. 
+- do the same with symlinks, but consider relative paths as well +- self-references must be updated after `$cas` is known, their contents must be skipped while calculating `$cas`, as described elsewhere +- once `$cas` is determined, the patched derivation can be placed in `/var/lib/nix/$cas` and the metadata recorded in the Trust DB + +This process will fail if the derivation refers to the Store in ways that aren't visible, like different string encoding and calculated paths. From e9c3340eff9dd39b4d1b84f1000b3f0ecee143e5 Mon Sep 17 00:00:00 2001 From: Wout Mertens Date: Thu, 5 Nov 2020 14:13:34 +0100 Subject: [PATCH 07/10] 17: updates from comments --- rfcs/0017-intensional-store.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/rfcs/0017-intensional-store.md b/rfcs/0017-intensional-store.md index ad0c3a075..d59623701 100644 --- a/rfcs/0017-intensional-store.md +++ b/rfcs/0017-intensional-store.md @@ -32,6 +32,7 @@ By using content hashes instead of output hashes, we can: - make the Nix store network-writeable and world-shareable - predefine mappings from output to content hash without building - store paths can be verified without access to the Nix Store DB +- public trust mappings allow detecting non-reproducible builds Additionally, this is an opportunity to move the Nix store to a filesystem location supported by most non-NixOS systems, namely `/var/lib/nix`. @@ -175,7 +176,7 @@ To make sure only the build's own references need rewriting, it is recommended t After the build, its `$cas` is calculated and any occurences of the build path are replaced with `/var/lib/nix/$cas`. -If there undetected build path references, they might cause the finished entry to work incorrectly, and they will cause `$cas` to differ on every build of `$out`. This must be handled on a case-by-case basis. 
+If there undetected build path references, they might cause the finished entry to work incorrectly, and they will cause `$cas` to differ on every build of `$out`. This must be handled on a case-by-case basis. Perhaps we'll need pluggable hash rewriters. The build can happen by a sandboxing build daemon like `nix-build`, but that is not a requirement. From d4ad873a0baae096fad96937cbfba067b14db66a Mon Sep 17 00:00:00 2001 From: Wout Mertens Date: Sun, 8 Nov 2020 16:38:59 +0100 Subject: [PATCH 08/10] 17: add summary and expand on details --- rfcs/0017-intensional-store.md | 76 ++++++++++++++++++++++++++++------ 1 file changed, 64 insertions(+), 12 deletions(-) diff --git a/rfcs/0017-intensional-store.md b/rfcs/0017-intensional-store.md index d59623701..84369b2bb 100644 --- a/rfcs/0017-intensional-store.md +++ b/rfcs/0017-intensional-store.md @@ -19,9 +19,15 @@ related-issues: (will contain links to implementation PRs) - script that uses nix make-content-addressable in /var/lib/nix to generate a store - how binary caches provide \$cas and mappings -## Overview +## Summary -This RFC builds on the implementation of RFC 62 to maximize its benefits. The store becomes decoupled from the metadata. +This RFC builds on the implementation of RFC 62 to maximize its benefits. 
+ +- Decouple metadata (Trust DB) from the Store, keep per-user +- Move the Store to `/var/lib/nix` +- Store can be shared read-write on a network share +- `nix-daemon` becomes optional +- Store paths optionally provide dependency list in-band ### Benefits @@ -33,6 +39,7 @@ By using content hashes instead of output hashes, we can: - predefine mappings from output to content hash without building - store paths can be verified without access to the Nix Store DB - public trust mappings allow detecting non-reproducible builds +- easily switch between single- and multi-user setup Additionally, this is an opportunity to move the Nix store to a filesystem location supported by most non-NixOS systems, namely `/var/lib/nix`. @@ -46,18 +53,19 @@ By "cleaning up" the filesystem state of Nix, a host of possibilities emerge: - You can copy a store from some other install, and immediately use profiles without having their metadata. - Different Nix tooling and metadata implementations can use the same store -…and so on. Decouplying systems brings exponential possibilities. +… and so on. Decoupling systems brings exponential possibilities. ### Drawbacks There are some small drawbacks: -- we have to assume that a `$cas` collision is impossible in practice -- by removing the derivation name from the store paths, the store becomes more opaque and requires good tooling for manual management -- garbage collection is more complex when the store is shared between hosts +- We have to assume that we can always create a working `$cas` derivation, even if the build stores the build path in an opaque way. Otherwise, their build path would have to be a symlink to the `$cas` entry. +- By removing the derivation name from the store paths, the store becomes more opaque and requires good tooling for manual management. +- Garbage collection is more complex when the store is shared between hosts. +- An attacker can provide malicious `$out` => `$cas` mappings. 
This is already somewhat possible via binary caches. To remedy, trust mappings might be signed by a trusted key. +- We have to assume that a `$cas` collision is impossible in practice. - `$cas` entries without metadata are opaque, and might contain malware or illegal content. If nothing references it, there is no problem with the content. Garbage collection takes care of unused entries. - A hash collision would allow inserting malware into a widely used `$cas`. This is already possible without `$cas`, but trusting the hashes may lead to wider cache use. Remedies include using secure hashes, scanning for malware, using multiple hashes and comparing between binary caches, … -- An attacker can provide malicious `$out` => `$cas` mappings. This is already somewhat possible via binary caches. To remedy, trust mappings might be signed by a trusted key. ### Terminology @@ -84,7 +92,7 @@ Since we're working on the store layer, we have the opportunity to split up the An [informal discussion](https://discourse.nixos.org/t/nix-var-nix-opt-nix-usr-local-nix/7101) concluded that the Store should be located at `/var/lib/nix` for maximum compatibility. -As for the contents of `/nix/var`: +As for the contents of `/nix/var`, all of it can go elsewhere: - `/nix/var/log` should go under central or per-user log. - `/nix/var/nix`: @@ -167,7 +175,7 @@ It would even be possible to use FUSE to automatically download any paths that a ### Store Daemon -Optionally, a daemon can maintain the Store. In this case, it is recommended be the only user with write access. It performs installations and verifications, described below. +Optionally, a daemon can maintain the Store. In this case, it is recommended be the only user with write access. It performs installations, verifications and garbage collection, described below. 
### Preparing @@ -215,7 +223,7 @@ Any user with write access to `/var/lib/nix/.stage` and `/var/lib/nix` can insta Note that to ensure atomicity, `.stage` needs to be on the same filesystem as the Store. -Note that when two writers are trying to install the same path, one of them might get an error, but the end result will be the same. (As long as the `$cas` is self-valid) +Note that when two writers are trying to install the same path, one of them might get an error, but the end result will be the same (as long as the `$cas` is self-valid). So multiple writers can also be on separate hosts, in a trusted setting. ### Verification @@ -232,7 +240,7 @@ Garbage collection needs to identify store paths that are not used by anything o - after waiting long enough to prevent collisions (for example 10 seconds), it reads `.gc/will_gc` and verifies it contains the unique number it wrote - then it clears out `.gc` except for the files `.gc/will_gc` and adds the file `.gc/running_gc` containing the current timestamp - each host's store daemon monitors `.gc/running_gc` at some interval, for example 1 minute -- while this file exists, the daemon must record its required `$cas` entries, by creating a 0-length file, named `.gc/$cas` +- while this file exists, the daemon must record its required `$cas` entries, by creating 0-length files named `.gc/$cas` - entries that are referenced in `nix-dependencies` don't have to be marked - the writer waits long enough for all the hosts to record their GC roots, for example 10 minutes - after the wait period expired, the writer host scans for store paths that are not marked and not part of a marked entry's `nix-dependencies`. 
Each path is atomically moved to `.gc` and deleted @@ -263,7 +271,9 @@ Example: This creates or updates the user-specific profile "vscode", adding Pyth nix-env -p ~/.config/nix/profiles/vscode -i python3 ``` -## Migration +## Administration Tasks + +### Migration There is no real need for migrating stores, since `/nix` and `/var/lib/nix` can coexist and the tooling either uses one or the other. However, it is convenient to migrate built artifacts for implementing this RFC. @@ -276,3 +286,45 @@ To migrate an existing output from `/nix/store/$out-$name` to `/var/lib/nix/$cas - once `$cas` is determined, the patched derivation can be placed in `/var/lib/nix/$cas` and the metadata recorded in the Trust DB This process will fail if the derivation refers to the Store in ways that aren't visible, like different string encoding and calculated paths. + +### Adding a Store Daemon + +To begin managing an existing Store with a Store Daemon, these steps are performed: + +- Change permissions on the Store root so only the daemon has write access. +- Ensure `.prepare`, `.stage` and `.quarantaine` with desired permissions. +- For each Store entry + - Recursively adjust permissions and timestamps + - Verify entry + - If invalid, move to `.quarantaine` and try to download replacement from known caches + +### Removing a Store Daemon + +- Wait for pending installs to complete. +- Stop Store Daemon. +- Change permissions on the Store as desired. + +## Implementation + +- NixPkgs needs to be audited to remove hard-coded `/nix` names, replacing them with the store path variable (TODO look up name). +- The Nix tools and Hydra need to be branched to support the new location and store semantics. The tools either use the old location and semantics, or the new one. 
+- The binary cache server needs to serve `$cas` as compressed files and trust mappings in an incremental way + +## Alternative options + +There are a few choices made in this RFC, here we describe alternatives and why they were not picked. + +### Keep store at `/nix/store` + +Since the names are always different, the `$cas` entries could stay in `/nix/store`. The benefit would be that NixPkgs doesn't have to be audited for hardcoded `/nix` paths. + +However, this keeps the problem of some installations not having permission to create a `/nix` directory, and makes it much harder to share the store between hosts (as long as non-`$cas` entries are present). + +### Always store dependency list in `$cas` entry + +This would require single-file entries to be a directory instead, and then for symmetry and simplicity the directory entries would require the same. +So a path `/var/lib/nix/$cas` becomes for example `/var/lib/nix/$cas/_` (short non-descript name to keep references short), and the dependencies would be at `/var/lib/nix/$cas/deps`. +This allows adding other objective metadata, like late binding information, and other information that might be desired in the future. +Another benefit would be that root entries would be enough to know what paths to keep during garbage collection. + +However, this increases storage somewhat. From b0362a0a435f75de407ece4399b9db0f84477b20 Mon Sep 17 00:00:00 2001 From: Wout Mertens Date: Mon, 9 Nov 2020 00:08:43 +0100 Subject: [PATCH 09/10] 17: $cas.m metadata file --- rfcs/0017-intensional-store.md | 174 +++++++++++++++++++++------------ 1 file changed, 111 insertions(+), 63 deletions(-) diff --git a/rfcs/0017-intensional-store.md b/rfcs/0017-intensional-store.md index 84369b2bb..13ae23443 100644 --- a/rfcs/0017-intensional-store.md +++ b/rfcs/0017-intensional-store.md @@ -25,6 +25,7 @@ This RFC builds on the implementation of RFC 62 to maximize its benefits. 
- Decouple metadata (Trust DB) from the Store, keep per-user - Move the Store to `/var/lib/nix` +- Content checksums include the runtime dependencies - Store can be shared read-write on a network share - `nix-daemon` becomes optional - Store paths optionally provide dependency list in-band @@ -72,13 +73,13 @@ There are some small drawbacks: We assume the following process when wanting to install a given package attribute `$attr`: - Nix evaluates the desired expressions and determines that a certain output hash `$out` is required -- `$out` is looked up in the Trust DB, to possibly yield `$cas`, its Content Addressable Storage hash -- if `$cas` is known: - - if `$cas` is present in the store, `$attr` is already installed; Done. - - if `$cas` is present on a binary cache, it is downloaded to the store; Done. +- `$out` is looked up in the Trust DB, to possibly yield `$cas`, its Content Addressable Storage hash. +- If `$cas` is known: + - If `$cas` is present in the store, `$attr` is already installed; Done. + - If `$cas` is present on a binary cache, it is downloaded to the store; Done. - `$out` is built using the normal mechanisms (see RFC 62 for more details) -- its `$cas` is calculated (see RFC 62 for more details) -- the generated metadata is stored in the Trust DB; Done. +- Its `$cas` is calculated (see Preparing for more details). +- The generated metadata is stored in the Trust DB; Done. ### A note on reproducibility @@ -126,37 +127,59 @@ Note that for a shared store, two systems might see different ownership values; ### Metadata -For a given store path, there is no objective metadata other than the path itself. Required dependencies, name, output hash and so on are trusted data, and as such should be stored in the Trust DB. +For a given `$cas` entry, there is objective and subjective metadata. -It could be said that the required dependencies are somewhat objective, but one build may decide that a certain runtime dependency is necessary while another may not. 
Other than that, it's also inconvenient to have to pass metadata out-of-band. +Objective metadata examples: -However, having the list of dependencies can be useful, and therefore we make it optional. If a `$cas` entry is a directory and contains the file `nix-dependencies` as a direct child, this file can contain `$cas` names (without path), separated by newlines. Whenever processing dependencies, these entries are considered. Out-of-band metadata can note extra dependencies, but can't strike a dependency. +- `$cas` +- output size +- late binding information +- runtime dependencies (special case) -One use case for in-band dependencies is a self-installing application, where all you need is a `$cas` to get the entire application. +Subjective metadata examples: -It is up to the build to decide when to include `nix-dependencies` and if it should include transitive dependencies. +- `$out` +- name, version +- build timestamp and duration +- build-time dependencies +- nixpkgs commitish (not always applicable) +- name of the attribute in nixpkgs +- own configuration commitish +- list of flakes involved -The name `nix-dependencies` is chosen because it's unlikely to clash with package files. Furthermore, it is only valid if it contains nothing but `$cas` names separated by single newlines. +Objective metadata is a function of the entry contents only and can be calculated by anyone. -### Trust DB +Subjective metadata is a function of the build description and execution. It depends on the source, is trusted data, and any desired metadata should be stored in the Trust DB. 
-Here is a selection of metadata generated when building a given output: +#### Runtime dependencies -- `$name` (can include version number and output name) -- inputs / dependencies (including runtime) -- size -- build time and duration -- description (from derivation) +Runtime dependencies (the Store entries that are necessary for correct operation of this entry) are a special case, because you can't always detect those based on the contents of the entry. +However, if we make the dependencies part of the `$cas` calculation, differences in those will result in a different `$cas`, and so for a given `$cas` there can be only one list of correct runtime dependencies. + +Furthermore, in order to know which Store entries belong together, the runtime dependencies are a necessity. Therefore we must provide them as part of the entry. Since we can't change the entries themselves (be they files or directories), we add the metadata as an optional file named `/var/lib/nix/$cas.m`. This file is included in the `$cas` calculation. If the file is missing or altered, the `$cas` won't validate. -For installation and searching, these are relevant for each store path `$cas`: +The metadata file currently only contains the `$cas` hashes of each runtime dependency, sorted alphabetically, and separated by single newlines. If the file does not match this format exactly, it is invalid. + +Other objective metadata that could be useful, such as late binding information, could be placed in this file too, perhaps as a TOML file that begins after the first double newline. This is to be determined. + +### Trust DB -- list of `$out` that are known to result in `$cas` -- `$cas` of all dependencies -- `$name` -- description +The Trust DB contains subjective metadata for `$out` -> `$cas` mappings. 
Each entry contains: + +- `$cas` +- an array of `$out` +- name (including version) - size +- array of dependency `$cas` +- optional: + - description + - build timestamp + - builder + - custom metadata, like one or more git repo commit hashes + +With this information, a user can quickly find `$cas` entries to install that match a name or description. `nix-build` can find a `$cas` by `$out`. -These have to be consulted for any installation request, and therefore they should be easy to retrieve. Build hosts can provide a log of new and changed entries, enabling differential updates. Binary caches could provide a query service (both for "what does `$out` map to and "what is `$cas`"). +For a given `$cas`, there can be many entries, one for each trusted source. This can be handled by having one SQLite DB per source (including localhost), and having an order of precedence. ### Sharing the Nix Store @@ -167,7 +190,7 @@ Since the Nix Store (minus supporting directories) contains only self-validating - confidence around hash collision attacks - confidence around writers corrupting paths without detection -The installation step only involves moving a proposed path from `.prepare` to `.stage`, so no further communcation is necessary with the Store daemon. +The installation step only involves moving a proposed path from `.prepare` to `.stage`, so no further communication is necessary with the Store daemon. For single-user installs, the Store can trivially be maintained by the Nix tools, and converting to multi-user is only a matter of changing the permissions. @@ -182,7 +205,7 @@ Optionally, a daemon can maintain the Store. In this case, it is recommended be `$cas` entries are prepared by building them in a path that has the same total length as its final `$cas` entry. This means they can be built practically anywhere, in `/tmp`, in a home directory, in the `.prepare` directory in the Store, etc. It is recommended to make a part of the path a unique-per-build string. 
To make sure only the build's own references need rewriting, it is recommended to build using only `$cas` entries as dependencies, instead of relying on rewriting paths. -After the build, its `$cas` is calculated and any occurences of the build path are replaced with `/var/lib/nix/$cas`. +After the build, the runtime dependencies file is generated in-memory (if there are dependencies), the `$cas` is calculated including the runtime dependencies buffer, any occurrences of the build path are replaced with `/var/lib/nix/$cas`, and the `$cas.m` file is written (if there are dependencies). If there undetected build path references, they might cause the finished entry to work incorrectly, and they will cause `$cas` to differ on every build of `$out`. This must be handled on a case-by-case basis. Perhaps we'll need pluggable hash rewriters. @@ -190,27 +213,51 @@ The build can happen by a sandboxing build daemon like `nix-build`, but that is After preparing, the metadata for the build is added to the user's Trust DB. +### A note on `$cas` calculation + +The content hash has to be built up recursively, as described elsewhere. There are some caveats: + +- When the build path `$build/$out` is encountered, it should be replaced with `/var/lib/nix/$placeholder`. `$placeholder` is a fixed string used in all `$cas` calculations. +- When `$out` is encountered but not `$build/$out`, it should be replaced with `$placeholder`. +- When `$build` is encountered but not `$out`, it is a possible problem and should be noted. +- The paths of files and directories should be included relative to `$build` (so `$build/$out` is `/$out`), but not in the file contents or symlink targets. +- If there is metadata, the file should be included in the calculation as if it was named `_meta.m`, stored next to `$out`. + +Changing the entry to its `$cas` name means: + +- Rename `$out` to `$cas`. +- Replace the string `$build/$out` with `/var/lib/nix/$cas`. +- Replace the string `$out` with `$cas`. 
+- If there is metadata, write the contents to `$cas.m`. + +When validating: + +- The paths of files and directories should be included relative to `/var/lib/nix`. +- If `$cas.m` exists, it should be included in the calculation as if it was named `_meta.m`. +- When the string `$cas` is encountered, it should be replaced with `$placeholder`. + ### Installation We use rename semantics to provide atomic installations. Prepared `$cas` entries are moved to their final location with a `rename` call, which is atomic but requires the path to be on the same filesystem. -Atomicity is important to ensure that `$cas` entries are always valid. If they are copied instead, they don't self-validate during the copy. +Atomicity is important to ensure that `$cas` entries are always valid. If they are copied instead, they don't self-validate for the duration of the copy. #### with Store Daemon Any user with write access to `/var/lib/nix/.prepare` and `/var/lib/nix/.stage` can ask for entries to be installed. To do so: -1. They prepare entries to be stored in `/var/lib/nix/.prepare`, renaming each entry to its `$cas`. -1. They atomically move prepared paths to `/var/lib/nix/.stage`, in reverse dependency order, meaning dependencies of an entry are moved first. +1. They prepare entries to be stored in `/var/lib/nix/.prepare`, renaming each entry to its `$cas` and `$cas.m`. +1. They atomically move prepared paths to `/var/lib/nix/.stage`, in reverse dependency order, meaning dependencies of an entry are moved first. First the `$cas.m` file is moved and then the `$cas` entry. When the Store daemon discovers a new `$cas` entry under `.stage`: 1. If the Store already contains this `$cas` entry, it removes this new one, perhaps first verifying the Store copy. -1. It recursively changes ownership of the path to itself and timestamps to 0, making sure that write permission is removed for everybody, and read permission is added for anybody. +1. 
It recursively changes ownership of `$cas` and `$cas.m` to itself and timestamps to 0, making sure that write permission is removed for everybody, and read permission is added for anybody. If it has no permissions to do this, it instead copies the path into `/var/lib/nix/.daemon`, and another process will need to keep `.stage` clean. -1. The daemon verifies the hash. If the hash doesn't match, it removes the path. -1. If the path is a directory and there is a `nix-dependencies` file as a direct child, it checks that all dependencies are already present in the Store. If not, the path is held for a while and deleted if the dependencies don't appear in time (configurable). -1. It atomically moves the path into `/var/lib/nix`. +1. The daemon verifies the `$cas`. If it doesn't match, it removes `$cas` and `$cas.m`. Note that a missing or altered `$cas.m` file won't pass validation. +1. If `$cas.m` exists, it checks that all dependencies are already present in the Store. If not, the path is held for a while and deleted if the dependencies don't appear in time (configurable). +1. It atomically moves `$cas.m` into `/var/lib/nix`. +1. It atomically moves `$cas` into `/var/lib/nix`. Note that to ensure atomicity, `.prepare` and `.stage` need to be on the same filesystem, and either `.stage` or `.daemon` need so be on the same filesystem as the Store. @@ -218,16 +265,16 @@ Note that to ensure atomicity, `.prepare` and `.stage` need to be on the same fi Any user with write access to `/var/lib/nix/.stage` and `/var/lib/nix` can install entries. To do so: -1. They prepare entries to be stored in `/var/lib/nix/.stage`, renaming each entry to its `$cas`. -1. They atomically move prepared entries to `/var/lib/nix`, in reverse dependency order, meaning dependencies of an entry are moved first. +1. They prepare entries to be stored in `/var/lib/nix/.stage`, renaming each entry to its `$cas` and `$cas.m`. +1. 
They atomically move prepared entries to `/var/lib/nix`, in reverse dependency order, meaning dependencies of an entry are moved first, and `$cas.m` is moved before `$cas` Note that to ensure atomicity, `.stage` needs to be on the same filesystem as the Store. -Note that when two writers are trying to install the same path, one of them might get an error, but the end result will be the same (as long as the `$cas` is self-valid). So multiple writers can also be on separate hosts, in a trusted setting. +Note that when two writers are trying to install the same `$cas` or `$cas.m`, one of them might get an error, but the end result will be the same (as long as the `$cas` is self-valid). So multiple writers can also be on separate hosts, in a trusted setting. ### Verification -A path in the Store is verified by calculating its `$cas`. If the `$cas` doesn't match, the path is moved to `/var/lib/nix/.quarantaine`, where a sysadmin has to investigate. +A path in the Store is verified by calculating its `$cas` (see above). If the `$cas` doesn't match, the path is moved to `/var/lib/nix/.quarantaine`, where a sysadmin has to investigate. Any process with write access to `/var/lib/nix` and `/var/lib/nix/.quarantaine` can do this, for example the Store daemon. @@ -235,16 +282,18 @@ Any process with write access to `/var/lib/nix` and `/var/lib/nix/.quarantaine` Garbage collection needs to identify store paths that are not used by anything on any of the systems sharing the same store. Here we propose a simple mechanism for coordination, but any mechanism is acceptable. 
-- a host with store write access decides to run garbage collection -- it checks that `.gc/running_gc` does not exist or contains a very old timestamp, and writes a unique number to `.gc/will_gc` -- after waiting long enough to prevent collisions (for example 10 seconds), it reads `.gc/will_gc` and verifies it contains the unique number it wrote -- then it clears out `.gc` except for the files `.gc/will_gc` and adds the file `.gc/running_gc` containing the current timestamp -- each host's store daemon monitors `.gc/running_gc` at some interval, for example 1 minute -- while this file exists, the daemon must record its required `$cas` entries, by creating 0-length files named `.gc/$cas` - - entries that are referenced in `nix-dependencies` don't have to be marked -- the writer waits long enough for all the hosts to record their GC roots, for example 10 minutes -- after the wait period expired, the writer host scans for store paths that are not marked and not part of a marked entry's `nix-dependencies`. Each path is atomically moved to `.gc` and deleted -- finally, the writer host empties the `.gc` directory, leaving the `running_gc` file for last +- A host with store write access decides to run garbage collection. +- It checks that `.gc/running_gc` does not exist or contains a very old timestamp, and writes a unique number to `.gc/will_gc`. +- After waiting long enough to prevent collisions (for example 10 seconds), it reads `.gc/will_gc` and verifies it contains the unique number it wrote. +- It clears out `.gc/` except for the file `.gc/will_gc` and adds the file `.gc/running_gc` containing the current timestamp. +- While it waits for other hosts, it checks the Store for `$cas.m` files that don't have a matching `$cas`. +- Each host's store daemon monitors `.gc/running_gc` at some interval, for example 1 minute. +- While this file exists, the daemon must record its root `$cas` entries, by creating 0-length files named `.gc/$cas`. 
+- The writer waits long enough for all the hosts to record their GC roots, for example 10 minutes.
+- It verifies that `.gc/will_gc` still contains its unique number.
+- After the wait period has expired, the writer host scans for store paths that are not part of its own or other hosts' GC roots or their dependencies. Each such `$cas` is atomically moved to `.gc` and deleted; its `$cas.m` is also deleted.
+- The `$cas.m` files that still don't have their matching `$cas` are removed. Note that during installation, `$cas.m` appears shortly before its `$cas`, because entries are fully prepared before they are moved.
+- Finally, the writer host empties the `.gc` directory, leaving the `running_gc` file for last.
 
 For a single-user installation or a non-shared Nix store, none of this is necessary, and the GC process remains unchanged, except for the new locations to search for GC roots. See next section.
 
@@ -258,13 +307,15 @@ Since a profile points to an immutable `$cas` path, it is the same across system
 
 However, a profile link itself is trusted information, and should be shared between users and systems only when they trust each other.
 
-Known paths that can contain profiles:
+Known paths for profiles:
 
 - `$NIX_CONF_DIR/profiles`: per-user profiles and auto roots
-- `/var/lib/nix-profiles`: NixOS system and shared profiles, and auto roots, maintained by the `root` user
+- `/var/lib/nix-profiles`: NixOS system and shared profiles, maintained by the `root` user
 
 Other paths can of course be used, but the GC won't know about them, so a link should be maintained in one of the directories above.
 
+To enable garbage collection, these directories also contain "auto-roots", generated from builds.
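
The sweep step of the garbage collection protocol described earlier can be sketched as follows. This is a minimal illustration in Python, not part of the RFC. It assumes that the `$cas.m` metadata has already been parsed into a mapping from each `$cas` to its direct dependencies, and that the roots are the names of the 0-length marker files under `.gc/`. The function name `unreachable_entries` and the in-memory layout are hypothetical.

```python
from typing import Dict, Iterable, Set


def unreachable_entries(
    store: Iterable[str],
    deps: Dict[str, Set[str]],
    roots: Iterable[str],
) -> Set[str]:
    """Return the store entries that no recorded GC root depends on.

    `deps` stands in for parsed `$cas.m` metadata (an assumption of this
    sketch): it maps a `$cas` to the set of `$cas` entries it references.
    """
    live: Set[str] = set()
    stack = list(roots)
    while stack:
        cas = stack.pop()
        if cas in live:
            continue  # already visited via another root
        live.add(cas)
        # Entries without metadata have no dependencies.
        stack.extend(deps.get(cas, ()))
    return set(store) - live


# Toy store: "aaa" is a root and depends on "bbb"; "ccc" and "ddd" are unused.
store = {"aaa", "bbb", "ccc", "ddd"}
deps = {"aaa": {"bbb"}, "ccc": {"ddd"}}
roots = {"aaa"}

garbage = unreachable_entries(store, deps, roots)
print(sorted(garbage))
```

Only the entries returned here would be candidates for the atomic move into `.gc`. Hosts only need to mark their roots because the dependencies are followed through the metadata.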
+
 Example: This creates or updates the user-specific profile "vscode", adding Python 3:
 
 ```sh
@@ -280,9 +331,10 @@ There is no real need for migrating stores, since `/nix` and `/var/lib/nix` can
 To migrate an existing output from `/nix/store/$out-$name` to `/var/lib/nix/$cas`, the following approach will work most of the time:
 
 - migrate all its dependencies using the below steps
-- for all files of `$out-$name`, replace all strings of the form `/nix/store/$out-$name` with `/var/lib/nix/$filler/$cas`. `$cas` has the same length as `$out` and the minimum length of `$name` is 1, so there is always room for the full `$cas`. The `$filler` is a string with length `l = length($name) - 1` of the form `./././/`, that is, repeat `./` `floor(l/2)` times and append `/` if `l` is odd.
+- generate the metadata file if there are dependencies
+- calculate the `$cas` as described above, but instead of handling `$build/$out`, replace all strings of the form `/nix/store/$out-$name` with `/var/lib/nix/$filler/$cas`. `$cas` has the same length as `$out` and the minimum length of `$name` is 1, so there is always room for the full `$cas`. The `$filler` is a string with length `l = length($name) - 1` of the form `./././/`, that is, repeat `./` `floor(l/2)` times and append `/` if `l` is odd.
 - do the same with symlinks, but consider relative paths as well
-- self-references must be updated after `$cas` is known, their contents must be skipped while calculating `$cas`, as described elsewhere
+- self-references must be updated after `$cas` is known, as described above
 - once `$cas` is determined, the patched derivation can be placed in `/var/lib/nix/$cas` and the metadata recorded in the Trust DB
 
 This process will fail if the derivation refers to the Store in ways that aren't visible, like different string encoding and calculated paths.
@@ -304,11 +356,16 @@ To begin managing an existing Store with a Store Daemon, these steps are perform
 - Stop Store Daemon.
 - Change permissions on the Store as desired.
 
+### Repairing an entry
+
+If a `$cas` entry does not validate, it can be re-downloaded from any cache that has it. No trust is needed for this, since the downloaded entry can be verified against its `$cas`. If no cache has it, it cannot be repaired.
+
 ## Implementation
 
 - NixPkgs needs to be audited to remove hard-coded `/nix` names, replacing it with the store path variable (TODO look up name).
 - The Nix tools and Hydra need to be branched to support the new location and store semantics. The tools either use the old location and semantics, or the new one.
-- The binary cache server needs to serve `$cas` as compressed files and trust mappings in an incremental way
+- The binary cache server needs to serve `$cas` entries as compressed files that include the entry and the `$cas.m` file.
+- The trust mapping server needs to serve trust mappings incrementally, for example as a JSON array of the mappings added or changed since a given timestamp.
 
 ## Alternative options
 
@@ -316,15 +373,6 @@ There are a few choices made in this RFC, here we describe alternatives and why
 
 ### Keep store at `/nix/store`
 
-Since the names are always different, the `$cas` entries could stay in `/nix/store`. The benefit would be that NixPkgs doesn't have to be audited for hardcoded `/nix` paths.
+Since the names are always different, the `$cas` entries and `$cas.m` files could stay in `/nix/store`. The benefit would be that NixPkgs doesn't have to be audited for hardcoded `/nix` paths.
 
 However, this keeps the problem of some installations not having permission to create a `/nix` directory, and makes it much harder to share the store between hosts (as long as non-`$cas` entries are present).
-
-### Always store dependency list in `$cas` entry
-
-This would require single-file entries to be a directory instead, and then for symmetry and simplicity the directory entries would require the same.
-So a path `/var/lib/nix/$cas` becomes for example `/var/lib/nix/$cas/_` (short non-descript name to keep references short), and the dependencies would be at `/var/lib/nix/$cas/deps`.
-This allows adding other objective metadata, like late binding information, and other information that might be desired in the future.
-Another benefit would be that root entries would be enough to know what paths to keep during garbage collection.
-
-However, this increases storage somewhat.
From b0b655ed1e659f3782c7994588b173f98a0317cb Mon Sep 17 00:00:00 2001
From: Wout Mertens
Date: Mon, 9 Nov 2020 07:42:59 +0100
Subject: [PATCH 10/10] 17: explain metadata alternatives

---
 rfcs/0017-intensional-store.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/rfcs/0017-intensional-store.md b/rfcs/0017-intensional-store.md
index 13ae23443..00c3851ab 100644
--- a/rfcs/0017-intensional-store.md
+++ b/rfcs/0017-intensional-store.md
@@ -376,3 +376,16 @@ There are a few choices made in this RFC, here we describe alternatives and why
 Since the names are always different, the `$cas` entries and `$cas.m` files could stay in `/nix/store`. The benefit would be that NixPkgs doesn't have to be audited for hardcoded `/nix` paths.
 
 However, this keeps the problem of some installations not having permission to create a `/nix` directory, and makes it much harder to share the store between hosts (as long as non-`$cas` entries are present).
+
+### No metadata
+
+Without metadata in the Store, the Store by itself lacks the information needed to do garbage collection or to let a system boot from a given `$cas`.
+
+### Keep metadata in directory with Store entry
+
+Since Store entries can be files or directories, single-file entries would have to be put in a directory; for example, `$cas` becomes `$cas/_`.
+Then directory entries would have to do the same for symmetry.
+This requires many code changes and extra storage, even for entries without any runtime dependencies.
+
+### Add name to metadata
+
+The name and version are subjective data. Adding them to the metadata would cause `$cas` to change and would increase storage. The only consumer of this data is the end user, who should also have it available in their Trust DB.
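
The `$filler` padding rule from the migration steps can be checked with a short sketch. This is an illustration in Python, not part of the RFC; `filler` and `migrated_path` are hypothetical helper names. One assumption is made explicitly: for the old and new paths to have the same length, the filler (which ends in `/` whenever it is non-empty) is concatenated directly in front of `$cas`, so the `/` shown between `$filler` and `$cas` in the template is treated as the filler's own trailing slash. The example hashes are the ones from the terms list of this RFC.

```python
def filler(name: str) -> str:
    """Build the padding of length len(name) - 1: './' repeated, plus '/' if odd."""
    if not name:
        raise ValueError("$name must have at least one character")
    length = len(name) - 1
    return "./" * (length // 2) + ("/" if length % 2 else "")


def migrated_path(out_hash: str, name: str, cas: str) -> str:
    """Return the replacement for /nix/store/$out-$name with identical length."""
    if len(out_hash) != len(cas):
        raise ValueError("$out and $cas must have the same length")
    old = f"/nix/store/{out_hash}-{name}"
    # Assumption of this sketch: filler is placed directly before $cas.
    new = f"/var/lib/nix/{filler(name)}{cas}"
    # The whole point of the filler: in-place patching needs equal lengths.
    assert len(old) == len(new), (old, new)
    return new


out_hash = "zyb6qaasr5yhh2r4484x01cy87xzddn7"
cas = "qqzyb6bsr5yhh2r5624x01cy87xzn7aa"
print(migrated_path(out_hash, "unit-script-1.12", cas))
```

Because `/nix/store/` plus the `-` separator is one character shorter than `/var/lib/nix/`, a one-character `$name` leaves exactly enough room for `$cas` with an empty filler, and every extra character of `$name` is absorbed by `./` or `/`, which do not change where the path resolves.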