Tracking issue for RFC 92: Dynamic derivations #6316

Open
4 tasks
Ericson2314 opened this issue Mar 25, 2022 · 17 comments
Assignees
Labels
RFC Related to an accepted RFC significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc.

Comments

@Ericson2314
Member

Ericson2314 commented Mar 25, 2022

Info

Steps

Here are the PRs to review:

Preparatory work

  1. Low level <drvPath>^<outputName> installable syntax to match existing <highLevelInstallable>^<outputNames> syntax  #4543
  2. Parse string context elements properly #7543
  3. Get rid of .drv special-casing for store path installable #7600
  4. Introduce StoreReferences and ContentAddressWithReferences #3746
  5. Derivations can output "text-hashed" data #3959
  6. Make more string values work as installables #7601
  7. Give queryPartialDerivationOutputMap an evalStore parameter #8724
  8. Test and begin documentation of the ATerm format for derivations #8927
  9. Improve derivation parsing #8938

Actual implementation

  1. Upgrade downstreamPlaceholder to a type with methods #8353
  2. Make the Derived Path family of types inductive for dynamic derivations #8369
  3. Create (experimental) outputOf primop. #8813
  4. Dynamic derivations RFC 92 #4628
  5. Revert "Revert "Adapt scheduler to work with dynamic derivations #9415

Quality of life / Nice to have

CC @tomberek

@Ericson2314
Member Author

FYI, the middle 2 PRs might be better to review together, as the second one revises the StorePathDescriptor the first one creates.

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfc-92-status-update/27441/1

@roberth roberth added the significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc. label Jun 2, 2023
@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/zurich-23-05-zhf-hackathon-and-workshop-report/29093/1

Ericson2314 added a commit to obsidiansystems/nix that referenced this issue Jul 20, 2023
This makes it more useful. In general, the derivation will be in one
store, and the realisation info is in another.

This also helps us avoid duplication. See how `resolveDerivedPath` is
now simpler because it uses `queryPartialDerivationOutputMap`. In NixOS#8369
we get more flavors of derived path, and need more code to resolve them
all, and this problem only gets worse.

The fact that we need a new method to deal with the multiple dispatch is
unfortunate, but this generally relates to the fact that `Store` is a
sub-par interface, too bulky/unwieldy and conflating separate concerns.
Solving that is out of scope of this PR.

This is part of the RFC 92 work. See tracking issue NixOS#6316
Ericson2314 added a commit to obsidiansystems/nix that referenced this issue Aug 11, 2023
In the Nix language, given a drv path, we should be able to construct
another string referring to one of its outputs. We can do this today
with `(import drvPath).output`, but this only works for derivations we
already have.

With dynamic derivations, however, that doesn't work well because the
`drvPath` isn't yet built: importing it like that would need to trigger IFD,
when the whole point of this feature is to do "dynamic build graph"
without IFD!

Instead, what we want to do is create a placeholder value with the right
string context to refer to the output of the as-yet unbuilt derivation.
A new primop in the language, analogous to `builtins.placeholder`, can be
used to create one. This will achieve all the right properties. The
placeholder machinery will also match how the `outPath` attribute for CA
derivations works.

In 60b7121 we added that type of placeholder,
and the derived path and string holder changes necessary to support it.
Now, we can wire up the primop.

Part of RFC 92: dynamic derivations (tracking issue NixOS#6316)
@tomberek
Contributor

With #8369 the largest pieces of internal re-configuring are done. This means the internal representations are ready to accommodate the user-facing changes of a new primop: #8813

Stay tuned for more exciting changes!

Ericson2314 added a commit to obsidiansystems/nix that referenced this issue Aug 11, 2023
In the Nix language, given a drv path, we should be able to construct
another string referring to one of its outputs. We can do this today
with `(import drvPath).output`, but this only works for derivations we
already have.

With dynamic derivations, however, that doesn't work well because the
`drvPath` isn't yet built: importing it like that would need to trigger IFD,
when the whole point of this feature is to do "dynamic build graph"
without IFD!

Instead, what we want to do is create a placeholder value with the right
string context to refer to the output of the as-yet unbuilt derivation.
A new primop in the language, analogous to `builtins.placeholder`, can be
used to create one. This will achieve all the right properties. The
placeholder machinery will also match how the `outPath` attribute for CA
derivations works.

In 60b7121 we added that type of
placeholder, and the derived path and string holder changes necessary to
support it. Then in the previous commit we cleaned up the code
(inspiration finally hit me!) to deduplicate the code and expose exactly
what we need. Now, we can wire up the primop trivially!

Part of RFC 92: dynamic derivations (tracking issue NixOS#6316)
Ericson2314 added a commit to obsidiansystems/nix that referenced this issue Aug 11, 2023
In the Nix language, given a drv path, we should be able to construct
another string referring to one of its outputs. We can do this today
with `(import drvPath).output`, but this only works for derivations we
already have.

With dynamic derivations, however, that doesn't work well because the
`drvPath` isn't yet built: importing it like that would need to trigger IFD,
when the whole point of this feature is to do "dynamic build graph"
without IFD!

Instead, what we want to do is create a placeholder value with the right
string context to refer to the output of the as-yet unbuilt derivation.
A new primop in the language, analogous to `builtins.placeholder`, can be
used to create one. This will achieve all the right properties. The
placeholder machinery will also match how the `outPath` attribute for CA
derivations works.

In 60b7121 we added that type of
placeholder, and the derived path and string holder changes necessary to
support it. Then in the previous commit we cleaned up the code
(inspiration finally hit me!) to deduplicate the code and expose exactly
what we need. Now, we can wire up the primop trivially!

Part of RFC 92: dynamic derivations (tracking issue NixOS#6316)

Co-authored-by: Robert Hensing <roberth@users.noreply.github.com>
@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-08-25-nix-team-meeting-minutes-82/32283/1

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixcon-governance-workshop/32705/9

@roberth
Member

roberth commented Jan 17, 2024

If a dynamic derivation A performs nix derivation add B, does that create a "deriver" relation between two .drv store paths?
Technically it could, but I don't think it should, because

  • The deriver relation is non-local, which makes any use of it questionable.
  • This will create new deriver rows whenever a small build is reused; a scaling hazard. 🐉
  • When the A builder injects a derivation from its inputDrvs into B, this derivation is not derived by A, so these deriver entries do not follow the closure property. That may be ok, but does not help with storing the deriver relation more efficiently, and scores no extra points.

So while we could consider such deriver rows to be valid, adding them consistently is probably more trouble than it's worth.

We should still have a deriver row for A and B, and I believe that's enough for querying any relevant paths through the combined deriver and reference graphs.

TODO

  • document that nix derivation add from within a build does not create a deriver row.
  • document design decision

@physics-enthusiast

Does closureInfo still get every single derivation that is involved in a build with this?

@roberth
Copy link
Member

roberth commented Mar 21, 2024

@physics-enthusiast

Consider the following derivations:

  • A
  • A.out - a dynamic derivation
  • B, which has A.out as an input, and uses exportReferencesGraph, like closureInfo does

I wouldn't expect B to list A, because A.out need not have a reference to A.
Note also that exportReferencesGraph works on the references relation, which for "normal" inputs only contains outputs, maybe some constant paths (such as "${./script.sh}"), but rarely actual .drv files.
The deriver relation is impure and is not included by exportReferencesGraph.

So, probably not, but this should be checked and documented.
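The distinction between the two relations can be illustrated with a toy model. This is only a sketch: the path names, the `references` and `derivers` tables, and the `closure` helper are all hypothetical and do not reflect how Nix stores or queries this data.

```python
def closure(roots, references):
    """Return every path reachable from `roots` by following references only."""
    seen = set()
    stack = list(roots)
    while stack:
        path = stack.pop()
        if path in seen:
            continue
        seen.add(path)
        stack.extend(references.get(path, ()))
    return seen


# Hypothetical store contents: A.out (the generated .drv) refers to a
# constant path, but nothing in it points back at A.drv.
references = {
    "A.out": {"script.sh"},
    "script.sh": set(),
}

# The deriver relation is recorded separately and is never followed above.
derivers = {"A.out": "A.drv"}

graph_for_B = closure(["A.out"], references)
```

In this model `graph_for_B` contains `A.out` and `script.sh` but not `A.drv`, because the walk never consults `derivers`.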

@physics-enthusiast

physics-enthusiast commented Mar 22, 2024

I wouldn't expect B to list A, because A.out need not have a reference to A.

Could a case not be made for including a reference to A.drv in A.out? Any derivation referencing A.out must first have a reference to A, and from a dependency tracking perspective A must build for A.out to exist (and hence build), so I think arguably A is still a build-time dependency. This is especially important if you are trying to do something like backing up all FODs needed to build a particular package, since the content of A.out might depend on the result of some FOD. The exportReferencesGraph resolver could stop at the dynamic derivations if the original query was not to a derivation (so that someone trying to get run-time dependencies doesn't accidentally pull the whole build-time closure), since drvs have no run-time dependencies.

@physics-enthusiast

physics-enthusiast commented Mar 22, 2024

Actually, having thought about it further, maybe dynamic derivations should not be allowed to be fixed-output at all. Even with CA derivations, a FOD dynamically generated by a nondeterministic derivation is not guaranteed to be reproducible from source. You could for example code in several possible URLs and their respective hashes, and then have A pick one at random to generate the A.out FOD, which could in turn produce completely different outputs. Content addressing preserves reproducibility within a single Trust DB (so between rebuilds within a system, and between systems sharing a binary cache), but someone independently building the exact same expression could in theory end up with an entirely different realisation.
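A toy model of that scenario, as a sketch only: `SOURCES`, `CANDIDATES`, `generate_fod` and `fetch_fixed_output` are hypothetical names, and the `choice` argument stands in for whatever nondeterminism the generating derivation A might have.

```python
import hashlib
from typing import Tuple

# Hypothetical stand-in for the network: URL -> bytes served there.
SOURCES = {
    "https://example.org/v1.tar": b"release one",
    "https://example.org/v2.tar": b"release two",
}

# Candidates hard-coded into A, each paired with its correct hash.
CANDIDATES = [
    (url, hashlib.sha256(data).hexdigest()) for url, data in SOURCES.items()
]


def generate_fod(choice: int) -> Tuple[str, str]:
    """Model of A: emit the (url, hash) pair describing the A.out FOD."""
    return CANDIDATES[choice]


def fetch_fixed_output(url: str, expected_sha256: str) -> bytes:
    """Model of building a FOD: fetch, then verify against the declared hash."""
    data = SOURCES[url]
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"hash mismatch for {url}")
    return data


# Two independent builders whose copy of A happens to pick differently.
first = fetch_fixed_output(*generate_fod(0))
second = fetch_fixed_output(*generate_fod(1))
```

Both fetches pass their own hash check, yet `first` and `second` differ: each FOD is verified, but which FOD gets built was decided at build time.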

@Dessix

Dessix commented Mar 22, 2024

@physics-enthusiast Isn't that why FODs require a hash as one of their parameters, to verify that they were reproduced?

@physics-enthusiast

physics-enthusiast commented Mar 22, 2024

@Dessix the issue is that if the FOD is built dynamically (by another derivation), that hash can also be altered dynamically (i.e. during build-time rather than eval-time), breaking reproducibility. Come to think of it, setting hashes via IFD seems to have this problem too.
Edit: wait, is that what all those "*2nix" libraries using IFDs are doing?
Edit 2: turns out flakes block IFD, so at least in pure eval IFD->FOD shouldn't be a problem

@roberth
Member

roberth commented Mar 22, 2024

Could a case not be made for including a reference to A.drv in A.out?

A could choose to add a true reference, but that would cause unnecessary rebuilds of A.out.

I suppose something could be done for exportReferencesGraph by basing the result on the DerivingPath rather than the realised, opaque StorePath, ie the whole /nix/store/foo.drv^bar^baz, and not just /nix/store/qux-baz.
That way it can return info about foo.drv^bar.
This should probably be the behavior of a new attribute though, because existing code based on exportReferencesGraph will expect to only see the references of the final realised path.
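To make the `foo.drv^bar^baz` shape concrete, here is a rough Python sketch of an inductive deriving-path type. The class names `Opaque` and `Built` and the `render`/`parse` helpers are illustrative assumptions and are not the Nix C++ API; the parser also assumes store paths never contain `^`.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Opaque:
    """A plain store path, e.g. a .drv file that already exists."""
    path: str


@dataclass(frozen=True)
class Built:
    """An output of a derivation that is itself named by a deriving path."""
    drv: DerivingPath
    output: str


DerivingPath = Union[Opaque, Built]


def render(p: DerivingPath) -> str:
    """Render in the `<drv>^<output>` syntax, nesting left to right."""
    if isinstance(p, Opaque):
        return p.path
    return f"{render(p.drv)}^{p.output}"


def parse(s: str) -> DerivingPath:
    """Parse `base^out1^out2...` into nested Built nodes around an Opaque base."""
    base, *outputs = s.split("^")
    result: DerivingPath = Opaque(base)
    for name in outputs:
        result = Built(result, name)
    return result


full = parse("/nix/store/foo.drv^bar^baz")
```

Here `full.drv` renders as `/nix/store/foo.drv^bar`, which is the intermediate step that is lost once only the realised path `/nix/store/qux-baz` is kept.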

FOD

As mentioned by @Dessix, FODs don't introduce any new build impurities.

Dynamic derivations do turn actual build impurities into "instantiation impurities", but this is a necessary evil if we want to have this kind of dynamism. Anything that writes derivations should be written with care, whether that's our own evaluator or a dynamic derivation deriver builder. (Do we have a term for the derivation that produces another derivation? Dynamic derivation seems to refer to the output, not the derivation writer if that's what we want to call it.)

Come to think of it, setting hashes via IFD seems to have this problem too.

IFD is a somewhat more benign source of instantiation impurities, because at least you get the opportunity to do some of the processing in the Nix language, which has reproducibility as a design goal, especially with pure mode.

Edit 2: turns out flakes block IFD

Generally IFD is allowed, but some of the metadata commands forbid it, under the assumption that they'd be evaluated and indexed centrally, and those commands should obey the same restrictions. However, that's not what flakestry.dev or flakehub do, as they just accept the readily evaluated JSON from CI. I know of at least one flake that disables the IFD restriction before uploading to a registry.

@Ericson2314
Member Author

@roberth Is this different from the question of whether CA derivations (which includes derivation-producing derivations) ought to use the deriver field for their outputs?

@roberth
Member

roberth commented Mar 22, 2024

@Ericson2314 a lot has been discussed, so I can't tell what this refers to. Also not sure what is the exact use of the deriver field in that context.

I can see a vague parallel with the tracking of build input hashes (transitively) for any build, and CA in particular.

Also relevant is the idea of those "large step realisation" objects #8947, which would become somewhat useless in cases where the discussed exportReferencesGraph extension is used.

Come to think of it, exportReferencesGraph could use a fresh design that takes both RFC 92 derivers, as well as derivation closures into account.
IIRC diffing closures to get an accurate set of build dependencies (to allow offline builds from one closure to the next) is also still an unsolved problem, and its potential solution is probably also impacted by RFC 92.

@physics-enthusiast

@roberth The possibility of offline rebuilds without having to copy over the entire store is actually a large part of the reason I raised these queries in the first place. At least in theory it should be possible to gather a binary cache of only the FODs in the build-time closure of a package and rebuild that package with it. My concern was that one of the things instantiation-time impurities can do that build-time impurities cannot is introduce additional external dependencies. This means that even if you could get every FOD that was used in the build (which RFC 92 would also prevent us from doing without the exportReferencesGraph extension you mentioned), it would still be possible for an offline rebuild to fail due to one of the FOD hashes being changed (or an entirely new FOD generated) by said impurity. To be fair, right now the only way I can think of for this to happen is if a dependency was intentionally misbehaving (since all of the possible hashes would still need to be specified manually), but the fact that it's even possible is still at least somewhat concerning for me. Maybe the ability to generate an "instantiation-time closure" of sorts could be one of the solutions to this?
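As a rough illustration of the gap, consider a toy derivation graph in Python. Everything here is hypothetical: the `Drv` record, the `static_fods` walk, and the derivation names are made up for the sketch and do not correspond to Nix data structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Drv:
    """Toy derivation: its input derivations and, for FODs, the declared hash."""
    input_drvs: List[str] = field(default_factory=list)
    fixed_output_hash: Optional[str] = None


def static_fods(root, graph):
    """Collect FODs reachable through inputDrvs known before any build runs."""
    seen, fods, stack = set(), set(), [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        drv = graph[name]
        if drv.fixed_output_hash is not None:
            fods.add(name)
        stack.extend(drv.input_drvs)
    return fods


# Graph as known at instantiation time. A.drv is the derivation that writes
# a new derivation at build time; whatever FOD that new derivation names
# (say "late-src.drv") does not exist in this graph yet.
graph = {
    "pkg.drv": Drv(input_drvs=["A.drv", "src.drv"]),
    "src.drv": Drv(fixed_output_hash="sha256-aaaa"),
    "A.drv": Drv(input_drvs=["gen-tool-src.drv"]),
    "gen-tool-src.drv": Drv(fixed_output_hash="sha256-bbbb"),
}

prefetch = static_fods("pkg.drv", graph)
```

The walk finds `src.drv` and `gen-tool-src.drv`, but it cannot see any FOD that only comes into existence once A has been built, which is the case an "instantiation-time closure" would need to cover.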

6 participants