# [RFC 0109] Nixpkgs Generated Code Policy #109

`rfcs/0109-nixpkgs-generated-code-policy.md` (210 additions):
---
feature: nixpkgs-generated-code-policy
start-date: 2021-10-12
author: John Ericson (@Ericson2314)
co-authors: (find a buddy later to help out with the RFC)
shepherd-team: @L-as @sternenseemann @tomberek @DavHau
shepherd-leader: @sternenseemann

**Review comment on lines +4 to +7 — @lheckemann (Member), Aug 9, 2023:**

Suggested change:

```diff
-author: John Ericson (@Ericson2314)
-co-authors: (find a buddy later to help out with the RFC)
-shepherd-team: @L-as @sternenseemann @tomberek @DavHau
-shepherd-leader: @sternenseemann
+author: @sternenseemann
+co-authors: @Ericson2314
+shepherd-team: @L-as @kfearsoff @tomberek @DavHau
+shepherd-leader: @tomberek
```

Apologies for the previous suggestion, where I mistakenly missed out making @sternenseemann the author.

**Reply:**

Wait, I thought @Ericson2314 wrote this?

related-issues: (will contain links to implementation PRs)
---

# Summary
[summary]: #summary

Nixpkgs contains non-trivial amounts of generated, rather than hand-written, code.
We want to start systematizing that code to make it easier to maintain.
There is plenty of future work we could do building upon this, but we stop here for now to avoid needing to change any tools (Nix, Hydra, etc.).

# Motivation
[motivation]: #motivation

Nixpkgs, along with every other distro, faces a looming crisis: new open source software is increasingly not intended to be packaged by distros at all.
Many languages now support very large library ecosystems, with dependencies expressed in a language-specific package manager.

Right now, to deal with these packages, we either convert them by hand or commit lots of generated code into Nixpkgs.
But I don't think either of those options is healthy or sustainable.
The problem with the first is sheer effort; we'll never be able to keep up.
The problem with the second is that it bloats Nixpkgs and, more importantly, hurts reproducibility: if someone wants to update that generated code, it is unclear how.
All these mean that potential users coming from this new model of development find Nix / Nixpkgs cumbersome and unsuited to their needs.

**Review comment on lines +24 to +28 (Member):**

> The problem with the second is bloating Nixpkgs but more importantly reproducibility: If someone wants to update that generated code it is unclear how.

I think this needs an explanation as to why reproducibility affects the simplicity of updating packages.

I don't see a clear relationship between reproducibility and the difficulty of updating a package.
We can make update mechanisms that are very easy to use without guaranteeing reproducibility, or vice versa have reproducible mechanisms which are hard to use.

For example, an update interface like `nix run .#my-packages.update` would be very simple to use, but its result is not necessarily reproducible.

Vice versa, an update mechanism can be perfectly reproducible, but if it lacks a good UI, and instead requires the user to read the source code of nixpkgs to find the right commands or attributes to trigger it, it is difficult to use despite being reproducible.

Currently the RFC seems to focus on reproducibility only. I think the statement about the UX improvement should either be dropped, or an explanation added as to how this is achieved.

> All these mean that potential users coming from this new model of development find Nix / Nixpkgs cumbersome and unsuited to their needs.

I think it makes sense to clarify which users are targeted by the RFC. Is it nixpkgs maintainers or downstream users or both?

What downstream users need is a way to create, update and modify package sets to their needs. For them, these interactions need to be simple, more than reproducible. The modifications become manifested in the code anyway, and reproducibility is ensured from there on.
Increasing the complexity of tools in order to comply with the RFC can introduce a cost for these users, with no added benefit.

**Reply (Member):**

> I think this needs an explanation as to why reproducibility affects simplicity of updating the packages.

In the sense that when it is easy to reproduce the state of the generated code as present in the repository, it is also easier to effectuate desired changes. If it is, for example, not possible to update generated code without pulling in all available updates, it may be harder to change just one specific, little thing.

This would be my account of it, but perhaps the paragraph can be improved.

> Is it nixpkgs maintainers

Yes.

**Reply (Member):**

I agree. This part is not talking about "reproducibility" as a Nix concept: it talks about extending, updating or modifying existing code. The wording might require some polish here.


The lowest-hanging fruit is to systematize our generated code.
We should ensure anyone can update the generated code, which means it should be built in derivations, not in some ad-hoc way.
In short, we should apply the same level of rigour that we do for packages themselves to generated code.

# Detailed design
[design]: #detailed-design

## Nixpkgs

1. Establish the policy that all generated code in nixpkgs must be produced by a derivation.
The derivation should be built by CI (and so exposed as a Nixpkgs attribute in some fashion).

2. Implement script(s) for maintainers which automatically build these derivations and vendor their results to the appropriate places.
Running such scripts should be sufficient to regenerate all generated code in Nixpkgs.

Greenfield tooling should not be merged unless it complies with the policy from day one.
Existing non-compliant tooling doesn't need to be ripped out of Nixpkgs, but the "grace period" in which it is brought into compliance should be bounded.

3. Ensure via CI that the vendored generated code is exactly what running the scripts produces.
This check should be one of the "channel blocking" CI jobs.
A minimal sketch of points 1 and 3 is given below.
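
To make this concrete, here is a rough sketch of what points 1 and 3 could look like; `some-lang2nix-tool`, the `./generated` path, and the attribute names are all hypothetical, not something this RFC prescribes:

```nix
{ runCommand, some-lang2nix-tool }:

rec {
  # Point 1: the generated code is produced by an ordinary derivation,
  # so CI can build it like any other Nixpkgs attribute.
  generatedSources = runCommand "generated-sources" {
    nativeBuildInputs = [ some-lang2nix-tool ];
  } ''
    mkdir -p $out
    # The tool runs purely inside the sandbox; any network inputs must
    # come in via fixed-output derivations (see "Impurities" below).
    some-lang2nix-tool --output $out
  '';

  # Point 3: a check that fails whenever the vendored in-tree copy
  # drifts from what the generator produces.
  checkUpToDate = runCommand "check-generated-up-to-date" { } ''
    diff -r ${./generated} ${generatedSources}
    touch $out
  '';
}
```

The point-2 vendoring script would then just build `generatedSources` and copy its output over `./generated` in the working tree.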

# Examples and Interactions
[examples-and-interactions]: #examples-and-interactions

## Impurities

Many `lang2nix`-type tools have impure steps today.
Since these tools must only be invoked inside the derivations that generate code, the impure inputs must be obtained via fixed-output derivations.
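
For instance, an upstream package index could be brought into the sandbox as a fixed-output fetch along these lines (the URL is a placeholder):

```nix
{ fetchurl, lib }:

# A fixed-output derivation: the builder is granted network access
# precisely because the output hash is pinned up front, which keeps
# the result reproducible.
fetchurl {
  url = "https://example.org/package-index.json";  # placeholder
  hash = lib.fakeHash;  # replace with the real content hash on update
}
```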

**Review comment (Member):**

I would strongly recommend against fixed-output drvs; they lead to a situation where it's trivial to run into a reproducibility problem. The prefetching step mentioned a few lines later is the way to go, since that way only one maintainer generates the file and we are not at the mercy of the $version resolver being reproducible.

**Reply (Member):**

Version solvers etc. should be executed inside the normal derivation. Fixed-output derivations would be package repository indices etc.

**Reply — @Ericson2314 (Member, author), Aug 26, 2022:**

@Profpatsch I am a bit confused about what you mean. I mean fixed-output derivations that just fetch things, not fixed-output derivations that do a lot of arbitrary "vendoring" work.

**Reply (Member):**

Okay, but then I'd just call them "fixed output fetchers" here, to make the distinction, because you can also use FODs for doing arbitrary stuff.

**Reply (Member):**

> we are not at the mercy of $version resolver being reproducible.

Aren't we at its mercy anyway, no matter how we implement it, vendored FOD or not? We somehow need to pin dependency versions, and that part must be reproducible.

**Reply — @sternenseemann (Member), Sep 13, 2022:**

I don't think that is necessary; in most cases the input-addressed approach will be okay: we have fixed-output data files as an input (which are updated via an "impure step", i.e. an update script) and run a solver in a regular derivation (which is hopefully deterministic!).

For taming Quicklisp I don't think impure derivations are useful, since they are still sandboxed, aren't they?

**Reply (Member):**

Is the Quicklisp lockfile fully self-contained? The ideal case is when the lockfile can be converted to Nix because it has all the information we need, and the hashes are compatible with Nix. In practice, there are often missing pieces that require querying the Internet.

For example, `yarn.lock` files are fully self-contained unless one of the dependencies points to a git repo. Then you have to go fetch that repo and calculate the Nix hash out of it.

That's where you would typically reach for FODs, `__noChroot` or the new "impure" attribute.

**Reply (Member):**

For Quicklisp we look at the packages themselves to check the dependency information (the index contains some claims about dependencies that are usually correct, but we have seen some exceptions, so we don't trust it blindly).

I don't think Quicklisp has lockfiles: the normal mode of operation works within a single globally coordinated (and tested) version set.

**Reply — @DavHau (Member), Feb 23, 2023:**

@Ericson2314 I think your suggested change from above is good and should be applied.

The approach of running a version resolver inside a derivation is not practical for some ecosystems, usually because there is no package index readily available to feed into the solver derivation.

As mentioned above, some ecosystems expose dependency info only inside package sources, not offering a central index.

We can build our own package index by crawling all packages regularly, but the upstream resolver will most likely be incompatible with our custom package index, so we then have to invent our own dependency resolver as well.

I already went through all of that when building mach-nix, a Nix code generator for Python that uses a custom Python package index, pypi-deps-db.

My personal experience is that this approach introduces significant maintenance overhead. Not only must the crawler/db/resolver be compatible with the ecosystem's complete history of standardization, it must also adopt all upcoming standards so as not to fall out of sync.

Therefore I think it is a good idea for the RFC to allow a minimal amount of impure operations. This simply allows upstream tooling to be used to lock down the inputs.

**Reply (Member):**

It would be nice to clarify what the dependency tree should look like. Is my intuition here correct?

```
resolver (pure) - run result (impure) - fetcher (pure) -|- dep1 (pure) -|- package deps (pure)
                                                        |- dep2 (pure) -|
                                                        |- dep3 (pure) -|
```

EDIT: I think this page from the dream2nix documentation describes exactly what we want to achieve, does it not? If so, it would be nice to reference it.

This might require changes to those tools to separate the pure work from the impure gathering steps.

Additionally, as @7c6f434c points out, some upstream tooling thinks it is being pure, but the "lock files" (or similar) pinning mechanism it provides isn't up to the task for Nix's purposes.

**Review comment — @chayleaf, Oct 17, 2023:**

I can give a specific example with Gradle lockfiles. A lockfile specifies the package and its version, but not its hash. There is a separate mechanism for storing dependencies' hashes, but due to the way Gradle works, the same version may be fetched from a different repository in a different part of the build graph, and may then have a different hash; yet Gradle only stores one hash per dependency.

Quicklisp, for example, uses a "weird mix of MD5 constraints and SHA1 constraints" that isn't really up to the task.

**Review comment (Member):**

It's not clear what issue is being pointed out. Is it that the hashes are not cryptographically strong, that they were forced out of the Nix ecosystem, or a mix of the two?

**Reply (Member, author):**

Most generally, any hash we cannot cajole into a store path or fixed-output derivation.

It needs to be a hash algorithm we support, and (assuming no one else supports NAR) a hash of a flat file (such as a tarball).

**Reply (Member):**

We can just add support for the ones we don't support right now, but I do think it's mainly about cryptographic strength. We don't want to support non-cryptographic hashes, yet, ironically, we support SHA1.

Another example would be using git commit hashes, which, since we don't want to download the whole history, are not good enough on their own.

**Review comment (Member):**

This isn't true. Given a commit, you can fetch the tree corresponding to that commit without fetching any of the parent commits (or their data).

**Reply (Member):**

And just to make it explicit: indeed, the downloaded content can be verified (the previous history enters the hash only via the hash of the parent commit, which is small).

**Reply (Member, author):**

@L-as Err, I mean that the commit hash doesn't work as a fixed-output hash. Yes, it does identify what to download, but it isn't sufficient to verify the downloaded tree in isolation.

@7c6f434c Yes, I mentioned such Merkle inclusion proofs as future work in a comment on the IPFS RFC: https://github.com/NixOS/rfcs/pull/133/files#r957445525. But I think they are well out of scope for this one.

Any idea how I might make this clearer?

**Reply (Member):**

Removing the part about downloading the whole history and saying that git commit hashes are not supported by Nix would be an improvement. If you want to give a justification for why they probably won't be supported really soon, say they are more complicated to verify than normal content hashes. I just pointed out that the whole-history justification doesn't hold because it's not completely true; I agree that the extra complexity (and git-specificity) justifies the lack of Nix support.


A concrete example of a change that would bring such tooling into compliance is "prefetching" to build a map from insufficient upstream-tool keys (say, a pair of a name and a lousy hash) to higher-quality hashes for fixed-output derivations.
The prefetching step would be run impurely but do as little work as possible, and the remaining bulk of the work would be done purely in derivations.
(A sketch of what such a map might look like is given below.)
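
As a purely hypothetical illustration, the generated map might look like this (package names, URLs, and hashes are all placeholders):

```nix
# deps-hashes.nix (hypothetical): written by the impure "prefetch"
# step and committed; afterwards consumed purely by fixed-output
# fetchers inside the code-generating derivations.
{
  "left-pad-1.3.0" = {
    url = "https://registry.example.org/left-pad-1.3.0.tgz";
    hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";  # placeholder
  };
  "micro-lib-2.0.1" = {
    url = "https://registry.example.org/micro-lib-2.0.1.tgz";
    hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";  # placeholder
  };
}
```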

**Reply (Member):**

👍

**Review comment — @ghost, Sep 13, 2022:**

> We'd actually only prefetch the crates.io index

@sternenseemann, fetching the crates.io index is pure (it's an FOD). So if that's an example of "prefetching", why does the text describe "prefetching" as being "run impurely"?

This terminology is a landmine that we will regret. If you prefer something besides "map-building" then great, go with that. Just please don't overload an already heavily-used term to mean the opposite (i.e. impure) of what it currently means.

**Reply (Member, author):**

I am convinced; let's get rid of "prefetch". @sternenseemann, among other things, the "pre" bases the terminology on statefulness. This is sort of confusing and generally not how we want people to think about Nix things, so an alternative non-state-oriented term seems like an upgrade to me.


Updating fixed output hashes and similar --- including running such a prefetch script as described above --- is, however, perfectly normal and not affected by this RFC.
Such updates, as opposed to regenerations of Nix code, can be performed by hand, or with update bots like today.
The update bots would just need to learn to run the regeneration script (or risk failing CI because the vendored generated code is caught as being out of date).

## Idempotency and bootstrapping

The test that the generated sources are up to date will have to work by regenerating those generated sources and then taking a diff.
That means the regeneration process has to be idempotent, in that running it twice is the same as running it once.

This is a bit trickier than it sounds, because many `lang2nix` tools rely on their own output.
E.g. the Nix packaging for `cabal2nix` is itself generated with `cabal2nix`.
Sane setups should work fine --- after all, it would be really weird if two valid builds of `cabal2nix` behaved so differently as to generate different code --- but it is still an issue worth being aware of.

(That we continue to vendor code does at least "unroll" the bootstrapping to avoid issues that we would have with, say, import-from-derivation alone.
The vendored code works analogously to the prebuilt bootstrapping tools in this case.)

@sternenseemann reminds me that some `lang2nix` tools might pin a Nixpkgs today, for various reasons.
But in this plan the tools must be built with the current Nixpkgs in the CI job ensuring sources are up to date.
`lang2nix` tools must therefore be kept continuously working when built against the latest Nixpkgs.

## What CI to use?

The easiest, and most important, foundational step is just to add a regular `release.nix` job for Hydra to test.
We might, however, want to catch these issues earlier at PR merge time, with ofborg or GitHub actions.
That is fine too.

## Who does the work?

In the short term, this is a decent chunk of work for `lang2nix` tool authors and language-specific package maintainers, who must work to ensure their tools and workflows are brought into line with this policy.

**Review comment:**

This is... not really helping much.

We can't just say "lang2nix tool authors and language maintainers, I dunno" will do it. This is not an actionable point. I think the point of RFCs is to provide a clear path forward; shifting the burden of complying with an "idea" that is not even standardized onto the community will get us nowhere.

Let's start talking with @DavHau, a lot. He's the biggest player out there. He has spent quite a while developing the solution, and no less time working on formal definitions that are of utmost importance here: he documented the architecture and its components in detail, along with extension details, and he has quite detailed API documentation in the works. This is outstanding groundwork that exists and is working already; let's go ahead and discuss the critical details while we're in the discussion stage. Going through with a PR that states "the community will figure it out somehow" will at best leave @DavHau to do all the work himself without upstream support; at worst it will render his work obsolete because we decided to do something else long after the discussion phase ended.

I propose that @DavHau become a shepherd. He might not have the time for it, or might decline for any other reason. That is fine. But we NEED his input on this matter; he's the most competent person in the room on the topic.

We should also consider the lengths we are willing to go to. At the moment of writing, 495/500 npm packages are built correctly using dream2nix. That sounds good enough to deserve some serious evaluation regarding upstreaming the effort. I propose starting a discussion with the maintainers of nodePackages about it.

That won't always be fun!

On the flip side, a major cost of today's situation is that, since so many of the workflows are an "oral tradition" of the maintainers and not fully reproducible, one-off contributors often need a lot of hand-holding.
@sternenseemann tells me he must spend a lot of manual time shepherding PRs, because the PR authors are unable to jump through the hoops themselves.

# Drawbacks
[drawbacks]: #drawbacks

This is now a very conservative RFC, so I do not think there are any drawbacks to the goals themselves.

Bringing our tools into compliance with this policy will take effort, and of course that effort could be spent elsewhere, so there is opportunity cost to be aware of.
But given the general level of concern over the sustainability of Nixpkgs, I think the benefits are worth the costs.

# Alternatives
[alternatives]: #alternatives

None were good this time; we had other ideas, but they are reframed as *possible* future work.
It is unclear which of the alternative "2nd steps" is better, or whether we ought to try to jump straight ahead to the "3rd step".

The plan proposed here is unquestionably the most conservative one, and basically a prerequisite of all the others --- a first step no matter what we plan to do afterwards.

**Review comment (Member):**

Using standard IFD might be worth including.

**Reply (Member, author):**

It is below, no? Because Hydra will stall out with no good GUI feedback under IFD, I don't think it is a realistic proposal yet :/

# Unresolved questions
[unresolved]: #unresolved-questions

How long should the "grace period" for bringing existing tooling into compliance be?

**Review comment:**

As I proposed above, I think we should begin with a detailed discussion with the dream2nix maintainer. After that, the question resolves itself: the "grace period" is as long as he and the relevant nixpkgs maintainers agree on.


# Future work
[future]: #future-work

## A possible 2nd step: Vendor generated code "out of tree"

The first issue that remains after this RFC is that generated code still bloats the Nixpkgs history.
It would be nice to get it "out of tree" (outside the Nixpkgs repo) so this is no longer the case.
In our shepherd discussions we had two ideas for how this might proceed.

It was tempting to go straight to proposing one of these as part of the RFC proper,
but they both contained enough hard-to-surmount issues that we figured it was better to start with something more conservative first.

### Alternative 1: Dump in other repo and fetch it

We could opt to offload all generated code into a separate repository which would become an optional additional input to nixpkgs.
This could be done via an extra `fetchTarball`, possibly a (somehow synced) channel or, in the presence of experimental features, a flake input.
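
A rough sketch of how such an extra input might be consumed (URL, hash, and file name are all invented):

```nix
# Hypothetical: Nixpkgs pins its out-of-tree generated code and
# imports it during evaluation instead of vendoring it in the repo.
let
  generatedCode = builtins.fetchTarball {
    url = "https://releases.example.org/nixpkgs-generated.tar.gz";  # placeholder
    sha256 = "0000000000000000000000000000000000000000000000000000";  # placeholder
  };
in
  import (generatedCode + "/haskell-packages.nix")
```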

#### Drawbacks

- This would be a truly breaking change to the nixpkgs user interface:
either an additional input would need to be provided, or it would need to be fetched (which wouldn't interact well with restrict-eval).

- Generated code would become second class, as the extra input would need to be optional for this reason.
This is problematic for central packages that already use code generation today (pandoc, cachix, …).

- Similar bootstrapping problems to the other alternative below: new generated code needs nixpkgs and a previous version of the generated code.

- `builtins.fetch*` is a nuisance to deal with at the moment and would probably need to be improved to make this work.
E.g. GC-rooting this evaluation-only dependency could prove tricky without changes to Nix.

- Extra bureaucracy would be involved with updating the generated repository and the reference to it in nixpkgs.
Additionally, special support in CI would be required for this.

### Alternative 2: Nixpkgs itself becomes a derivation output

This alternative implementation was proposed by @L-as at the meeting.
The idea is that nixpkgs would become a derivation that builds a “regular” nixpkgs source tree by augmenting files available statically with code generation.

The upside of this would be that there would only be one instance of IFD that can ever happen, namely when the source tree is built.
The produced store path then would require no IFD, and it would be obvious what relates to IFD and what doesn't.

In practice, IFD would not be necessary for users of nixpkgs if we can design a mechanism that allows the dynamically produced nixpkgs source tree to be used as a channel.
Then the IFD would only need to be executed when working on nixpkgs.

#### Drawbacks

- This approach creates a bootstrapping problem for the entirety of nixpkgs, not just for the IFD parts.
It would be necessary to build the new nixpkgs source tree using an old version of the nixpkgs source tree.

**Review comment (Member):**

This sounds awful. I hate how manual bootstrap-tools are; doing it for the whole of Nixpkgs feels even worse.

The question, though: why can't only the generated parts of Nixpkgs be bootstrapped using the rest? That would be better than either alternative, IMHO.
In fact, I wanted to do this for all-packages.nix in NixOS/nixpkgs#50643 (4 years before #140 and NixOS/nixpkgs#237439).

The ultimate idea there (which was abandoned after the initial pushback) was to eventually generate most of all-packages.nix by walking the Nixpkgs tree via a simple shell script.
So, stdenv and the knot of packages supporting it would have been left in a tiny hand-written equivalent of all-packages.nix, and the rest of all-packages.nix would have been auto-generated.

**Reply — @sternenseemann (Member), Jul 18, 2023:**

> Why can't only the generated parts of Nixpkgs be bootstrapped using the rest?

Because this RFC should pose an incremental step, not a grand new design. In particular, the goal is to integrate all current codegen into this scheme while keeping the changes this necessitates to a minimum. Codegen tools in nixpkgs are already packaged today using code generated by them, so the bootstrap problem already exists and cannot be alleviated easily.

**Reply (Member):**

> so the bootstrap problem already exists and can not be alleviated easily.

Well, yes, but there would be quite a difference between bootstrapping from an autogenerated .nix file that depends on the current version of Nixpkgs (and which you can still read and understand, even if it is technically programmatically generated) and bootstrapping from an autogenerated .nix file that depends on an outdated version of Nixpkgs, which bootstraps from an autogenerated .nix file that depends on an even more outdated version of Nixpkgs, which bootstraps....

**Reply (Member):**

That's why this is listed as an alternative to the actual proposal, which does not require that!

This could either be done using a fixed “nixpkgs bootstrap tarball” which occasionally needs to be bumped manually as code generation tools require newer dependencies, or by pulling in the latest nixpkgs source tree produced by e.g. Hydra.
The latter approach of course runs the risk of getting stuck at a bad nixpkgs revision which is unable to build the next ones fixing the problem.

- Working on nixpkgs may involve more friction: It'd require a bootstrap nixpkgs to be available and executing the IFD for the nixpkgs source tree, likely involving hundreds of derivations.

- Hydra jobsets would need to be sequenced: First the new nixpkgs source tree would need to be built before it can be passed on to the regular `nixpkgs:trunk`, `nixos:trunk-combined` etc. jobsets.

- Channel release would change significantly: Instead of having a nixpkgs git revision from which a channel tarball is produced (mostly by adding version information to the tree), a checkout of nixpkgs would produce a store path from which the channel tarball would be produced.
This could especially pose a problem for the experimental Flakes feature which currently (to my knowledge) assumes that inputs are git repositories.

## A possible 3rd step: Import from derivation

Even if we store the generated sources outside of the tree, we are still doing the tedious work of semi-manually maintaining a build cache (this time of Nix code).
Isn't that what Nix itself is for!

"import from derivation" is a technique where Nix code can simply import the result of a build, with no vendoring generated code in-tree or out-of-tree needed.

There are a number of implementation issues with it, however, that mean we can't simply enable it on `hydra.nixos.org` today.
We have some "low tech" mitigations that were the original body of this RFC,
but they still require changing tools (Hydra), which adds latency and risk to the project.

## Getting upstream tools to agree on how to pin source code

A source of frustration outlined in the [Impurities](#impurities) section is when upstream tools think they are pinning dependencies down exactly, but nonetheless do so in a way that isn't good enough for our purposes.
A long-standing goal of mine is to try to communicate these concerns back upstream, and to nudge everyone toward agreeing on a common definition of what pinned dependencies look like.

I think policies such as this RFC proposes will allow us to get our `lang2nix` infrastructure in a state not only more legible to ourselves (Nix users and contributors) but also to upstream developers who won't want to spend too long investigating what exactly our requirements are.
That will make such concerns easier to communicate, and I think unlock the gradual convergence on a standard.
That's the hope at least!

## Reaching developers, more broadly

This proposal is far from the final decision on how packages from language-specific ecosystems should be dealt with.
I make no predictions for the far future; it is possible we will eventually land on something completely different.

However, I think this RFC will help us reach a very big milestone where the `lang2nix` ecosystem and Nixpkgs talk to each other a bit better, rather than Nixpkgs saying things but not listening to a chaotic and disorganized `lang2nix` ecosystem.
This culture shift, I think, will be the main and most important legacy of this RFC.

A lot of developers come to the Nix ecosystem and find that the tools work great for sysadmin-y or power-user-y things (NixOS, home-manager, etc.), but that the development experience is not nearly as clearly better than that of language-specific tools.
(I prefer it, but the tradeoffs are very complex.)
With the new both-ways communication described above, I think we'll have a huge leg up in refining best practices, so that we ultimately have better development workflows and retain these people better.