pkgs is a mess #107539

Closed
Atemu opened this issue Dec 24, 2020 · 32 comments
Labels: 9.needs: community feedback; significant (Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc.)

@Atemu
Member

Atemu commented Dec 24, 2020

The mess

There are a ton of packages which is fine of course (that's what it's for) but there are also a lot of things that aren't packages. Most notably:

  • Package subsets (e.g. pythonPackages, haskellPackages)
  • Functions (e.g. pkgs.runCommand)
  • Function subsets
  • Old, broken stuff that doesn't even eval (not even with tryEval in some extreme cases :/)

Keeping track of package subsets has already caused a few headaches (see #102508) and this also just doesn't feel right. We have the power to be declarative and neat thanks to our functional idioms, we can do better.

Fixing it

I was thinking of pulling subsets out of all-packages, into a new attrset and then "overlaying" it back on top of pkgs to get the same pkgs set we have right now (plus maybe a "subsets" attribute).
Functions should probably get the same treatment or maybe even moved to lib.
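
A minimal sketch of what that recombination could look like. The set names `packages`, `subsets` and `functions` are hypothetical, and the toy attrsets only stand in for real derivations; nothing here is existing Nixpkgs code.

```nix
# Sketch only: toy values stand in for real packages. The names `packages`,
# `subsets` and `functions` are assumptions, not existing Nixpkgs attributes.
let
  packages = {
    hello = { type = "derivation"; name = "hello"; };
    firefox = { type = "derivation"; name = "firefox"; };
  };

  subsets = {
    pythonPackages = {
      requests = { type = "derivation"; name = "python-requests"; };
    };
  };

  functions = {
    runCommand = name: script: { type = "derivation"; inherit name script; };
  };

  # The compatibility view: the clean sets merged back into one flat set,
  # plus a `subsets` attribute that points at the subset index itself.
  pkgs = packages // subsets // functions // { inherit subsets; };
in
{
  # Same top-level names as a "messy" set would have ...
  all = builtins.attrNames pkgs;
  # ... while "just the packages" is available without any filtering.
  onlyPackages = builtins.attrNames packages;
}
```

Because the flat set is derived from the clean ones, each attribute is declared exactly once and the merged view cannot drift out of sync.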

Some things I'd like

I'd like to get rid of pkgs.AAAAAASomeThingsFailToEvaluate. Everything under pkgs should at least evaluate far enough to be type-checkable. Completely broken or highly experimental stuff belongs elsewhere IMO.

cc @garbas

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixpkgs-has-been-the-largest-repository-for-months/10667/1

@7c6f434c
Member

7c6f434c commented Dec 24, 2020 via email

@kira-bruneau
Contributor

kira-bruneau commented Dec 24, 2020

I think I might be missing something, but if everything that's not a package is going to be overlaid back into pkgs, what's the point of adding a layer of indirection?

I would still want to access non-packages when importing nixpkgs directly. For example:

{ pkgs ? import <nixpkgs> {} }:

pkgs.lib

EDIT: Would the set that only contains packages be available through pkgs.pkgs? Would something like pkgs.pkgsi686Linux also only have packages?

@Atemu
Member Author

Atemu commented Dec 25, 2020

You will still not be able to nix-instantiate Nixpkgs in the current sense, because on any given platform some packages fail to evaluate if only because they are for a different platform.

They fail in a well-behaved manner and even print a helpful message telling me why they can't be evaluated.

I can also still type check it (e.g. lib.isDerivation pkgs.firefox works on my mac), have a look at its meta to find out whether it's supported on my platform, look at its description and all its other attributes like inputs, source etc. etc. Basically everything works but instantiating it fully.
I'm not sure how precisely this works but I think you're allowed to eval the entire attrset but aren't allowed to make a derivation out of it.

Contrast this to things I can't even begin to touch with tryEval like pkgs.androidndkPkgs_9_0, yeah...
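
To illustrate the difference between the two failure modes: `builtins.tryEval` only catches errors raised by `throw` and failed `assert`s, not `abort`. The sketch below uses made-up attributes; what actually makes the attribute mentioned above fail is not stated here, so `abort` is only an assumed stand-in for "cannot be touched".

```nix
let
  # Fails in a well-behaved way: the error can be caught and inspected.
  wellBehaved = throw "not supported on this platform";
  # Hypothetical hard failure: `abort` cannot be caught by tryEval.
  hardFailure = abort "cannot even be inspected";
in
{
  caught = builtins.tryEval wellBehaved; # { success = false; value = false; }

  # Uncommenting the next line aborts the whole evaluation instead of
  # yielding a result that could be filtered out:
  # uncaught = builtins.tryEval hardFailure;
}
```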

This was also more of a "stretch goal", something I'd like to see on top of the other proposed changes. Should've worded that more clearly.

if everything that's not a package is going to be overlaid back into pkgs, what's the point of adding a layer of indirection?

That there is a trivial way to make an attrset of just packages without all the other stuff. Currently, there is no way to know whether something under pkgs is a package, subset, function or can even be evaluated.

Instead of only having a messy set, I'd like a bunch of clean sets with clear purposes and boundaries between them. For the sake of compatibility and convenience, I'd then like to re-create the messy set by combining the clean ones.

What we gain by that is having these clean sets and using them to organise our huge amount of packages moving forward so that we don't go back to having such a mess.

I would still want to access non-packages when importing nixpkgs directly.

Good point. Things like lib or other entry points to non-package sets should remain. They should be declared outside the "real" pkgs set though of course and only "overlaid" for convenience.

Would the set that only contains packages be available through pkgs.pkgs?

I don't think that'd work: I think pkgs is nixpkgs itself, and you need nixpkgs to have a pkgs attribute to reach the top-level.

Have a look at pkgs.pkgs.pkgs.pkgs.pkgs.pkgs.pkgs.pkgs.pkgs. ;)

@7c6f434c
Member

7c6f434c commented Dec 25, 2020 via email

@xaverdh
Contributor

xaverdh commented Dec 25, 2020

I support the idea of having a cleaner structured interface like that.
As mentioned already, this would make it easy to walk just the "proper" packages for packages.json, but it would also make things like wrapper functions more easily discoverable. Currently things are very hard to find, especially if you don't even know they exist in the first place (thinking of wrapMpv and friends).

@Atemu
Member Author

Atemu commented Dec 25, 2020

Note that the attribute is added specifically because the helpful message is super unhelpful when you evaluate the entire nixpkgs because of messing up arguments and have no idea why some package would even be evaluated. The intentional failure actually mentions the idea of entire Nixpkgs being evaluated, so it is useful.

We could add an AAAAfullEvaluationWarning = builtins.trace "Warning: You're evaluating the full Nixpkgs! This can take a lot of resources and time!" { }; to "catch" cases like that, but I'd rather come to edge cases like these later and first lay out a plan on how we should even begin to improve things.
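
A small sketch of how such a warning attribute would behave, using a toy two-attribute set rather than Nixpkgs. `builtins.trace` prints its message only when the value is forced, so normal use of individual packages stays silent.

```nix
let
  # Toy stand-in for pkgs; only the warning attribute is taken from the text.
  pkgs = {
    AAAAfullEvaluationWarning =
      builtins.trace "Warning: You're evaluating the full Nixpkgs!" { };
    hello = { name = "hello"; };
  };
in
{
  # Touching a single package never forces the warning attribute.
  single = pkgs.hello.name;
  # Listing names does not force values either, so still no trace.
  names = builtins.attrNames pkgs;
  # Forcing every value prints the trace and then continues normally.
  everything = builtins.deepSeq pkgs "evaluated everything";
}
```

Unlike a failing attribute, this keeps the set fully evaluable: the walk prints a warning and carries on.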

A goal I'd like to reach in the process of that, because it's somewhat related, is to make nixpkgs fully evaluable with only a few well-defined exceptions that you simply exclude, but that can come later. Should've been clearer on that.

make things like wrapper functions and the like more easily discoverable.

Very good point! I only learned about such things because I had some closer looks at the pkgs set while adding packages or working on packages.json but I can't imagine non-Nixpkgs contributors ever doing that.

@7c6f434c
Member

7c6f434c commented Dec 25, 2020 via email

@Atemu
Member Author

Atemu commented Dec 27, 2020

when you are doing a thing that will eventually fail (unfree/wrong platform will not go away)

The problem isn't that it's doing that, it's how. It's a dumb abort as soon as you try to touch it.

Packages that aren't available for your platform, are unfree, broken or have some other well-defined defect don't behave like that.

A simple well-defined exception of darwin?

I think you're misunderstanding me, I don't want everything under pkgs to be a derivation that can be instantiated or even built, I want to get pkgs into a state where I can work with its attributes without aborting unexpectedly (or worse).

Quote:

Everything under pkgs should at least evaluate far enough to be type-checkable.

Something simple like filterAttrs (n: v: !v.meta.broken) pkgs should be possible in a well-behaved set of pkgs IMO.
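
For illustration, here is how that filter behaves on a well-behaved toy set. `filterAttrs` is re-implemented with builtins as a stand-in for `lib.filterAttrs` so the snippet needs no imports, and the `or false` default for packages without a `broken` flag is an assumption added for robustness.

```nix
let
  # Minimal stand-in for lib.filterAttrs, so the example needs no imports.
  filterAttrs = pred: set:
    builtins.listToAttrs (builtins.concatMap
      (name:
        let value = set.${name};
        in if pred name value then [ { inherit name value; } ] else [ ])
      (builtins.attrNames set));

  # Toy package set: every attribute can be inspected without failing.
  pkgs = {
    hello = { meta.broken = false; };
    oldTool = { meta.broken = true; };
    editor = { meta = { }; }; # no `broken` flag: treated as not broken
  };
in
# Evaluates to [ "editor" "hello" ]
builtins.attrNames (filterAttrs (n: v: !(v.meta.broken or false)) pkgs)
```

The same expression on a set containing a function or an aborting attribute would fail, which is the behaviour being objected to.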

Note that a ton of such wrappers need to be maintained together with some package

Why would that be an issue?

Wrappers/generators go to their respective sets and the packages set can then freely use them to make actual packages.

So is your first step to find what should be in pkgs-lib?

I think a good first step would be to define what categories of user-visible things there are in nixpkgs.

What I can think of off the top of my head:

  • constants: just values (e.g. lib.trivial.release)
  • pkgs: constants that can be instantiated to a single drv when not defective and have some metadata (pname, version, meta)
  • functions: lambdas with a name
  • generators: functions that eval to a pkg (or set of pkgs?) given some arguments (e.g. pkgs.runCommand, fetchers)
  • wrappers: generators that take a pkg and return a new pkg that extends the old one
  • sets of all the above

What else is there?
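
A rough sketch of how far these categories can be told apart by type-checking alone. The `classify` helper and the toy values are hypothetical; the derivation test mirrors the `type == "derivation"` check that `lib.isDerivation` relies on.

```nix
let
  # Hypothetical classifier based purely on the shape of a value.
  classify = v:
    if builtins.isFunction v then
      "function" # generators and wrappers look the same from the outside
    else if builtins.isAttrs v then
      (if (v.type or null) == "derivation" then "pkg" else "set")
    else
      "constant";

  # Toy values, one per category.
  examples = {
    release = "toy-release-string";
    hello = { type = "derivation"; name = "hello"; };
    runCommand = name: script: { type = "derivation"; inherit name; };
    pythonPackages = { };
  };
in
# { hello = "pkg"; pythonPackages = "set"; release = "constant"; runCommand = "function"; }
builtins.mapAttrs (_: classify) examples
```

Note that a plain function, a generator and a wrapper cannot be distinguished this way without calling them, which is one argument for recording the category by where something is declared.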

After we've done that, we think about how to integrate them into the nixpkgs data structure, move around all the things that are in the wrong place and then throw them together to get back a superset of the current pkgs.

@7c6f434c
Member

7c6f434c commented Dec 27, 2020 via email

@Atemu
Member Author

Atemu commented Dec 31, 2020

We still have some packages that behave like that

I think as long as we have those it is fine for the fail-early attribute to do the immediate abort;

That is precisely what I'd like to see changed in that "stretch goal". Quote:

Completely broken or highly experimental stuff belongs elsewhere IMO.

If those packages didn't exist or were confined to well-defined places, we wouldn't need AAAAAASomeThingsFailToEvaluate in the main pkgs set.

filtering for being a function

Assuming we can type-check all attrs, yes, that'd work.

It wouldn't be particularly tidy, though, and it would be more complicated than it needs to be; I'm not a fan of that.

filtering or recursion for being a sub-package-set

How would you find out whether something is a set of packages? Typecheck recursively?
That'd fail at pkgs.broken.AAAAAASomeThingsFailToEvaluate.

Also, some sets in pkgs contain packages, wrappers, generators and even functions at the same time. What kind of set are those?

Here functions overlap with generators and wrappers overlap half with generators and half with pkgs, right?

The wave function collapsing inside my flash drive also overlaps with "it stores cat pictures" but that doesn't make it a helpful description.

@7c6f434c
Member

7c6f434c commented Dec 31, 2020 via email

@FRidh
Member

FRidh commented Jan 2, 2021

I was thinking of pulling subsets out of all-packages, into a new attrset and then "overlaying" it back on top of pkgs to get the same pkgs set we have right now (plus maybe a "subsets" attribute).

Sounds good to me, minus the subsets attribute. Annoying bit is to keep attributes in the package set and subsets set aligned. As an alternative, it is possible to just group the sets together at for example the bottom of the file. Given the size of the file, I don't mind having it separate though.

I can see that in the future we want to have this in a separate set anyway because of the splicing of sub package sets (won't go into details here).

Functions should probably get the same treatment or maybe even moved to lib.

Pure functions, yes. Functions that build derivations, no.

Regarding grouping of items within subsets, that maybe should be part of NixOS/rfcs#83.

@FRidh
Member

FRidh commented Jan 3, 2021

I was thinking of pulling subsets out of all-packages, into a new attrset and then "overlaying" it back on top of pkgs to get the same pkgs set we have right now

We should also have a release-*.nix that has all these sets as attributes, so we can easily test-evaluate a package set. Writing this as I am tracking a recursion error on master...

@fgaz
Member

fgaz commented Jan 11, 2021

Related: #7866, #8801, #39169 (comment) (somewhat), #39561

@tadfisher
Contributor

Package subsets (e.g. pythonPackages, haskellPackages)

Most of these are generated package sets pulled from some external repository. They should all be migrated to separate repositories, perhaps in nix-community, and flake-ified. CI can then automate updates without requiring manual review.

Obviously this would need to wait until flakes are in Nix stable.

@bqv
Contributor

bqv commented Feb 15, 2021

Cf. the emacs package sets

@7c6f434c
Member

7c6f434c commented Feb 15, 2021 via email

@tadfisher
Contributor

Separated of corresponding language implementations, to maximise the fun of coordinating CI.

Can you elaborate? We're talking about package repositories; presumably the inputs to these flakes would be the same language implementations we have currently in nixpkgs. The main coordination point would be test suites to ensure updates don't occur when a representative sample of packages fail to build.

Except for every language ecosystem I have heard about that applies various fix-ups manually on every large update

Can you provide examples?

@7c6f434c
Member

7c6f434c commented Feb 15, 2021 via email

@tadfisher
Contributor

So now to update Python you need to make the PythonPackages flake unbuildable without locking, then separately fix it?

That is more or less the status quo, is it not? Except we just skip the "run bespoke shell script in nixpkgs" step to generate Nix expressions. I wouldn't suggest committing changes which don't pass tests.

Haskell ecosystem has a ton of overrides. Common Lisp ecosystem has a ton of overrides (relative to the number of included packages). Python ecosystem seems to be mainly carefully picking versions to be able to have a single preferred version in most cases (which is not always the latest at PyPI, obviously).

These are fair points. Again, I don't think this is any different from the status quo, except we are reacting to changes more quickly when we can automate updating and reporting failures.

@bqv
Contributor

bqv commented Feb 15, 2021

Worth remembering that everything in flakes is locked, so there is a very clear and expressive way to show which configurations are "tested and known to work", so e.g. Python updates wouldn't make it into a pythonpackages flake until updated. Hence with that and what @tadfisher pointed out, I don't think that's a real issue

@7c6f434c
Member

7c6f434c commented Feb 15, 2021 via email

@jtojnar
Member

jtojnar commented Feb 15, 2021

There is no reason why the staging step cannot happen in the same repository instead of in a flake. In fact it already does: peti updates stuff on the haskell-updates branch and only merges it to master when it passes a certain level of QA. The Python package set does the same in pull requests. The level of QA could be increased if we had more resources, but that is completely orthogonal to breaking nixpkgs into flakes.

@7c6f434c
Member

7c6f434c commented Feb 15, 2021 via email

@jtojnar
Member

jtojnar commented Feb 15, 2021

Also the language ecosystems are not completely independent – there are core packages (e.g. libinput) that depend on many python modules so the split would introduce a cycle.

@jtojnar How do you propose flattening the pkgs set, then? Should there be further input arguments to the nixpkgs function, e.g. pythonVersion, haskellVersion, etc?

I have no idea how to go about flattening.

@tadfisher
Contributor

@jtojnar How do you propose flattening the pkgs set, then? Should there be further input arguments to the nixpkgs function, e.g. pythonVersion, haskellVersion, etc?

@Atemu
Member Author

Atemu commented Feb 26, 2021

I'd highly appreciate it if we could get back on topic here. Splitting Nixpkgs wouldn't change a thing about the inability to know whether an attr of pkgs is a subset, a package, a function, or even evaluable, and it would probably introduce an even greater mess, as you wouldn't even know whether something is in the same repo or not.

Related: #7866, #8801, #39169 (comment) (somewhat), #39561

Not really, these are about the organisation of real, working packages in all-packages.nix and the Nixpkgs directory layout, not about pkgs, the attrset visible to users.

all-packages.nix is also in dire need of a cleanup, but a fix for this issue would only touch it insofar as non-package declarations may be bundled together and/or moved to other files.

Sounds good to me, minus the subsets attribute. Annoying bit is to keep attributes in the package set and subsets set aligned.

I'm not sure what you mean with that.

I'd like to remove the need to keep anything aligned by declaring them in one single point of truth.

I.e. the only way to add a subset could be to add an attr in subsets.nix, and Nixpkgs automatically combines that attrset with all-packages to make the pkgs set.
This way it'd be trivial to also export a pkgs set without subsets, just the set of subsets etc.

Nope, the _packages_ are neither completely broken nor experimental. They work just fine when imported into a different package subset.

I'm not sure what kind of packages you're talking about here (an example would help).

I haven't come across anything outrageous yet.

Their inclusion in the wrong subset is kind of broken, but well, it is the simplest way to manage language ecosystems for multiple language versions.

Again, not sure which packages you're talking about, but if the information about which language versions a package works with exists, it could be used to mark those packages as broken or even exclude them from the list entirely.

If you think precondition violation should be done via meta.broken and not via assertions, there is another old issue

That'd be ideal.

(without much consensus or traction).

even the part of your proposal that is clearly a useful uniformisation, and would need only localised changes, has somehow failed to gain traction previously.

I don't care how many issues about (somewhat) related problems there have been and how much traction they have gotten in the past, this is an issue which still exists and has now shown first signs of needing to get fixed sooner rather than later.

opposed to splitting in Nixpkgs what can be just filtered by an external function if the cleanups succeed

The problem is: while you may be able to filter it with an external function at some point (I was able to, after deleting a few attrs), it's still not possible to know where you should look for more packages and where not.

This is mostly about discoverability

To get a list of working packages in Nixpkgs, you currently need to:

  1. remove completely broken stuff from Nixpkgs source code
  2. Filter out broken stuff with tryEval
  3. Filter out non-derivations
  4. Filter out broken/unavailable packages using meta.broken

This is what's required to get a list in just one dimension however and as we all know: Nixpkgs is a tree, not a list.
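
Steps 2 to 4 can be sketched for a single level as follows. The toy `pkgs` set and the `isWorkingPackage` helper are hypothetical; step 1 (removing things that cannot be touched at all) has no counterpart in code because `tryEval` cannot catch such failures.

```nix
let
  # Toy top-level set mixing packages, a failing attribute, a function
  # and a subset.
  pkgs = {
    hello = { type = "derivation"; meta.broken = false; };
    oldTool = { type = "derivation"; meta.broken = true; };
    asserting = throw "precondition violated";
    runCommand = name: script: { type = "derivation"; };
    pythonPackages = { requests = { type = "derivation"; meta = { }; }; };
  };

  isWorkingPackage = name:
    let r = builtins.tryEval pkgs.${name}; # step 2: skip what throws
    in r.success
      && builtins.isAttrs r.value
      && (r.value.type or null) == "derivation" # step 3: derivations only
      && !(r.value.meta.broken or false); # step 4: drop broken packages
in
# Evaluates to [ "hello" ]
builtins.filter isWorkingPackage (builtins.attrNames pkgs)
```

Note that `pythonPackages.requests` is silently missed: the flat pass has no way of knowing that `pythonPackages` is a place to look for more packages.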

Even if we could type-check for isAttrs && !isDerivation and then do steps 2-4 for those sets' attrs (with the big assumption that they only contain things that can be tryEval'd), we'd still need to know about subsets nested under derivations (e.g. emacs.pkgs).

Obviating the need for:

  1. can trivially be done by cleaning up a bit
  2. needs every package to be well-behaved and broken stuff (re)moved
  3. would require every attr of pkgs to be a package
  4. could be done by providing a set of non-broken packages ready to use but doesn't really need to at that point

In my eyes, the subset issue can only be solved with an authoritative list of subsets and where to find them.

I think a classification should include some discussion of overlaps of the classes listed on equal footing.

AFAICT there is no overlap other than that some categories are more specific versions of others. I had assumed that's what you meant by "overlap".
If not, please provide some examples because I really don't know how those could overlap.

If it does not, well, is there any evidence it is a useful classification?

None whatsoever. If you can provide a better one, feel free. As I said, those were just some suggestions off the top of my head.

I don't actually care all that much about how they're categorised or what they're called; what I really want is a clear distinction between packages and things that aren't packages so that we can sort pkgs accordingly.
Distinction between the non-package things would also be useful though.

@7c6f434c
Member

I'm not sure what kind of packages you're talking about here (an example would help).

Like Python packages which only work for PyPy or only for Python3 (note that PyPy for Python2 language is still supported upstream) or something like that.

And having multiple subsets with Python packages for different implementations of Python built from the same individual-package expressions (also with an option to pass an expression the Python package set and quickly pick which Python it corresponds to) is a feature, not a bug.

I don't care how many issues about (somewhat) related problems there have been and how much traction they have gotten in the past.

Oh well, then you will care a few months later, when this still has a lively discussion and goes nowhere in practice.

this is an issue

An issue, singular? I think you combine multiple complaints, and I prefer the current state to the proposed alternative re: subsets.

subsets nested under derivations (e.g. emacs.pkgs).

So, what I see as definite cleanups that could be done in parallel and that I would support:

  • convert everything to use meta.broken instead of assertions, at least on everything reachable from Nixpkgs
  • make sure outPath only happens with derivations
  • agree on some marker attributes for subsets, and make sure subsets-as-package-attributes are direct attributes

Then what you want re: package enumeration should become possible to provide as a library function.
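
A minimal sketch of such a library function, assuming the cleanups above are in place. The marker name `isPackageSubset` and the helper names are hypothetical; Nixpkgs already uses a similar marker, `recurseForDerivations` (set by `lib.recurseIntoAttrs`), to tell tools which sets to descend into.

```nix
let
  isDrv = v: builtins.isAttrs v && (v.type or null) == "derivation";
  # Hypothetical marker attribute agreed on for subsets.
  isSubset = v: builtins.isAttrs v && (v.isPackageSubset or false);

  # Collect dotted attribute paths of all packages, descending only into
  # sets that carry the marker.
  listPackages = prefix: set:
    builtins.concatMap
      (name:
        let
          v = set.${name};
          path = prefix ++ [ name ];
        in
        if isDrv v then
          [ (builtins.concatStringsSep "." path) ] ++ nestedSubsets path v
        else if isSubset v then
          listPackages path v
        else
          [ ])
      (builtins.attrNames set);

  # Subsets hanging directly off a package, e.g. emacs.pkgs.
  nestedSubsets = path: drv:
    builtins.concatMap
      (name:
        if isSubset drv.${name}
        then listPackages (path ++ [ name ]) drv.${name}
        else [ ])
      (builtins.attrNames drv);

  # Toy set; functions and unmarked sets such as `lib` are skipped.
  pkgs = {
    hello = { type = "derivation"; };
    emacs = {
      type = "derivation";
      pkgs = { isPackageSubset = true; magit = { type = "derivation"; }; };
    };
    pythonPackages = {
      isPackageSubset = true;
      requests = { type = "derivation"; };
    };
    runCommand = name: script: { type = "derivation"; };
    lib = { id = x: x; };
  };
in
# [ "emacs" "emacs.pkgs.magit" "hello" "pythonPackages.requests" ]
listPackages [ ] pkgs
```

On the real package set this would additionally need `tryEval` around each attribute and some guard against subsets that refer back to their parents; both are left out here for brevity.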

But then you will probably need to push ahead specific narrow kinds of changes so that you do not get stuck designing a huge treewide overhaul where even the plan will need a table of contents.

AFAICT there is no overlap other than that some categories are more specific versions of others. I had assumed that's what you meant by "overlap".

Hm, you are right, all overlap cases are inclusion / «more-specific-version».

If it does not, well, is there any evidence it is a useful classification?

None whatsoever. If you can provide a better one, feel free. As I said, those were just some suggestions off the top of my head.

Is there any evidence there is a classification that is actually useful?

To get a list of working packages in Nixpkgs,

Which is not really well-specified, not useful to most, and definitely not worth giving up on something actually valuable, like subsets.

@bqv
Contributor

bqv commented Feb 27, 2021

  • convert everything to use meta.broken instead of assertions, at least on everything reachable from Nixpkgs

This seems to me as though it should have been the case from the start

@7c6f434c
Member

7c6f434c commented Feb 27, 2021 via email

@roberth
Member

roberth commented Aug 13, 2021

I'm closing this because the issue by itself is not actionable and the conversation has ended.
To continue the discussion, please open a new issue/issues proposing specific changes. This keeps the conversation more focused.

@roberth roberth closed this as completed Aug 13, 2021
@infinisil infinisil added the significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc. label Sep 1, 2023